USING STRUCTURAL INFORMATION IN FUNCTIONAL GENOMICS: IDENTIFICATION OF PROTEIN KINASE SUBSTRATES

 

Bostjan Kobe, Robert A. Breinl and Ross I. Brinkworth

 

Department of Biochemistry and Molecular Biology, University of Queensland, Brisbane, Qld 4072, Australia (b.kobe@mailbox.uq.edu.au).

 

 

The structure of a protein determines its function. Structural genomics initiatives take advantage of this relationship, aiming to contribute to the global "functional genomics" effort to annotate the function of every gene product. We hypothesized that, for certain large protein families, the structural information available for some members can be used systematically to identify the functions of all members of the family. We tested this hypothesis on the protein Ser/Thr kinase family.

Protein kinases are responsible for protein phosphorylation, a post-translational modification that regulates essentially every cellular process, including metabolism, growth, differentiation, motility, membrane transport, learning and memory. To ensure signalling fidelity, protein kinases must be specific and act only on a defined subset of cellular targets. Understanding the substrate specificity of a protein kinase is therefore central to defining its cellular role.

The large number of protein kinases makes it impractical to determine all of their specificities and substrates experimentally. Drawing primarily on three-dimensional structural information on protein kinases, we developed Predikin, a web-interfaced bioinformatic tool that predicts an optimal substrate sequence for any Ser/Thr kinase; this optimal sequence can then be used to search for substrate proteins [1]. First, we developed a set of rules governing the binding of a heptapeptide substrate motif (surrounding the phosphorylation site) to the kinase, using the available crystal structures, molecular modelling and sequence analyses of kinases and substrates. We then implemented these rules in a web-interfaced program for automated prediction of optimal substrate motifs, taking only the amino acid sequence of a protein kinase as input. Finally, we adapted available algorithms (e.g., Ref. [2]) to search protein databases with the heptapeptide motif for putative substrate proteins.
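The database search step can be illustrated with a minimal Python sketch. It is not the Predikin implementation: it assumes the phosphorylated Ser/Thr sits at the centre of the heptapeptide (positions -3 to +3), uses a simple count of matching positions as the score, and the motif shown is made up for illustration rather than a predicted one. The names `EXAMPLE_MOTIF`, `score_window` and `scan_sequence` are hypothetical.

```python
from typing import Iterator, List, Set, Tuple

# A motif is modelled as one set of allowed residues per position (-3 .. +3).
# "X" stands for "any residue". This motif is illustrative only (assumption).
Motif = List[Set[str]]

EXAMPLE_MOTIF: Motif = [
    {"R", "K"},  # position -3
    {"R", "K"},  # position -2
    {"X"},       # position -1
    {"S", "T"},  # position  0: phosphoacceptor (assumed to be central)
    {"X"},       # position +1
    {"X"},       # position +2
    {"X"},       # position +3
]


def score_window(window: str, motif: Motif) -> int:
    """Count the positions at which the 7-residue window agrees with the motif."""
    return sum(
        1
        for residue, allowed in zip(window, motif)
        if "X" in allowed or residue in allowed
    )


def scan_sequence(
    sequence: str, motif: Motif, min_score: int = 7
) -> Iterator[Tuple[int, str, int]]:
    """Yield (1-based site position, heptapeptide, score) for each Ser/Thr
    whose surrounding heptapeptide scores at least min_score."""
    for i in range(3, len(sequence) - 3):
        if sequence[i] not in "ST":
            continue  # only Ser/Thr can be the phosphoacceptor here
        window = sequence[i - 3 : i + 4]
        score = score_window(window, motif)
        if score >= min_score:
            yield (i + 1, window, score)


if __name__ == "__main__":
    toy_protein = "MARRASLGGKKTSAAA"  # invented sequence, not a real protein
    for position, peptide, score in scan_sequence(toy_protein, EXAMPLE_MOTIF):
        print(position, peptide, score)
```

Lowering `min_score` admits partial matches, which trades specificity for sensitivity; a real search would more plausibly use position-specific weights than a plain match count.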

We demonstrate the utility of the method by analysing yeast signal transduction pathways. Our tool also allows us to assign likely kinases to all the phosphorylation sites identified by mass spectrometric analysis of the yeast phosphoproteome [3].
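Assigning kinases to a known phosphorylation site can be sketched as ranking candidate kinases by how well their motifs fit the heptapeptide around the site. The Python sketch below is a simplification under stated assumptions, not the scoring used by the tool: motifs are written as 7-character strings with `x` as a wildcard, only specified (non-wildcard) positions contribute to the score, and the kinase names and motifs are placeholders.

```python
from typing import Dict, List, Tuple


def match_score(heptapeptide: str, motif: str) -> int:
    """Count specified (non-'x') motif positions that the site matches."""
    if len(heptapeptide) != 7 or len(motif) != 7:
        raise ValueError("site and motif must both be 7 residues long")
    return sum(1 for s, m in zip(heptapeptide, motif) if m != "x" and s == m)


def rank_kinases(heptapeptide: str, motifs: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (kinase name, score) pairs, best score first, ties by name."""
    scored = [(name, match_score(heptapeptide, m)) for name, m in motifs.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Placeholder motifs for two hypothetical kinases (assumption).
    candidate_motifs = {"kinase_A": "RRxSxxx", "kinase_B": "xxxSPxx"}
    print(rank_kinases("RRASLGG", candidate_motifs))
```

Counting only specified positions avoids favouring motifs that are mostly wildcards, though it still ignores how strongly each position contributes to binding.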

Our method is the only available predictive method that is generally applicable to identifying possible substrate proteins of protein kinases, and it aids the in silico construction of signalling pathways. The accuracy of its predictions is comparable to that of data obtained from systematic large-scale experimental functional genomics approaches.

 

References

1       Brinkworth, R.I., Breinl, R.A. and Kobe, B. (2003) Proc. Natl. Acad. Sci. USA 100, 74-79.

2       Yaffe, M.B., Leparc, G.G., Lai, J., Obata, T., Volinia, S. and Cantley, L.C. (2001) Nature Biotech. 19, 348-353.

3       Ficarro, S.B., McCleland, M.L., Stukenberg, P.T., Burke, D.J., Ross, M.M., Shabanowitz, J., Hunt, D.F. and White, F.M. (2002) Nature Biotech. 20, 301-305.