Sequence-based classification
Revision as of 03:25, 26 September 2010
This page has been approved by the Responsible Curator as essentially complete. CAZypedia is a living document, so further improvement of this page is still possible. If you would like to suggest an addition or correction, please contact the page's Responsible Curator directly by e-mail.
- Authors: ^^^Steve Withers^^^, ^^^Spencer Williams^^^
- Responsible Curator: ^^^Spencer Williams^^^
Sequence-based classification methods require knowledge of at least part of the amino acid sequence of a protein. Algorithmic methods are then used to compare sequences. Each of the resulting families contains proteins that are related by sequence and, as a corollary, by 3D fold. An obvious shortcoming of sequence-based classifications is that they can only be applied to proteins for which sequence information is available. On the other hand, sequence-based classification schemes allow classification of proteins for which no biochemical evidence has been obtained, such as the thousands of uncharacterized sequences of carbohydrate-active enzymes that originate from genome sequencing efforts worldwide. Sequence-based classification methods are rather different from (and in many ways complementary to) the Enzyme Commission classification scheme, which assigns proteins to groups based on the nature of the reactions that they catalyze.
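As a rough illustration of the idea of grouping proteins by sequence comparison, the following minimal sketch computes pairwise identity between pre-aligned sequences and links sequences above a threshold into the same group. It is a toy only: the sequences, the names `percent_identity` and `group_into_families`, and the 50% threshold are all assumptions made for the example, and this is not the procedure used to build the CAZy families.

```python
from itertools import combinations


def percent_identity(a, b):
    """Fraction of identical, non-gap positions in two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def group_into_families(seqs, threshold=0.5):
    """Single-linkage grouping: any pair at or above the threshold shares a group."""
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the representative of the group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    return sorted(families.values(), key=lambda members: sorted(members))


# Hypothetical toy sequences, invented for this example.
toy = {
    "protA": "MKTAYIAKQR",
    "protB": "MKTAYLAKQR",
    "protC": "GSHWPLDNEV",
    "protD": "GSHWPIDNEV",
}
families = group_into_families(toy)
```

Here `protA` and `protB` fall into one group and `protC` and `protD` into another, without any biochemical information being used. Real classification relies on far more sensitive comparison methods than simple identity over a fixed alignment.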
Classification of glycoside hydrolases
Families
Using a combination of comparison algorithms, the glycoside hydrolases have been classified into more than 100 GH families [1]. This classification is permanently available through the Carbohydrate Active enZyme database [2]. Classification of glycoside hydrolases into families allows many useful predictions to be made, since it has long been noted that the catalytic machinery and molecular mechanism are conserved for the vast majority of the GH families [3], as is the geometry around the glycosidic bond (irrespective of naming conventions) [4]. Usually, the mechanism used (i.e. retaining or inverting) is conserved within a GH family. One notable exception is family GH97, which contains both retaining and inverting enzymes; a glutamate acts as a general base in inverting members, whereas an aspartate likely acts as a catalytic nucleophile in retaining members [5]. Another mechanistic curiosity is found in the glycoside hydrolases of families GH4 and GH109, which operate through an NAD-dependent hydrolysis mechanism that proceeds through oxidation-elimination-addition-reduction steps via anionic transition states [6]. This mechanism allows a single enzyme to hydrolyze both alpha- and beta-glycosides.
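The kind of prediction described above can be sketched as a simple lookup: the mechanism of an uncharacterized family member is inferred from a characterized member of the same family, unless the family is one of the exceptions named in the text. The function name `predict_mechanism`, the `characterized` table and the family label `GH_hypothetical` are assumptions for illustration; only the three special cases come from the paragraph above.

```python
# Families that the text singles out as departing from the usual pattern.
SPECIAL_CASES = {
    "GH97": "mixed: family contains both retaining and inverting enzymes",
    "GH4": "NAD-dependent oxidation-elimination-addition-reduction",
    "GH109": "NAD-dependent oxidation-elimination-addition-reduction",
}


def predict_mechanism(family, characterized):
    """Infer a mechanism for a sequence assigned to `family`.

    `characterized` maps a family name to the mechanism established
    experimentally for one of its members (a hypothetical input table).
    """
    if family in SPECIAL_CASES:
        return SPECIAL_CASES[family]
    # Mechanism is usually conserved within a family, so reuse the known one.
    return characterized.get(family, "unknown")


# Placeholder table; the family label here is invented for the example.
known = {"GH_hypothetical": "retaining"}
prediction = predict_mechanism("GH_hypothetical", known)
```

The special-case table is checked first because, for those families, family membership alone does not settle whether an enzyme is retaining or inverting.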
Deleted families
Various GH families have been deleted. These include GH21, GH40, GH41, GH60 and GH69. Once deleted, family numbers are never reused in order to prevent confusion.
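The rule that deleted family numbers are never reused can be pictured as a registry whose counter only moves forward. The class `FamilyRegistry` below is a hypothetical sketch written for this page and does not describe how the CAZy database is actually maintained.

```python
class FamilyRegistry:
    """Hands out family names with numbers that are never reassigned."""

    def __init__(self, prefix):
        self.prefix = prefix
        self._next_number = 1
        self._active = set()
        self._deleted = set()

    def create(self):
        """Create a new family using the next unused number."""
        number = self._next_number
        self._next_number += 1
        self._active.add(number)
        return f"{self.prefix}{number}"

    def delete(self, number):
        """Retire a family; its number stays reserved afterwards."""
        if number not in self._active:
            raise KeyError(f"{self.prefix}{number} is not an active family")
        self._active.remove(number)
        self._deleted.add(number)

    def is_deleted(self, number):
        return number in self._deleted


registry = FamilyRegistry("GH")
names = [registry.create() for _ in range(5)]  # GH1 .. GH5
registry.delete(3)
newest = registry.create()  # takes the next number, not the freed one
```

Because the counter never goes back, a name such as `GH3` in this toy registry always refers to the same (now deleted) family, which is the point of the no-reuse rule.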
Clans
Classification of GH families into larger groups, termed "clans", has been proposed [7]. A clan is a group of families that possess significant similarity in their tertiary structure, catalytic residues and mechanism. Thus, knowledge of the three-dimensional structure and the functional assignment of catalytic residues are both required for classification into clans. Families within clans are thought to have a common evolutionary ancestry. For an updated table of glycoside hydrolase clans, see the CAZy Database [8].
Classification of glycosyltransferases
Using sequence comparison algorithms, glycosyltransferases that use nucleotide diphospho-sugars, nucleotide monophospho-sugars and sugar phosphates have been grouped into over 90 GT families [9, 10]. This classification is permanently available through the Carbohydrate Active enZyme database [2]. As for the GH families above, the same three-dimensional fold is expected to occur within each of the GT families. Just as for the glycoside hydrolases, several of the families defined on the basis of sequence similarities turn out to have similar three-dimensional structures.
Deleted families
Various GT families have been deleted. These include GT36. Once deleted, family numbers are never reused in order to prevent confusion.
References
1. PMID 1747104 (citation could not be retrieved)
2. Carbohydrate Active Enzymes database; URL http://www.cazy.org/
3. PMID 1618761 (citation could not be retrieved)
4. Henrissat B, Callebaut I, Fabrega S, Lehn P, Mornon JP, and Davies G. (1995). Conserved catalytic machinery and the prediction of a common fold for several families of glycosyl hydrolases. Proc Natl Acad Sci U S A. 92(15):7090-4. DOI:10.1073/pnas.92.15.7090
5. PMID 18848471 (citation could not be retrieved)
6. PMID 17676871 (citation could not be retrieved)
7. PMID 8687420 (citation could not be retrieved)
8. Carbohydrate Active Enzymes database, glycoside hydrolase classification; URL http://www.cazy.org/Glycoside-Hydrolases.html
9. PMID 9334165 (citation could not be retrieved)
10. PMID 12691742 (citation could not be retrieved)