
Difference between revisions of "Carbohydrate-binding modules"

From CAZypedia
 
(41 intermediate revisions by 4 users not shown)
Line 1: Line 1:
<!-- RESPONSIBLE CURATORS: Please replace the {{UnderConstruction}} tag below with {{CuratorApproved}} when the page is ready for wider public consumption -->
+
{{CuratorApproved}}
{{UnderConstruction}}
+
* [[Author]]: [[User:Alicia Lammerts van Bueren|Alicia Lammerts van Bueren]] and [[User:Elizabeth Ficko-Blean|Elizabeth Ficko-Blean]]
* [[Author]]: ^^^Alicia Lammerts van Bueren^^^ and ^^^Elizabeth Ficko-Blean^^^
+
* [[Responsible Curator]]:  [[User:Al Boraston|Al Boraston]] and [[User:Spencer Williams|Spencer Williams]]
* [[Responsible Curator]]:  ^^^Al Boraston^^^ and ^^^Spencer Williams^^^
 
 
----
 
----
  
 
== Overview ==
 
== Overview ==
 
[[Image:MvGH33modularity3.jpg||thumb|right|300px|'''Figure 1. An example of modularity in a CBM-containing glycoside hydrolase.''' Sialidase from ''Micromonospora viridifaciens'' contains an N-terminal [[CBM32]] (red) X20 linker (yellow) and a C-terminal catalytic [[GH33]] module (green) <cite>Gaskell1995</cite>. Graphical representation of modularity in amino acid sequence (top) and 3D crystal structure (bottom) PDB ID [{{PDBlink}}1eut 1eut].]]
 
[[Image:MvGH33modularity3.jpg||thumb|right|300px|'''Figure 1. An example of modularity in a CBM-containing glycoside hydrolase.''' Sialidase from ''Micromonospora viridifaciens'' contains an N-terminal [[CBM32]] (red) X20 linker (yellow) and a C-terminal catalytic [[GH33]] module (green) <cite>Gaskell1995</cite>. Graphical representation of modularity in amino acid sequence (top) and 3D crystal structure (bottom) PDB ID [{{PDBlink}}1eut 1eut].]]
Carbohydrate-binding modules (CBMs) <cite>Boraston2004 Hashimoto2006 Shoseyov2006 Guillen2010 Gilbert2013 Armenta2017</cite> are a class of sugar-binding proteins that are generally defined as an amino acid sequence within a larger encoded protein sequence that fold into a structurally discreet module, forming part of a larger multi-modular enzyme (Figure 1).
+
Carbohydrate-binding modules (CBMs) <cite>Boraston2004 Hashimoto2006 Shoseyov2006 Guillen2010 Gilbert2013 Armenta2017</cite> are a class of sugar-binding proteins that comprise amino acid sequences within a larger encoded protein sequence that fold into a structurally discrete module, typically forming part of a larger multi-modular enzyme <cite>Ficko-Blean2012</cite> (Figure 1).
The role of a CBM is to bind to carbohydrate ligand and direct the catalytic machinery onto its substrate, thus enhancing the catalytic efficiency of the multimodular carbohydrate-active enzyme; however, there are several key exceptions of divergent evolution in the functions of CBMs <cite>Taylor2014</cite> which are [[Carbohydrate-binding_modules#Blurred Lines: CBMs, Lectins and Outliers|discussed below]]. The individual CBMs are themselves devoid of any catalytic activity and are most commonly associated with [[Glycoside Hydrolases]] but have also been identified in [[:Category:Polysaccharide_Lyase_Families|Polysaccharide Lyases]], [[Auxiliary_Activity_Families|polysaccharide oxidases]], [[Glycosyltransferases]] and plant cell wall-binding expansins <cite>Georgelis2011</cite>.
+
The conventional role of a CBM is to bind to carbohydrate ligand and direct the catalytic machinery onto its substrate, thus enhancing the catalytic efficiency of the multimodular carbohydrate-active enzyme; however, there are several key exceptions of divergent evolution in the functions of CBMs <cite>Taylor2014</cite> which are [[Carbohydrate-binding_modules#Blurred Lines: CBMs, Lectins and Outliers|discussed below]]. The individual CBMs are themselves devoid of any catalytic activity and are most commonly associated with [[Glycoside Hydrolases]] but have also been identified in [[:Category:Polysaccharide_Lyase_Families|Polysaccharide Lyases]], [[Auxiliary_Activity_Families|polysaccharide oxidases]], [[Glycosyltransferases]], plant cell wall-binding expansins <cite>Georgelis2011</cite> and in some lectins <cite>Taylor2014</cite>.
  
CBMs themselves do not generally undergo any conformational changes when binding ligand. Rather, the topography of the carbohydrate-binding site is preformed to be complementary to the shape of the target ligand (see [[#Types|Types]]). This is achieved by the presence of amino acid side chains and loops within the CBM binding pocket or cleft. However, multimodular enzymes as a whole may be quite flexible and undergo significant conformational changes when binding substrate. Flexible Ser-Thr-Pro sequences, which are often O-glycosylated, link adjacent modules and can allow for shifts in the orientation and direction of the catalytic module with respect to the CBM on the target substrate <cite>Ficko2009</cite>.  
+
CBMs themselves do not undergo any conformational changes when binding ligand. Rather, the topography of the carbohydrate-binding site is preformed to be complementary to the shape of the target ligand (see [[#Types|Types]]). This is achieved by the presence of amino acid side chains present on the CBM binding surface or within the CBM binding cleft or pocket. Multimodular enzymes that include CBMs may as a whole be quite flexible and undergo significant conformational changes when binding substrate. Flexible Ser-Thr-Pro sequences often link adjacent modules and can allow for shifts in the orientation and direction of the catalytic module with respect to the CBM on the target substrate. In other enzymes the linking regions may be quite rigid, such as the 5-helical bundle linker module linking a [[CBM32]] to a [[GH84]] module <cite>Ficko2009</cite>.  
  
 
==History of CBMs==
 
==History of CBMs==
 
CBMs were initially characterized as cellulose binding domains (CBDs) in cellobiohydrolases CBHI and CBHII from ''Trichoderma reesei'' <cite>VanTilbeurgh1986 Tomme1988</cite> and cellulases CenA and CexA from ''Cellulomonas fimi'' <cite>Gilkes1988</cite>. Limited proteolysis experiments on these enzymes yielded truncated enzyme products that showed a reduced or complete loss in their ability to hydrolyze cellulose substrates. The reduction in enzymatic activity was attributed to the loss of ~100 amino acid C-terminal domains which prevented the adsorption of the enzymes onto cellulose substrate. Thus it was proposed that these independent "domains" are critical for targeting the enzymes onto their substrates and enhancing their hydrolytic activity. It rapidly became evident that CBDs were not only appended to cellulases but were also found in a range of other plant cell wall degrading enzymes <cite>Kellett1990 Ferriera1990 Ferriera1993</cite>.

CBMs were initially characterized as cellulose binding domains (CBDs) in cellobiohydrolases CBHI and CBHII from ''Trichoderma reesei'' <cite>VanTilbeurgh1986 Tomme1988</cite> and cellulases CenA and CexA from ''Cellulomonas fimi'' <cite>Gilkes1988</cite>. Limited proteolysis experiments on these enzymes yielded truncated enzyme products that showed a reduced or complete loss in their ability to hydrolyze cellulose substrates. The reduction in enzymatic activity was attributed to the loss of ~100 amino acid C-terminal domains which prevented the adsorption of the enzymes onto cellulose substrate. Thus it was proposed that these independent "domains" are critical for targeting the enzymes onto their substrates and enhancing their hydrolytic activity. It rapidly became evident that CBDs were not only appended to cellulases but were also found in a range of other plant cell wall degrading enzymes <cite>Kellett1990 Ferriera1990 Ferriera1993</cite>.
  
CBDs were previously categorized into 13 Types based on amino acid sequence similarities <cite>Tomme1995</cite>. This classification system became complicated when similar functional domains from non-cellulolytic carbohydrate-active enzymes were discovered that did not bind cellulose but met all of the [[#Criteria for Defining a new CBM family|criteria]] of a CBD (for example see <cite>Svensson1989</cite>). The term carbohydrate-binding module was proposed to solve this problem to be inclusive of all ancillary modules with non-catalytic carbohydrate-binding function (for a review see <cite>Boraston2004</cite>). Since this time, CBMs have been found appended to enzymes that interact with almost all characterized carbohydrate sources found on Earth (Table 1).
+
Initially, CBDs were categorized into 13 Types based on amino acid sequence similarities <cite>Tomme1995</cite>. This classification system became complicated when similar functional domains from non-cellulolytic carbohydrate-active enzymes were discovered that did not bind cellulose but met all of the [[#Criteria for Defining a new CBM family|criteria]] of a CBD (for example see <cite>Svensson1989</cite>). The term carbohydrate-binding module was proposed to solve this problem to be inclusive of all ancillary modules with non-catalytic carbohydrate-binding function (for a review see <cite>Boraston2004</cite>). Since this time, CBMs have been found appended to enzymes that interact with almost all characterized carbohydrate materials found on Earth (Table 1).
  
 
<div style="width:550px;" align="left">
 
<div style="width:550px;" align="left">
{| {{Prettytable}}  
+
{| {{Prettytable}} class="mw-collapsible mw-collapsed"
 
|-
 
|-
|{{Hl2}} colspan="2" align="center"|'''Table 1: List of Carbohydrates and Interacting CBM Families<sup>a</sup>'''
+
| bgcolor="#F5F5F5" colspan="2" align="center"|'''Table 1: List of Carbohydrates and Interacting CBM Families<sup>a</sup>'''
 
|-
 
|-
 
|'''Cellulose'''     
 
|'''Cellulose'''     
Line 59: Line 58:
  
 
=== Fold ===
 
=== Fold ===
[[Image:CBMfold.jpg|thumb|right|500px|'''Figure 2. Classical CBM beta-sandwich fold.''' C-terminal family CBM27 from ''Thermotoga maritima''mannanase, a Type B CBM (A)(side and front view, PDB ID [{{PDBlink}}1OF4 1OF4]) <cite>Boraston20031</cite> and C-terminal family CBM6 from''Clostridium stercorarium'' xylanase (B) (PDB ID [{{PDBlink}}1NAE 1NAE]) <cite>Boraston20032</cite> showing binding sites on the face (A) and edge (B) of the beta sandwich fold respectively.]]
+
[[Image:CBMfold.jpg|thumb|right|500px|'''Figure 2. Classical CBM beta-sandwich fold.''' C-terminal family CBM27 from ''Thermotoga maritima'' mannanase, a Type B CBM (A)(side and front view, PDB ID [{{PDBlink}}1OF4 1OF4]) <cite>Boraston20031</cite> and C-terminal family CBM6 from ''Clostridium stercorarium'' xylanase (B) (PDB ID [{{PDBlink}}1NAE 1NAE]) <cite>Boraston20032</cite> showing binding sites on the face (A) and loop region (B) of the beta sandwich fold respectively.]]
CBMs fall into one of 7 fold families <cite>Boraston2004</cite>. The most common fold exhibited by CBMs is the beta-sandwich fold which is comprised of two overlapping beta-sheets consisting of three to six antiparallel beta strands (Figure 2). The ligand binding site is located primarily on the same face of a beta-sheet (Figure 2A), but may also be positioned on the edge of the beta-sheet within the joining loop region (Figure 2B). There are examples of CBMs in the beta-sandwich fold family exhibiting dual binding sites such as [[CBM6]] <cite>Pires2004</cite> and dual starch-binding sites in [[CBM20]] <cite>Lawson1994</cite>. Other fold families include the beta-trefoil fold, cysteine knot, OB fold, the hevein and hevein-like and unique folds <cite>Boraston2004</cite>. CBMs of the beta-trefoil fold family ([[CBM13]], [[CBM42]]) present multivalent sugar-binding sites, as demonstrated for their interaction with xylan and arabinoxylan respectively <cite>Fujimoto2013</cite>.
+
CBMs fall into one of 7 fold families <cite>Boraston2004</cite>. The most common fold exhibited by CBMs is the beta-sandwich fold comprised of two overlapping beta-sheets each consisting of three to six antiparallel beta strands (Figure 2). The ligand binding site may be located on one face of the beta-sheet (Figure 2A) or may be positioned within the variable loop region of the beta-sheet (Figure 2B). There are examples of CBMs in the beta-sandwich fold family exhibiting dual binding sites such as [[CBM6]] <cite>Pires2004</cite> and dual starch-binding sites in [[CBM20]] <cite>Lawson1994</cite>. Other fold families include the beta-trefoil fold, cystine knot, OB fold, the hevein and hevein-like and unique folds <cite>Boraston2004</cite>. CBMs of the beta-trefoil fold family ([[CBM13]], [[CBM42]]) present multivalent sugar-binding sites, as demonstrated for their interaction with xylan and arabinoxylan respectively <cite>Fujimoto2013</cite>.
  
 
=== Types ===
 
=== Types ===
 
[[Image:TypeAsurface.png|thumb|right|400px|'''Figure 3. CBM Types.''' (A) Schematic of different CBM Types binding with different regions of a polysaccharide substrate. (B) Type A [[CBM2]]b from ''Pyrococcus furiosus'' [[GH18]] chitinase (PDB ID [{{PDBlink}}2CRW 2CRW]) <cite>Nakamura2008</cite>. Aromatic side chains of Type A CBMs form the planar binding surface.]]

[[Image:TypeAsurface.png|thumb|right|400px|'''Figure 3. CBM Types.''' (A) Schematic of different CBM Types binding with different regions of a polysaccharide substrate. (B) Type A [[CBM2]]b from ''Pyrococcus furiosus'' [[GH18]] chitinase (PDB ID [{{PDBlink}}2CRW 2CRW]) <cite>Nakamura2008</cite>. Aromatic side chains of Type A CBMs form the planar binding surface.]]
CBMs are classified into three main Types defined by the shape and degree of polymerization of their target ligand (Figure 3A). The architecture of the binding site determines what region within a polysaccharide the enzyme will [[#Functional Roles of CBMs|target]]. A review on CBM plant cell wall recognition <cite>Gilbert2013</cite> has modified the classification of CBM Types to be as follows:  
+
CBMs are classified into three main Types defined by the shape and degree of polymerization of their target ligand (Figure 3A). The architecture of the binding site determines what region within a saccharide macromolecule the enzyme will [[#Functional Roles of CBMs|target]]. The classification of CBM Types is as follows <cite>Gilbert2013, Armenta2017, Boraston2004</cite>:  
* Type A: bind to crystalline surfaces of cellulose and chitin (example families [[CBM1]], [[CBM2]], [[CBM3]], [[CBM5]], [[CBM10]]). Their binding sites are planar and rich in aromatic amino acid residues creating a flat platform to bind to the planar polycrystalline chitin/cellulose surface (Figure 3B). Type A CBMs are unique and differ significantly from Type B or C.
+
* Type A: bind to crystalline surfaces of the polysaccharides cellulose and chitin (example families [[CBM1]], [[CBM2]], [[CBM3]], [[CBM5]], [[CBM10]]). Their binding sites are planar and rich in aromatic amino acid residues creating a flat platform to bind to the planar polycrystalline chitin/cellulose surface (Figure 3B). Type A CBMs are unique and differ significantly from Type B or C.
* Type B: bind internal glycan chains (''endo''-type). Type B are the most abundant form of CBMs reported to date. Type B binding sites appear as extended grooves or clefts comprised of binding subsites to generally accommodate longer sugar chains with four or more monosaccharide units (see Figure 2A for an example). There are some examples in [[CBM6]], [[CBM36]] and [[CBM60]] that contain only two subsites.  
+
* Type B: bind internal glycan chains (''endo''-type). Type B are the most abundant form of CBMs reported to date. Type B binding sites appear as extended grooves or clefts comprised of binding subsites to generally accommodate longer sugar chains with four or more monosaccharide units (see Figure 2A for an example). There are some examples of CBMs, in families [[CBM6]], [[CBM13]], [[CBM20]], [[CBM36]] and [[CBM60]], that contain two subsites.  
 
* Type C: bind termini of glycans (reducing/non-reducing ends, ''exo''-type). Type C binding sites are short pockets for recognizing short sugar ligands containing one to three monosaccharide units (example families [[CBM9]], [[CBM13]], [[CBM32]], [[CBM47]], [[CBM66]], [[CBM67]]).  Families containing Type C CBMs are considered 'lectin-like' and may include lectins and CBMs with no appended catalytic modules as members.
 
* Type C: bind termini of glycans (reducing/non-reducing ends, ''exo''-type). Type C binding sites are short pockets for recognizing short sugar ligands containing one to three monosaccharide units (example families [[CBM9]], [[CBM13]], [[CBM32]], [[CBM47]], [[CBM66]], [[CBM67]]).  Families containing Type C CBMs are considered 'lectin-like' and may include lectins and CBMs with no appended catalytic modules as members.
 
A table relating CBM type to CBM family is available in a 2017 review <cite>Armenta2017</cite>; see also the classic CBM review by Boraston et al. <cite>Boraston2004</cite>.
 
  
 
== Properties of CBM Carbohydrate Binding Interactions ==
 
== Properties of CBM Carbohydrate Binding Interactions ==
Line 75: Line 72:
 
CBMs carry out four main functional roles:  
 
CBMs carry out four main functional roles:  
  
*''Targeting Effect'': CBMs target the enzyme to distinct regions within a larger macromolecular polysaccharide substrate (reducing end, non-reducing end, internal polysaccharide chains), depending on the architecture of its binding site (see [[#Types|Types]]).   
+
*''Targeting Effect'': CBMs target the enzyme to distinct regions on a saccharide substrate (reducing end, non-reducing end, internal polysaccharide chains), depending on the architecture of its binding site (see [[#Types|Types]]).   
  
*''Proximity Effect'': CBMs increase the concentration of enzyme in close proximity to its polysaccharide substrate. This leads to more rapid and efficient substrate degradation.   
+
*''Proximity Effect'': CBMs increase the concentration of enzyme in close proximity to its saccharide substrate. This leads to more rapid and efficient substrate degradation.   
  
 
An excellent example demonstrating targeting and proximity effects of plant cell wall specific CBMs is available <cite>Herve2010</cite>.  
 
An excellent example demonstrating targeting and proximity effects of plant cell wall specific CBMs is available <cite>Herve2010</cite>.  
Line 83: Line 80:
 
*''Disruptive Effect'': Some CBMs have been shown to disrupt the surface of tightly packed polysaccharides, such as cellulose fibres and starch granules, causing the substrate to loosen and become more exposed to the catalytic module for more efficient degradation. Disruptive roles have been described for cellulose binding [[CBM2]]a <cite>Din1991</cite> and [[CBM44]] <cite>Gourlay2012</cite>. Dual starch-binding domains of family [[CBM20]] from ''Aspergillus niger'' glucoamylase have been shown to disrupt the surface of starch <cite>Southall1999</cite> while dual-associated [[CBM41]] modules may have a disruptive role in degrading glycogen granules <cite>vanBueren2007</cite>. [[CBM33]] was thought to have a disruptive effect on chitin, however these have now been reclassified as copper-dependent lytic polysaccharide monooxygenases <cite>Vaaje2010</cite> and are found in CAZy [[Auxiliary Activity Family 10]].  
 
*''Disruptive Effect'': Some CBMs have been shown to disrupt the surface of tightly packed polysaccharides, such as cellulose fibres and starch granules, causing the substrate to loosen and become more exposed to the catalytic module for more efficient degradation. Disruptive roles have been described for cellulose binding [[CBM2]]a <cite>Din1991</cite> and [[CBM44]] <cite>Gourlay2012</cite>. Dual starch-binding domains of family [[CBM20]] from ''Aspergillus niger'' glucoamylase have been shown to disrupt the surface of starch <cite>Southall1999</cite> while dual-associated [[CBM41]] modules may have a disruptive role in degrading glycogen granules <cite>vanBueren2007</cite>. [[CBM33]] was thought to have a disruptive effect on chitin, however these have now been reclassified as copper-dependent lytic polysaccharide monooxygenases <cite>Vaaje2010</cite> and are found in CAZy [[Auxiliary Activity Family 10]].  
  
*''Adhesion'': CBMs have been shown to adhere enzymes onto the surface of bacterial cell wall components while exhibiting catalytic activity on an external neighboring carbohydrate substrate. For example, [[CBM35]] modules have been shown to interact with the surface glucuronic acid containing sugars in the cell wall of ''Amycolatopsis orientalis'' while the catalytic module is active on external chitosan likely originating from the cell wall of competing soil fungal species <cite>Montanier2009</cite>.
+
*''Adhesion'': CBMs have been shown to adhere enzymes onto the surface of bacterial cell wall components while exhibiting catalytic activity on an external neighboring carbohydrate substrate. For example, [[CBM35]] modules have been shown to interact with the surface glucuronic acid containing sugars in the cell wall of ''Amycolatopsis orientalis'' while the catalytic module is active on external chitosan likely originating from the cell wall of competing soil fungal species <cite>Montanier2009</cite>. ''Streptococcus pneumoniae'' uses a [[CBM71]] as an adhesin, to mediate adherence to host cell surfaces displaying lactose or N-acetyllactosamine <cite>king2014</cite>.
  
There are two examples in the literature of CBMs extending the active site sub-sites of their appended glycosidase modules.  The glycogen-degrading pneumococcal virulence factor SpuA has its active site extended by one of two tandem [[CBM41]]s <cite>Lammerts2011</cite>. The glucan starch phosphatase Starch Excess4 has its active site extended by a [[CBM48]] <cite>Meekins2014</cite>.  
+
There are examples in the literature of CBMs extending the active site sub-sites of their appended glycosidase modules.  The glycogen-degrading pneumococcal virulence factor SpuA has its active site extended by one of two tandem [[CBM41]]s <cite>Lammerts2011</cite>. The glucan starch phosphatase Starch Excess4 has its active site extended by a [[CBM48]] <cite>Meekins2014</cite>.  
  
Several lectins are classified as CBMs even though they are not on the same polypeptide chain as a carbohydrate-active enzyme.  See [[Carbohydrate-binding_modules#CBMs, Lectins and Outliers|Blurred Lines: CBMs, Lectins and Outliers]] for a more complete discussion.  
+
Several lectins are classified as CBMs even though they are not on the same polypeptide chain as a carbohydrate-active enzyme.  See [[Carbohydrate-binding_modules#CBMs, Lectins and Outliers|Blurred Lines: CBMs, Lectins and Outliers]] for a more complete discussion.
  
 
===Driving Forces of CBM-Carbohydrate Interactions===
 
===Driving Forces of CBM-Carbohydrate Interactions===
There are two key features that drive CBM/carbohydrate interactions. Extensive hydrogen bonding occurs between the hydroxyl groups of carbohydrate ligands and polar amino acid residues within the binding site. Additional water-mediated hydrogen bonding networks between these groups can also be found in the binding site. By far the most important characteristic driving force mediating protein-carbohydrate interactions is the position and orientation of aromatic amino acid residues (Trp, Tyr and sometimes Phe) within the binding site. These essential planar residues provide a hydrophobic platform for the planar face of sugar rings, an interaction resembling hydrophobic stacking interactions. Weak intermolecular electrostatic interactions occur between C-H and pi electrons in the planar ring systems and contribute 1.5 - 2.5 kcal/mol energy to the binding reaction <cite>Meyer2003</cite>. CBMs may also use coordinated metal ions within the binding site to directly interact with their target ligand. For example, families [[CBM36]] <cite>Jamal2004</cite> and [[CBM60]] <cite>Montanier2010</cite> exhibit calcium-dependent binding to xylooligosaccharides.
+
There are two key features that drive CBM-carbohydrate interactions. Extensive hydrogen bonding occurs between the hydroxyl groups of carbohydrate ligands and polar amino acid residues within the binding site. Additional water-mediated hydrogen bonding networks between these groups can also be found in the binding site. By far the most important characteristic driving force mediating protein-carbohydrate interactions is the position and orientation of aromatic amino acid residues (Trp, Tyr and sometimes Phe) within the binding site. These essential planar residues provide a hydrophobic platform for the planar face of sugar rings, an interaction resembling hydrophobic stacking interactions. Weak intermolecular electrostatic interactions occur between C-H and pi electrons in the planar ring systems and contribute 1.5 - 2.5 kcal/mol energy to the binding reaction <cite>Meyer2003</cite>. CBMs may also use coordinated metal ions within the binding site to directly interact with their target ligand. For example, families [[CBM36]] <cite>Jamal2004</cite> and [[CBM60]] <cite>Montanier2010</cite> exhibit calcium-dependent binding to xylooligosaccharides.
  
CBM-carbohydrate interactions in general are quite weak (K<sub>a</sub> affinities in mM<sup>-1</sup> to uM<sup>-1</sup> range) making the interaction easily reversible. This feature allows for "recycling" of the appended enzyme to bind to a new region on the substrate once catalysis has been completed at a given site.  Multivalent effects (more than one saccharide-binding site or multiple CBMs within the polypeptide) may act to increase the overall affinity relative to a single binding site interaction.  
+
CBM-carbohydrate interactions in general are quite weak (K<sub>a</sub> affinities in mM<sup>-1</sup> to uM<sup>-1</sup> range) making the interaction easily reversible. This feature allows for "recycling" of the appended enzyme to bind to a new region on the substrate once catalysis has been completed at a given site. Some CBMs that bind crystalline ligands, typified by CBM2a, bind with apparent irreversibility (they do not desorb when free CBM is diluted), displaying surface mobility and exchanging with free CBM <cite>McLean2002, Jervis1997</cite>. Multivalent effects (more than one saccharide-binding site or multiple CBMs within the polypeptide) may act to increase the overall affinity relative to a single binding site interaction.  
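
To put the quoted affinity range on the same energy scale as the 1.5 - 2.5 kcal/mol figure above, an association constant can be converted to a standard binding free energy with ΔG° = -RT ln K<sub>a</sub>. The short Python sketch below is only an illustration and is not taken from the cited work; the function name <code>delta_g_kcal</code>, the temperature of 298.15 K and the 1 M standard state are assumptions made for the example.

```python
import math

# Gas constant in kcal/(mol*K); illustrative helper, not from the cited papers.
R_KCAL = 1.987204e-3


def delta_g_kcal(ka_per_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from an association
    constant given in M^-1, assuming a 1 M standard state."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return -R_KCAL * temp_k * math.log(ka_per_molar)


# Limits of the range quoted in the text:
weak = delta_g_kcal(1e3)   # Ka = 1 mM^-1 = 1e3 M^-1
tight = delta_g_kcal(1e6)  # Ka = 1 uM^-1 = 1e6 M^-1

print(f"Ka = 1 mM^-1: {weak:.2f} kcal/mol")
print(f"Ka = 1 uM^-1: {tight:.2f} kcal/mol")
```

Under these assumptions the quoted range corresponds to roughly -4 to -8 kcal/mol, so an interaction worth 1.5 - 2.5 kcal/mol would be a substantial fraction of the total binding energy.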
  
 
=== CBM Promiscuity ===
 
=== CBM Promiscuity ===
Because of the diversity of carbohydrate structures and motifs found in plant and mammalian glycans, some CBMs have become adapted to recognize more than one type of monosaccharide or glycosidic bond linkage within the binding pocket, a feature called CBM promiscuity. For example a family [[CBM32]] from ''Clostridium perfringens'' NagH binds N-acetyl-glucosamine in the primary subsite but can accommodate N-acetyl-galactosamine or mannose in the secondary site <cite>Ficko20092</cite>. There are several examples of ligand promiscuity within family [[CBM32]]. In plant cell wall recognizing CBMs, they are often able to accommodate both cellulose and hemicelluloses. For example, several family [[CBM6]] members interact with cellulose, xylose or laminarin <cite>Boraston20032 Lammerts2005</cite>. Family [[CBM37]] exhibit broad binding specificity for xylan, chitin and cellulose (ref). Family [[CBM41]] appended to a [[GH13]] pullulanase can accommodate both alpha-1,4- and alpha-1,6-linked glucose found in amylopectin (from starch/glycogen) <cite>Lammerts2007</cite>.  The flexibility in carbohydrate recognition by CBMs contributes to the [[#Functional Roles of CBMs|targeting]] efficiency of carbohydrate-active enzymes in environments where there is a diverse range of polysaccharides present (such as the plant cell wall or mammalian tissues).
+
Because of the diversity of carbohydrate structures and motifs found in plant and mammalian glycans, some CBMs have evolved to recognize more than one type of monosaccharide or glycosidic bond linkage within the binding pocket, a feature called CBM promiscuity. For example a family [[CBM32]] from ''Clostridium perfringens'' NagH binds N-acetyl-glucosamine in the primary subsite but can accommodate N-acetyl-galactosamine or mannose in the secondary site <cite>Ficko20092</cite>. There are several examples of ligand promiscuity within family [[CBM32]]. In plant cell wall recognizing CBMs, they are often able to accommodate both cellulose and hemicelluloses. For example, several family [[CBM6]] members interact with cellulose, xylose or laminarin <cite>Boraston20032 Lammerts2005</cite>. Family [[CBM41]] appended to a [[GH13]] pullulanase can accommodate both alpha-1,4- and alpha-1,6-linked glucose found in amylopectin (from starch/glycogen) <cite>Lammerts2007</cite>.  The flexibility in carbohydrate recognition by CBMs contributes to the [[#Functional Roles of CBMs|targeting]] efficiency of carbohydrate-active enzymes in environments where there is a diverse range of saccharides present (such as the plant cell wall or mammalian tissues).
  
 
=== CBMs and Multivalency ===
 
=== CBMs and Multivalency ===
Multivalency is the collective strength of several interactions with a given ligand. Because CBM-carbohydrate interactions are relatively weak, some carbohydrate-active enzymes, mainly glycoside hydrolases, have developed ways to increase their interaction with substrate via a multivalent effect. Individually, some CBMs may contain multiple binding sites to form a multivalent interaction with their target ligand, although this form of multivalency is quite rare with only a few examples ([[CBM6]], [[CBM13]], [[CBM20]]). More commonly, glycoside hydrolases may contain more than one CBM within its modular architecture, either arranged in tandem or at opposing N and C terminal ends of the protein sequence, or both. These CBMs may target the same carbohydrate ligand, different regions within the same ligand, or different ligands within a larger polysaccharide amalgam. A multivalent interaction enhances the overall affinity of an enzyme for its substrate but more importantly, tandem CBMs will cooperatively target the enzyme towards specific regions within a larger polysaccharide substrate based on the orientation and position of binding sites with respect to one another.  (Insert some examples here).
+
Multivalency is the collective strength of several interactions with a given ligand. Because CBM-carbohydrate interactions are relatively weak, some carbohydrate-active enzymes, mainly glycoside hydrolases, have developed ways to increase their interaction with substrate via a multivalent effect. Individually, some CBMs may contain multiple binding sites to form a multivalent interaction with their target ligand, although this form of multivalency is quite rare (for example [[CBM6]], [[CBM13]] and [[CBM20]]). More commonly, glycoside hydrolases may contain more than one CBM within their modular architecture, either arranged in tandem or at opposing N and C terminal ends of the protein sequence, or both. These CBMs may target the same carbohydrate ligand, different regions within the same ligand, or different ligands in a complex saccharide amalgam. A multivalent interaction enhances the overall affinity of an enzyme for its substrate. Furthermore, tandem CBMs may cooperatively target the enzyme towards specific saccharide regions based on their ligand specificity and the orientation and position of the binding sites with respect to one another.   
  
 
=== Blurred Lines: CBMs, Lectins and Outliers ===
 
=== Blurred Lines: CBMs, Lectins and Outliers ===
 
+
While CBMs are generally considered to be discrete entities within a polypeptide chain, there are some exceptions. The glycogen-degrading pneumococcal virulence factor SpuA has its active site extended by one of two tandem [[CBM41]]s <cite>Lammerts2011</cite> and the glucan starch phosphatase Starch Excess4 has its active site extended by a [[CBM48]] <cite>Meekins2014</cite>. Thus, the full biological contribution to carbohydrate-binding within the polypeptide is contributed by a multivalent interaction as an extension of the catalytic module's carbohydrate-binding properties. The PA14 domain is found in bacterial toxins, enzymes, adhesins and signaling molecules <cite>Rigden2004</cite>.  It has been described as appended to the polypeptide sequence of some glycoside hydrolase enzymes (for example some [[GH31]]s) and the crystal structure of a [[GH31]] reveals the PA14 domain is closely associated with the catalytic module, on the side of the substrate-binding cleft, potentially facilitating the binding of longer oligosaccharides <cite>Larsbrink2011</cite>. It has also been described as a domain integrated into the core of some [[GH3]] glycoside hydrolase modules. In one example, the [[GH3]] integrated PA14 domain demonstrates carbohydrate-binding function and acts to block the active site cleft, thus conferring substrate specificity for disaccharide substrates <cite>Yoshida2010</cite>. Similarly, in a [[GH2]] mannosidase, the PA14 domain determines exo- rather than endo-activity for the catalytic module <cite>Tailford2007</cite>. Evidently, more research needs to go into the structure and function of these domains as they are found in a wide variety of polypeptide sequences and the functions of the PA14 domains may be diverse.    They have not yet been classified into the CAZy classification system, though they are mentioned here as the domains have been referred to as CBMs in the literature <cite>Taylor2014</cite>.
 
Unrelated sugar-binding proteins have converged on similar biochemical mechanisms of saccharide recognition <cite>Taylor2014</cite>. The direct interaction of Ca<sup>2+</sup> ions with saccharides in sugar binding sites was first described in C-type animal lectins <cite>Weis1992</cite>, so named because of their sugar-binding requirement for Ca<sup>2+</sup>. Other sugar-binding proteins that also require Ca<sup>2+</sup> for binding include yeast flocculation proteins <cite>Veelders2010</cite> and other yeast adhesins <cite>Maestre-Reyna2012, Ielasi2012</cite>, and two CBM families, [[CBM36]] and [[CBM60]] <cite>Montanier2010</cite>.
 
Several lectins <cite>SharonLis2004 SharonLis2007</cite> are classified as CBMs in the [http://www.cazy.org/Carbohydrate-Binding-Modules.html Carbohydrate Active enZyme database] as they share amino acid sequence similarity, exhibit similar folds and display similar carbohydrate binding properties. For example, ricin toxin B chain from ''Ricinus communis'' resides in family [[CBM13]], while wheat germ agglutinin (WGA) can be found in family [[CBM18]]. The human lectin malectin is classified as family [[CBM57]] and plays a role in N-linked glycan processing of polypeptides in the endoplasmic reticulum <cite> Shallus2008 Galli2011</cite>. CBMs may also share properties with lectins that are not (yet) incorporated in the [http://www.cazy.org/Carbohydrate-Binding-Modules.html Carbohydrate Active enZyme database]. For example, the fucose-specific ''Anguilla anguilla'' lectin AAA was described as similar to Type C CBMs found in family [[CBM6]] and [[CBM32]] <cite>Boraston20032</cite> and is now classified as a [[CBM47]] <cite>Boraston2006</cite>. Lectins which are classified as CBMs are incorporated into a family because they were found to share amino acid sequence identity with a known CBM appended to a carbohydrate-active enzyme. A brief historical overview of the discovery and characterization of lectins is available <cite>SharonLis2004</cite> as is a review describing the convergent and divergent mechanisms of sugar recognition across the kingdoms of life <cite>Taylor2014</cite>.
  
 
Agglutination is the process in which particles suspended in a liquid collect into clumps, such as occurs in a serologic response to a specific antibody. The most prominent feature that is generally considered to separate CBMs from lectins is the involvement of lectins in agglutination of sugar-containing molecules or glycoconjugates. Lectins exploit multivalency, often forming quaternary structures as homodimers, trimers or tetramers with several binding sites which then agglutinate the target glycoconjugate <cite>SharonLis2004 SharonLis2007</cite>. Few studies have been done on the agglutinating effects of CBMs or CBM tandems; however, a [[CBM26]]/[[CBM25]] pair from ''Bacillus halodurans'' is described as strongly agglutinating on soluble amylopectin (and pullulan), suggesting multivalent binding of the individual CBMs to sites on separate glucan chains <cite>Boraston2006</cite>. CBMs individually are not known to be directly involved in the formation of quaternary structures and are not known to have agglutinating properties - in common with sugar-recognition modules of all glycan-binding proteins, including lectins <cite>Taylor2014</cite>. Other examples of CBMs participating in quaternary structures but not directly implicated in quaternary structure formation are found in cellulosome complexes <cite>Freelove2001 Poole1992 Morag1995</cite> and in some secreted pathogenic bacterial enzyme complexes <cite>Adams2008 Ficko2009</cite> where complex formation is mediated through specific cohesin-dockerin module interactions.
 
Amino acid sequence-based classification of a CBM family may lead to the incorporation of other non-catalytic-associated CBMs within a given family. Some examples of families containing CBMs without appended catalytic modules include those with lectins (such as tachycitin ([[CBM14]]), wheat germ agglutinin ([[CBM18]]), fucolectin ([[CBM47]]), and malectin ([[CBM57]])), and those with periplasmic solute binding proteins ([[CBM32]]). Interestingly, the lectin ricin B chain ([[CBM13]]), while not on the same polypeptide chain, is covalently linked through a disulfide bond to the ricin A chain with its N-glycosidase activity <cite>Lewis1986</cite>.  The ricin A chain  N-glycosidase cleaves a specific adenine from the pentose ribose in ribosomal RNA <cite>Endo1987</cite>. Finally, [[CBM29]] is a family with only two members, which have no appended catalytic modules; however, the function of these CBMs is to target the catalytic cellulosome machinery to substrate <cite>Freelove2001</cite>.   
  
 
==Studying CBM-ligand Interactions==
 
A review on approaches to studying the binding function of carbohydrate-binding modules is available <cite>Abbott2012</cite>. Typically, molecular biology techniques are used to overproduce a CBM protein in a host strain such as ''Escherichia coli'' which is then isolated and purified. Initial screening for carbohydrate binding interactions can be performed using techniques such as microarrays <cite>vanBueren2007</cite> or fluorescence microscopy <cite>vanBueren2007 McCartney2006 Herve2010</cite>. Several approaches can be taken to verify and quantify CBM-polysaccharide interaction, including affinity gel electrophoresis, UV difference and fluorescence spectroscopy, solid state depletion assay, and isothermal titration calorimetry <cite>Lammerts2004</cite>. Demonstration of carbohydrate binding function by CBMs is essential to understanding the biological role of these non-catalytic modules.  
  
 
==Biotechnological applications of CBMs==
 
Line 209: Line 205:
 
#Meekins2014 pmid=24799671
#Rigden2004 pmid=15236739
#Yoshida2010 pmid=20662765
#Larsbrink2011 pmid=21426303
#Tailford2007 pmid=17287210
#Ficko-Blean2012 pmid=22858095
#McLean2002 pmid=12191997
#Jervis1997 pmid=9295354
#Ficko-Blean2006 pmid=16990278
#king2014 pmid=25210925
 
</biblio>
 
</biblio>
  
 
[[Category:Definitions and explanations]]
 

Latest revision as of 01:52, 9 January 2023


This page has been approved by the Responsible Curator as essentially complete. CAZypedia is a living document, so further improvement of this page is still possible. If you would like to suggest an addition or correction, please contact the page's Responsible Curator directly by e-mail.


Overview

Figure 1. An example of modularity in a CBM-containing glycoside hydrolase. Sialidase from Micromonospora viridifaciens contains an N-terminal CBM32 (red) X20 linker (yellow) and a C-terminal catalytic GH33 module (green) [1]. Graphical representation of modularity in amino acid sequence (top) and 3D crystal structure (bottom) PDB ID 1eut.

Carbohydrate-binding modules (CBMs) [2, 3, 4, 5, 6, 7] are a class of sugar-binding proteins that comprise amino acid sequences within a larger encoded protein sequence that fold into a structurally discrete module, typically forming part of a larger multi-modular enzyme [8] (Figure 1). The conventional role of a CBM is to bind to carbohydrate ligand and direct the catalytic machinery onto its substrate, thus enhancing the catalytic efficiency of the multimodular carbohydrate-active enzyme; however, there are several key exceptions of divergent evolution in the functions of CBMs [9] which are discussed below. The individual CBMs are themselves devoid of any catalytic activity and are most commonly associated with Glycoside Hydrolases but have also been identified in Polysaccharide Lyases, polysaccharide oxidases, Glycosyltransferases, plant cell wall-binding expansins [10] and in some lectins [9].

CBMs themselves do not undergo any conformational changes when binding ligand. Rather, the topography of the carbohydrate-binding site is preformed to be complementary to the shape of the target ligand (see Types). This is achieved by the presence of amino acid side chains present on the CBM binding surface or within the CBM binding cleft or pocket. Multimodular enzymes that include CBMs may as a whole be quite flexible and undergo significant conformational changes when binding substrate. Flexible Ser-Thr-Pro sequences often link adjacent modules and can allow for shifts in the orientation and direction of the catalytic module with respect to the CBM on the target substrate. In other enzymes the linking regions may be quite rigid, such as the 5-helical bundle linker module linking a CBM32 to a GH84 module [11].

History of CBMs

CBMs were initially characterized as cellulose binding domains (CBDs) in cellobiohydrolases CBHI and CBHII from Trichoderma reesei [12, 13] and cellulases CenA and CexA from Cellulomonas fimi [14]. Limited proteolysis experiments on these enzymes yielded truncated enzyme products that showed a reduced or complete loss in their ability to hydrolyze cellulose substrates. The reduction in enzymatic activity was attributed to the loss of ~100 amino acid C-terminal domains, which prevented the adsorption of the enzymes onto cellulose substrate. Thus it was proposed that these independent "domains" are critical for targeting the enzymes onto their substrates and enhancing their hydrolytic activity. It rapidly became evident that CBDs were not only appended to cellulases but were also found in a range of other plant cell wall degrading enzymes [15, 16, 17].

Initially, CBDs were categorized into 13 Types based on amino acid sequence similarities [18]. This classification system became complicated when similar functional domains from non-cellulolytic carbohydrate-active enzymes were discovered that did not bind cellulose but met all of the criteria of a CBD (for example see [19]). The term carbohydrate-binding module was proposed to solve this problem to be inclusive of all ancillary modules with non-catalytic carbohydrate-binding function (for a review see [2]). Since this time, CBMs have been found appended to enzymes that interact with almost all characterized carbohydrate materials found on Earth (Table 1).

Table 1: List of Carbohydrates and Interacting CBM Families (a)

Carbohydrate | Interacting CBM families
Cellulose | CBM1, CBM2, CBM3, CBM4, CBM6, CBM8, CBM9, CBM10, CBM16, CBM17, CBM28, CBM30, CBM37, CBM44, CBM46, CBM49, CBM59, CBM63, CBM64
Xylan | CBM2, CBM4, CBM6, CBM9, CBM13, CBM15, CBM22, CBM31, CBM35, CBM36, CBM37, CBM44, CBM54, CBM59, CBM60
Plant Cell Wall - Other (e.g. beta-glucans, pectins, mannans, gluco- and galacturonans) | CBM4, CBM6, CBM11, CBM13, CBM16, CBM22, CBM23, CBM27, CBM28, CBM29, CBM32, CBM35, CBM39, CBM42, CBM43, CBM52, CBM56, CBM59, CBM61, CBM62, CBM65, CBM67
Chitin | CBM1, CBM2, CBM3, CBM5, CBM12, CBM14, CBM18, CBM19, CBM37, CBM50, CBM54, CBM55
Alpha-glucans (starch/glycogen, mutan) | CBM20, CBM21, CBM24, CBM25, CBM26, CBM34, CBM41, CBM45, CBM48, CBM53, CBM58
Mammalian Glycans | CBM32, CBM40, CBM47, CBM51, CBM57 (b)
Algal (seaweed) Saccharides (e.g. porphyran, agarose, carrageenan, alginate, laminarin) | CBM6, CBM16
Other | Bacterial cell wall sugars: CBM35, CBM39, CBM50; Fructans: CBM38, CBM66; Yeast cell wall glucans: CBM54

(a) Based on the Carbohydrate Active enZyme database. CBM7 is a deleted entry and CBM33 is now reclassified as Auxiliary Activities family AA10.
(b) Only the human lectin malectin has been characterized; however, a search based on amino acid sequence similarities found that similar modules are appended to many uncharacterized glycoside hydrolases [20].

Classification

Sequence Based Classification

Carbohydrate-binding modules are classified into many tens of families based on amino acid sequence similarities (a continually updated list is available in the Carbohydrate Active enZyme database). These families often cluster modules with similar structural folds and carbohydrate-binding function. However, there are several families that exhibit diversity in the carbohydrate ligands they target (Table 1).

Fold

Figure 2. Classical CBM beta-sandwich fold. C-terminal family CBM27 from Thermotoga maritima mannanase, a Type B CBM (A)(side and front view, PDB ID 1OF4) [21] and C-terminal family CBM6 from Clostridium stercorarium xylanase (B) (PDB ID 1NAE) [22] showing binding sites on the face (A) and loop region (B) of the beta sandwich fold respectively.

CBMs fall into one of 7 fold families [2]. The most common fold exhibited by CBMs is the beta-sandwich fold comprised of two overlapping beta-sheets each consisting of three to six antiparallel beta strands (Figure 2). The ligand binding site may be located on one face of the beta-sheet (Figure 2A) or may be positioned within the variable loop region of the beta-sheet (Figure 2B). There are examples of CBMs in the beta-sandwich fold family exhibiting dual binding sites such as CBM6 [23] and dual starch-binding sites in CBM20 [24]. Other fold families include the beta-trefoil fold, cystine knot, OB fold, the hevein and hevein-like and unique folds [2]. CBMs of the beta-trefoil fold family (CBM13, CBM42) present multivalent sugar-binding sites, as demonstrated for their interaction with xylan and arabinoxylan respectively [25].

Types

Figure 3. CBM Types. (A) Schematic of different CBM Types binding with different regions of a polysaccharide substrate. (B) Type A CBM2b from Pyrococcus furiosus GH18 chitinase (PDB ID 2CRW) [26]. Aromatic side chains of Type A CBMs form the planar binding surface.

CBMs are classified into three main Types defined by the shape and degree of polymerization of their target ligand (Figure 3A). The architecture of the binding site determines what region within a saccharide macromolecule the enzyme will target. The classification of CBM Types is as follows [2, 6, 7]:

  • Type A: bind to crystalline surfaces of the polysaccharides cellulose and chitin (example families CBM1, CBM2, CBM3, CBM5, CBM10). Their binding sites are planar and rich in aromatic amino acid residues creating a flat platform to bind to the planar polycrystalline chitin/cellulose surface (Figure 3B). Type A CBMs are unique and differ significantly from Type B or C.
  • Type B: bind internal glycan chains (endo-type). Type B are the most abundant form of CBMs reported to date. Type B binding sites appear as extended grooves or clefts comprised of binding subsites to generally accommodate longer sugar chains with four or more monosaccharide units (see Figure 2A for an example). There are some examples of CBMs, in families CBM6, CBM13, CBM20, CBM36 and CBM60, that contain two subsites.
  • Type C: bind termini of glycans (reducing/non-reducing ends, exo-type). Type C binding sites are short pockets for recognizing short sugar ligands containing one to three monosaccharide units (example families CBM9, CBM13, CBM32, CBM47, CBM66, CBM67). Families containing Type C CBMs are considered 'lectin-like' and may include lectins and CBMs with no appended catalytic modules as members.

Properties of CBM Carbohydrate Binding Interactions

Functional Roles of CBMs

CBMs carry out four main functional roles:

  • Targeting Effect: CBMs target the enzyme to distinct regions on a saccharide substrate (reducing end, non-reducing end, internal polysaccharide chains), depending on the architecture of its binding site (see Types).
  • Proximity Effect: CBMs increase the concentration of enzyme in close proximity to its saccharide substrate. This leads to more rapid and efficient substrate degradation.

An excellent example demonstrating targeting and proximity effects of plant cell wall specific CBMs is available [27].

  • Disruptive Effect: Some CBMs have been shown to disrupt the surface of tightly packed polysaccharides, such as cellulose fibres and starch granules, causing the substrate to loosen and become more exposed to the catalytic module for more efficient degradation. Disruptive roles have been described for cellulose binding CBM2a [28] and CBM44 [29]. Dual starch-binding domains of family CBM20 from Aspergillus niger glucoamylase have been shown to disrupt the surface of starch [30] while dual-associated CBM41 modules may have a disruptive role in degrading glycogen granules [31]. CBM33 was thought to have a disruptive effect on chitin, however these have now been reclassified as copper-dependent lytic polysaccharide monooxygenases [32] and are found in CAZy Auxiliary Activity Family 10.
  • Adhesion: CBMs have been shown to adhere enzymes onto the surface of bacterial cell wall components while exhibiting catalytic activity on an external neighboring carbohydrate substrate. For example, CBM35 modules have been shown to interact with the surface glucuronic acid containing sugars in the cell wall of Amycolatopsis orientalis while the catalytic module is active on external chitosan likely originating from the cell wall of competing soil fungal species [33]. Streptococcus pneumoniae uses a CBM71 as an adhesin, to mediate adherence to host cell surfaces displaying lactose or N-acetyllactosamine [34].

There are examples in the literature of CBMs extending the active site sub-sites of their appended glycosidase modules. The glycogen-degrading pneumococcal virulence factor SpuA has its active site extended by one of two tandem CBM41s [35]. The glucan starch phosphatase Starch Excess4 has its active site extended by a CBM48 [36].

Several lectins are classified as CBMs even though they are not on the same polypeptide chain as a carbohydrate-active enzyme. See Blurred Lines: CBMs, Lectins and Outliers for a more complete discussion.

Driving Forces of CBM-Carbohydrate Interactions

There are two key features that drive CBM-carbohydrate interactions. Extensive hydrogen bonding occurs between the hydroxyl groups of carbohydrate ligands and polar amino acid residues within the binding site. Additional water-mediated hydrogen bonding networks between these groups can also be found in the binding site. By far the most important characteristic driving force mediating protein-carbohydrate interactions is the position and orientation of aromatic amino acid residues (Trp, Tyr and sometimes Phe) within the binding site. These essential planar residues provide a hydrophobic platform for the planar face of sugar rings, an interaction resembling hydrophobic stacking interactions. Weak intermolecular electrostatic interactions occur between C-H and pi electrons in the planar ring systems and contribute 1.5 - 2.5 kcal/mol energy to the binding reaction [37]. CBMs may also use coordinated metal ions within the binding site to directly interact with their target ligand. For example, families CBM36 [38] and CBM60 [39] exhibit calcium-dependent binding to xylooligosaccharides.

CBM-carbohydrate interactions in general are quite weak (Ka affinities in the mM⁻¹ to µM⁻¹ range), making the interaction easily reversible. This feature allows for "recycling" of the appended enzyme to bind to a new region on the substrate once catalysis has been completed at a given site. Some CBMs that bind crystalline ligands, typified by CBM2a, bind with apparent irreversibility (they do not desorb when free CBM is diluted), displaying surface mobility and exchanging with free CBM [40, 41]. Multivalent effects (more than one saccharide-binding site or multiple CBMs within the polypeptide) may act to increase the overall affinity relative to a single binding site interaction.
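
As a rough numerical illustration of how weak these interactions are, the association constants quoted above can be converted to standard binding free energies with the general thermodynamic relation ΔG° = -RT ln Ka. The sketch below is not taken from the cited studies; the temperature of 298.15 K, the 1 M standard state and the function name binding_free_energy are assumptions made purely for illustration.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def binding_free_energy(ka_per_molar, temperature_k=298.15):
    """Return the standard binding free energy in kcal/mol.

    Uses dG = -RT ln(Ka), with Ka given in M^-1 (1 M standard state).
    The default temperature (25 degrees C) is an assumption.
    """
    if ka_per_molar <= 0:
        raise ValueError("Ka must be positive")
    return -R_KCAL * temperature_k * math.log(ka_per_molar)


# A Ka of 1 mM^-1 equals 1e3 M^-1; a Ka of 1 uM^-1 equals 1e6 M^-1.
weak = binding_free_energy(1e3)
tight = binding_free_energy(1e6)
print(f"Ka = 1e3 M^-1: {weak:.1f} kcal/mol")   # -4.1 kcal/mol
print(f"Ka = 1e6 M^-1: {tight:.1f} kcal/mol")  # -8.2 kcal/mol
```

Under these assumptions the quoted affinity range corresponds to roughly -4 to -8 kcal/mol of binding free energy.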

CBM Promiscuity

Because of the diversity of carbohydrate structures and motifs found in plant and mammalian glycans, some CBMs have evolved to recognize more than one type of monosaccharide or glycosidic bond linkage within the binding pocket, a feature called CBM promiscuity. For example, a family CBM32 from Clostridium perfringens NagH binds N-acetyl-glucosamine in the primary subsite but can accommodate N-acetyl-galactosamine or mannose in the secondary site [42]. There are several examples of ligand promiscuity within family CBM32. Plant cell wall-recognizing CBMs are often able to accommodate both cellulose and hemicelluloses. For example, several family CBM6 members interact with cellulose, xylose or laminarin [22, 43]. Family CBM41 appended to a GH13 pullulanase can accommodate both alpha-1,4- and alpha-1,6-linked glucose found in amylopectin (from starch/glycogen) [44]. The flexibility in carbohydrate recognition by CBMs contributes to the targeting efficiency of carbohydrate-active enzymes in environments where there is a diverse range of saccharides present (such as the plant cell wall or mammalian tissues).

CBMs and Multivalency

Multivalency is the collective strength of several interactions with a given ligand. Because CBM-carbohydrate interactions are relatively weak, some carbohydrate-active enzymes, mainly glycoside hydrolases, have developed ways to increase their interaction with substrate via a multivalent effect. Individually, some CBMs may contain multiple binding sites to form a multivalent interaction with their target ligand, although this form of multivalency is quite rare (for example CBM6, CBM13 and CBM20). More commonly, glycoside hydrolases may contain more than one CBM within their modular architecture, either arranged in tandem or at opposing N and C terminal ends of the protein sequence, or both. These CBMs may target the same carbohydrate ligand, different regions within the same ligand, or different ligands in a complex saccharide amalgam. A multivalent interaction enhances the overall affinity of an enzyme for its substrate. Furthermore, tandem CBMs may cooperatively target the enzyme towards specific saccharide regions based on their ligand specificity and the orientation and position of the binding sites with respect to one another.

Blurred Lines: CBMs, Lectins and Outliers

While CBMs are generally considered to be discrete entities within a polypeptide chain, there are some exceptions. The glycogen-degrading pneumococcal virulence factor SpuA has its active site extended by one of two tandem CBM41s [35] and the glucan starch phosphatase Starch Excess4 has its active site extended by a CBM48 [36]. Thus, the full biological contribution to carbohydrate-binding within the polypeptide is contributed by a multivalent interaction as an extension of the catalytic module's carbohydrate-binding properties. The PA14 domain is found in bacterial toxins, enzymes, adhesins and signaling molecules [45]. It has been described as appended to the polypeptide sequence of some glycoside hydrolase enzymes (for example some GH31s) and the crystal structure of a GH31 reveals the PA14 domain is closely associated with the catalytic module, on the side of the substrate-binding cleft, potentially facilitating the binding of longer oligosaccharides [46]. It has also been described as a domain integrated into the core of some GH3 glycoside hydrolase modules. In one example, the GH3 integrated PA14 domain demonstrates carbohydrate-binding function and acts to block the active site cleft, thus conferring substrate specificity for disaccharide substrates [47]. Similarly, in a GH2 mannosidase, the PA14 domain determines exo- rather than endo-activity for the catalytic module [48]. Evidently, more research needs to go into the structure and function of these domains as they are found in a wide variety of polypeptide sequences and the functions of the PA14 domains may be diverse. They have not yet been classified into the CAZy classification system, though they are mentioned here as the domains have been referred to as CBMs in the literature [9].

Unrelated sugar-binding proteins have converged on similar biochemical mechanisms of saccharide recognition [9]. The direct interaction of Ca2+ ions with saccharides in sugar binding sites was first described in C-type animal lectins [49], so named because of their sugar-binding requirement for Ca2+. Other sugar-binding proteins that also require Ca2+ for binding include yeast flocculation proteins [50] and other yeast adhesins [51, 52], and two CBM families, CBM36 and CBM60 [39].

Several lectins [53, 54] are classified as CBMs in the Carbohydrate Active enZyme database as they share amino acid sequence similarity, exhibit similar folds and display similar carbohydrate binding properties. For example, ricin toxin B chain from Ricinus communis resides in family CBM13, while wheat germ agglutinin (WGA) can be found in family CBM18. The human lectin malectin is classified as family CBM57 and plays a role in N-linked glycan processing of polypeptides in the endoplasmic reticulum [20, 55]. CBMs may also share properties with lectins that are not (yet) incorporated in the Carbohydrate Active enZyme database. For example, the fucose-specific Anguilla anguilla lectin AAA was described as similar to Type C CBMs found in family CBM6 and CBM32 [22] and is now classified as a CBM47 [56]. Lectins which are classified as CBMs are incorporated into a family because they were found to share amino acid sequence identity with a known CBM appended to a carbohydrate-active enzyme. A brief historical overview of the discovery and characterization of lectins is available [53] as is a review describing the convergent and divergent mechanisms of sugar recognition across the kingdoms of life [9].

Agglutination is the clumping of particles suspended in a liquid, such as occurs in a serologic response to a specific antibody. The most prominent feature that is generally considered to separate CBMs from lectins is the involvement of lectins in the agglutination of sugar-containing molecules or glycoconjugates. Lectins exploit multivalency, often forming quaternary structures as homodimers, trimers or tetramers with several binding sites, which then agglutinate the target glycoconjugate [53, 54]. Few studies have been done on the agglutinating effects of CBMs or CBM tandems; however, a CBM26/CBM25 pair from Bacillus halodurans has been described as strongly agglutinating soluble amylopectin (and pullulan), suggesting multivalent binding of the individual CBMs to sites on separate glucan chains [56]. Individual CBMs are not known to be directly involved in the formation of quaternary structures or to have agglutinating properties, in common with the sugar-recognition modules of all glycan-binding proteins, including lectins [9]. Examples of CBMs that participate in quaternary structures without being directly implicated in their formation are found in cellulosome complexes [57, 58, 59] and in some enzyme complexes secreted by pathogenic bacteria [11, 60], where complex formation is mediated through specific cohesin-dockerin module interactions.

Amino acid sequence-based classification of a CBM family may lead to the incorporation of CBMs that are not associated with a catalytic module. Some examples of families containing CBMs without appended catalytic modules include those with lectins (such as tachycitin (CBM14), wheat germ agglutinin (CBM18), fucolectin (CBM47), and malectin (CBM57)), and those with periplasmic solute-binding proteins (CBM32). Interestingly, the lectin ricin B chain (CBM13), while not on the same polypeptide chain, is covalently linked through a disulfide bond to the ricin A chain, which has N-glycosidase activity [61]. The ricin A chain N-glycosidase cleaves a specific adenine from its ribose in ribosomal RNA [62]. Finally, CBM29 is a family with only two members, which have no appended catalytic modules; however, the function of these CBMs is to target the catalytic cellulosome machinery to the substrate [57].

Studying CBM-ligand Interactions

A review on approaches to studying the binding function of carbohydrate-binding modules is available [63]. Typically, molecular biology techniques are used to overproduce a CBM in a host strain such as Escherichia coli; the protein is then isolated and purified. Initial screening for carbohydrate-binding interactions can be performed using techniques such as microarrays [31] or fluorescence microscopy [27, 31, 64]. Several approaches can be taken to verify and quantify CBM-polysaccharide interactions, including affinity gel electrophoresis, UV difference and fluorescence spectroscopy, solid state depletion assays, and isothermal titration calorimetry [65]. Demonstration of carbohydrate-binding function by CBMs is essential to understanding the biological role of these non-catalytic modules.
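As an illustration of the quantification step, the sketch below fits a one-site (Langmuir-type) binding isotherm to depletion-assay data, using only the Python standard library. The choice of a one-site model is an assumption made here for illustration; the text above does not prescribe a particular binding model. The function names (langmuir, fit_langmuir), the units and the data values are all hypothetical.

```python
import math


def langmuir(free, bmax, kd):
    """One-site model: bound CBM as a function of free CBM concentration."""
    return bmax * free / (kd + free)


def _best_bmax(free, bound, kd):
    # For a fixed Kd the model is linear in Bmax, so the least-squares
    # estimate of Bmax has a closed form.
    x = [f / (kd + f) for f in free]
    return sum(xi * bi for xi, bi in zip(x, bound)) / sum(xi * xi for xi in x)


def _sse(free, bound, kd):
    # Sum of squared residuals at this Kd, using the best Bmax for it.
    bmax = _best_bmax(free, bound, kd)
    return sum((b - langmuir(f, bmax, kd)) ** 2 for f, b in zip(free, bound))


def fit_langmuir(free, bound, kd_min=1e-3, kd_max=1e3, iterations=200):
    """Golden-section search over log10(Kd); returns (bmax, kd).

    Assumes the true Kd lies between kd_min and kd_max and that all
    free concentrations are positive.
    """
    lo, hi = math.log10(kd_min), math.log10(kd_max)
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iterations):
        a = hi - phi * (hi - lo)
        b = lo + phi * (hi - lo)
        if _sse(free, bound, 10 ** a) < _sse(free, bound, 10 ** b):
            hi = b
        else:
            lo = a
    kd = 10 ** ((lo + hi) / 2)
    return _best_bmax(free, bound, kd), kd


# Hypothetical, noise-free example data: free CBM (µM) remaining in the
# supernatant and bound CBM (µmol per g polysaccharide).
free_um = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
bound_umol_per_g = [langmuir(f, 10.0, 5.0) for f in free_um]

bmax_fit, kd_fit = fit_langmuir(free_um, bound_umol_per_g)
print(f"Bmax = {bmax_fit:.3f} µmol/g, Kd = {kd_fit:.3f} µM")
```

With real measurements the bound amount is usually obtained by subtracting the free concentration from the total added, and a dedicated non-linear regression package that also reports parameter uncertainties would normally be preferred over this hand-rolled search.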

Biotechnological applications of CBMs

CBMs and their carbohydrate-binding properties are used in many different biotechnological applications. Below is a non-exhaustive list of examples:

  • Features of CBMs are currently being exploited to create designer CAZymes with enhanced or modified carbohydrate recognition functions [66, 67, 68, 69].
  • Family CBM9 can be used as an affinity tag to purify tagged proteins on a cellulose-based affinity column [70].
  • CBMs are used as molecular probes to detect the presence of specific carbohydrate motifs in plant [27, 64] and mammalian tissues [44, 56].
  • CBMs are used in fibre modification. Engineered CBMs have been shown to increase the strength of cellulose pulp in paper-making processes [71, 72], to crosslink polysaccharide fibres for biomaterials [73] and to modify cotton fibres [74].
  • There are several examples of CBMs being used to immobilize whole cells onto carbohydrate surfaces [75, 76, 77].
  • CBMs are used to enhance bioprocessing enzymes for industrial uses in pulp processing and biofuel production [29, 78, 79].
  • Starch-binding CBMs fused to the transglucosylating GH13 enzyme CGTase created a fusion enzyme with more efficient transglucosylating activity on soluble starch, which is important for industrial biotransformation processes [80].


References

  1. Gaskell A, Crennell S, and Taylor G. (1995). The three domains of a bacterial sialidase: a beta-propeller, an immunoglobulin module and a galactose-binding jelly-roll. Structure. 1995;3(11):1197-205. DOI:10.1016/s0969-2126(01)00255-6 | PubMed ID:8591030 [Gaskell1995]
  2. Boraston AB, Bolam DN, Gilbert HJ, and Davies GJ. (2004). Carbohydrate-binding modules: fine-tuning polysaccharide recognition. Biochem J. 2004;382(Pt 3):769-81. DOI:10.1042/BJ20040892 | PubMed ID:15214846 [Boraston2004]
  3. Hashimoto H (2006). Recent structural studies of carbohydrate-binding modules. Cell Mol Life Sci. 2006;63(24):2954-67. DOI:10.1007/s00018-006-6195-3 | PubMed ID:17131061 [Hashimoto2006]
  4. Shoseyov O, Shani Z, and Levy I. (2006). Carbohydrate binding modules: biochemical properties and novel applications. Microbiol Mol Biol Rev. 2006;70(2):283-95. DOI:10.1128/MMBR.00028-05 | PubMed ID:16760304 [Shoseyov2006]
  5. Guillén D, Sánchez S, and Rodríguez-Sanoja R. (2010). Carbohydrate-binding domains: multiplicity of biological roles. Appl Microbiol Biotechnol. 2010;85(5):1241-9. DOI:10.1007/s00253-009-2331-y | PubMed ID:19908036 [Guillen2010]
  6. Gilbert HJ, Knox JP, and Boraston AB. (2013). Advances in understanding the molecular basis of plant cell wall polysaccharide recognition by carbohydrate-binding modules. Curr Opin Struct Biol. 2013;23(5):669-77. DOI:10.1016/j.sbi.2013.05.005 | PubMed ID:23769966 [Gilbert2013]
  7. Armenta S, Moreno-Mendieta S, Sánchez-Cuapio Z, Sánchez S, and Rodríguez-Sanoja R. (2017). Advances in molecular engineering of carbohydrate-binding modules. Proteins. 2017;85(9):1602-1617. DOI:10.1002/prot.25327 | PubMed ID:28547780 [Armenta2017]
  8. Ficko-Blean E and Boraston AB. (2012). Insights into the recognition of the human glycome by microbial carbohydrate-binding modules. Curr Opin Struct Biol. 2012;22(5):570-7. DOI:10.1016/j.sbi.2012.07.009 | PubMed ID:22858095 [Ficko-Blean2012]
  9. Taylor ME and Drickamer K. (2014). Convergent and divergent mechanisms of sugar recognition across kingdoms. Curr Opin Struct Biol. 2014;28:14-22. DOI:10.1016/j.sbi.2014.07.003 | PubMed ID:25102772 [Taylor2014]
  10. Georgelis N, Tabuchi A, Nikolaidis N, and Cosgrove DJ. (2011). Structure-function analysis of the bacterial expansin EXLX1. J Biol Chem. 2011;286(19):16814-23. DOI:10.1074/jbc.M111.225037 | PubMed ID:21454649 [Georgelis2011]
  11. Ficko-Blean E, Gregg KJ, Adams JJ, Hehemann JH, Czjzek M, Smith SP, and Boraston AB. (2009). Portrait of an enzyme, a complete structural analysis of a multimodular {beta}-N-acetylglucosaminidase from Clostridium perfringens. J Biol Chem. 2009;284(15):9876-84. DOI:10.1074/jbc.M808954200 | PubMed ID:19193644 [Ficko2009]
  12. Van Tilbeurgh H, Tomme P, Claeyssens M, Bhikhabhai R, and Pettersson G. (1986). Limited proteolysis of the cellobiohydrolase I from Trichoderma reesei. FEBS Lett. 1986;204:223-227. DOI:10.1016/0014-5793(86)80816-X [VanTilbeurgh1986]
  13. Tomme P, Van Tilbeurgh H, Pettersson G, Van Damme J, Vandekerckhove J, Knowles J, Teeri T, and Claeyssens M. (1988). Studies of the cellulolytic system of Trichoderma reesei QM 9414. Analysis of domain function in two cellobiohydrolases by limited proteolysis. Eur J Biochem. 1988;170(3):575-81. DOI:10.1111/j.1432-1033.1988.tb13736.x | PubMed ID:3338453 [Tomme1988]
  14. Gilkes NR, Warren RA, Miller RC Jr, and Kilburn DG. (1988). Precise excision of the cellulose binding domains from two Cellulomonas fimi cellulases by a homologous protease and the effect on catalysis. J Biol Chem. 1988;263(21):10401-7. PubMed ID:3134347 [Gilkes1988]
  15. Kellett LE, Poole DM, Ferreira LM, Durrant AJ, Hazlewood GP, and Gilbert HJ. (1990). Xylanase B and an arabinofuranosidase from Pseudomonas fluorescens subsp. cellulosa contain identical cellulose-binding domains and are encoded by adjacent genes. Biochem J. 1990;272(2):369-76. DOI:10.1042/bj2720369 | PubMed ID:2125205 [Kellett1990]
  16. Ferreira LM, Durrant AJ, Hall J, Hazlewood GP, and Gilbert HJ. (1990). Spatial separation of protein domains is not necessary for catalytic activity or substrate binding in a xylanase. Biochem J. 1990;269(1):261-4. DOI:10.1042/bj2690261 | PubMed ID:2115772 [Ferriera1990]
  17. Ferreira LM, Wood TM, Williamson G, Faulds C, Hazlewood GP, Black GW, and Gilbert HJ. (1993). A modular esterase from Pseudomonas fluorescens subsp. cellulosa contains a non-catalytic cellulose-binding domain. Biochem J. 1993;294(Pt 2):349-55. DOI:10.1042/bj2940349 | PubMed ID:8373350 [Ferriera1993]
  18. Tomme P, Warren RA, Miller RC Jr, Kilburn DG, and Gilkes NR. (1995). Cellulose-binding domains: classification and properties. In: Enzymatic Degradation of Insoluble Polysaccharides (Saddler JN and Penner M, eds.), pp. 142-163. American Chemical Society, Washington. [Tomme1995]
  19. Svensson B, Jespersen H, Sierks MR, and MacGregor EA. (1989). Sequence homology between putative raw-starch binding domains from different starch-degrading enzymes. Biochem J. 1989;264(1):309-11. DOI:10.1042/bj2640309 | PubMed ID:2481445 [Svensson1989]
  20. Schallus T, Jaeckh C, Fehér K, Palma AS, Liu Y, Simpson JC, Mackeen M, Stier G, Gibson TJ, Feizi T, Pieler T, and Muhle-Goll C. (2008). Malectin: a novel carbohydrate-binding protein of the endoplasmic reticulum and a candidate player in the early steps of protein N-glycosylation. Mol Biol Cell. 2008;19(8):3404-14. DOI:10.1091/mbc.e08-04-0354 | PubMed ID:18524852 [Shallus2008]
  21. Boraston AB, Revett TJ, Boraston CM, Nurizzo D, and Davies GJ. (2003). Structural and thermodynamic dissection of specific mannan recognition by a carbohydrate binding module, TmCBM27. Structure. 2003;11(6):665-75. DOI:10.1016/s0969-2126(03)00100-x | PubMed ID:12791255 [Boraston20031]
  22. Boraston AB, Notenboom V, Warren RA, Kilburn DG, Rose DR, and Davies G. (2003). Structure and ligand binding of carbohydrate-binding module CsCBM6-3 reveals similarities with fucose-specific lectins and "galactose-binding" domains. J Mol Biol. 2003;327(3):659-69. DOI:10.1016/s0022-2836(03)00152-9 | PubMed ID:12634060 [Boraston20032]
  23. Pires VM, Henshaw JL, Prates JA, Bolam DN, Ferreira LM, Fontes CM, Henrissat B, Planas A, Gilbert HJ, and Czjzek M. (2004). The crystal structure of the family 6 carbohydrate binding module from Cellvibrio mixtus endoglucanase 5a in complex with oligosaccharides reveals two distinct binding sites with different ligand specificities. J Biol Chem. 2004;279(20):21560-8. DOI:10.1074/jbc.M401599200 | PubMed ID:15010454 [Pires2004]
  24. Lawson CL, van Montfort R, Strokopytov B, Rozeboom HJ, Kalk KH, de Vries GE, Penninga D, Dijkhuizen L, and Dijkstra BW. (1994). Nucleotide sequence and X-ray structure of cyclodextrin glycosyltransferase from Bacillus circulans strain 251 in a maltose-dependent crystal form. J Mol Biol. 1994;236(2):590-600. DOI:10.1006/jmbi.1994.1168 | PubMed ID:8107143 [Lawson1994]
  25. Fujimoto Z (2013). Structure and function of carbohydrate-binding module families 13 and 42 of glycoside hydrolases, comprising a β-trefoil fold. Biosci Biotechnol Biochem. 2013;77(7):1363-71. DOI:10.1271/bbb.130183 | PubMed ID:23832347 [Fujimoto2013]
  26. Nakamura T, Mine S, Hagihara Y, Ishikawa K, Ikegami T, and Uegaki K. (2008). Tertiary structure and carbohydrate recognition by the chitin-binding domain of a hyperthermophilic chitinase from Pyrococcus furiosus. J Mol Biol. 2008;381(3):670-80. DOI:10.1016/j.jmb.2008.06.006 | PubMed ID:18582475 [Nakamura2008]
  27. Hervé C, Rogowski A, Blake AW, Marcus SE, Gilbert HJ, and Knox JP. (2010). Carbohydrate-binding modules promote the enzymatic deconstruction of intact plant cell walls by targeting and proximity effects. Proc Natl Acad Sci U S A. 2010;107(34):15293-8. DOI:10.1073/pnas.1005732107 | PubMed ID:20696902 [Herve2010]
  28. Din N, Gilkes NR, Tekant B, Miller RC Jr, Warren RA, and Kilburn DG. (1991). Non-hydrolytic disruption of cellulose fibres by the binding domain of a bacterial cellulase. Nat Biotech. 1991;9:1096-1099. DOI:10.1038/nbt1191-1096 [Din1991]
  29. Gourlay K, Arantes V, and Saddler JN. (2012). Use of substructure-specific carbohydrate binding modules to track changes in cellulose accessibility and surface morphology during the amorphogenesis step of enzymatic hydrolysis. Biotechnol Biofuels. 2012;5(1):51. DOI:10.1186/1754-6834-5-51 | PubMed ID:22828270 [Gourlay2012]
  30. Southall SM, Simpson PJ, Gilbert HJ, Williamson G, and Williamson MP. (1999). The starch-binding domain from glucoamylase disrupts the structure of starch. FEBS Lett. 1999;447(1):58-60. DOI:10.1016/s0014-5793(99)00263-x | PubMed ID:10218582 [Southall1999]
  31. van Bueren AL, Higgins M, Wang D, Burke RD, and Boraston AB. (2007). Identification and structural basis of binding to host lung glycogen by streptococcal virulence factors. Nat Struct Mol Biol. 2007;14(1):76-84. DOI:10.1038/nsmb1187 | PubMed ID:17187076 [vanBueren2007]
  32. Vaaje-Kolstad G, Westereng B, Horn SJ, Liu Z, Zhai H, Sørlie M, and Eijsink VG. (2010). An oxidative enzyme boosting the enzymatic conversion of recalcitrant polysaccharides. Science. 2010;330(6001):219-22. DOI:10.1126/science.1192231 | PubMed ID:20929773 [Vaaje2010]
  33. Montanier C, van Bueren AL, Dumon C, Flint JE, Correia MA, Prates JA, Firbank SJ, Lewis RJ, Grondin GG, Ghinet MG, Gloster TM, Herve C, Knox JP, Talbot BG, Turkenburg JP, Kerovuo J, Brzezinski R, Fontes CM, Davies GJ, Boraston AB, and Gilbert HJ. (2009). Evidence that family 35 carbohydrate binding modules display conserved specificity but divergent function. Proc Natl Acad Sci U S A. 2009;106(9):3065-70. DOI:10.1073/pnas.0808972106 | PubMed ID:19218457 [Montanier2009]
  34. Singh AK, Pluvinage B, Higgins MA, Dalia AB, Woodiga SA, Flynn M, Lloyd AR, Weiser JN, Stubbs KA, Boraston AB, and King SJ. (2014). Unravelling the multiple functions of the architecturally intricate Streptococcus pneumoniae β-galactosidase, BgaA. PLoS Pathog. 2014;10(9):e1004364. DOI:10.1371/journal.ppat.1004364 | PubMed ID:25210925 [king2014]
  35. Lammerts van Bueren A, Ficko-Blean E, Pluvinage B, Hehemann JH, Higgins MA, Deng L, Ogunniyi AD, Stroeher UH, El Warry N, Burke RD, Czjzek M, Paton JC, Vocadlo DJ, and Boraston AB. (2011). The conformation and function of a multimodular glycogen-degrading pneumococcal virulence factor. Structure. 2011;19(5):640-51. DOI:10.1016/j.str.2011.03.001 | PubMed ID:21565699 [Lammerts2011]
  36. Meekins DA, Raththagala M, Husodo S, White CJ, Guo HF, Kötting O, Vander Kooi CW, and Gentry MS. (2014). Phosphoglucan-bound structure of starch phosphatase Starch Excess4 reveals the mechanism for C6 specificity. Proc Natl Acad Sci U S A. 2014;111(20):7272-7. DOI:10.1073/pnas.1400757111 | PubMed ID:24799671 [Meekins2014]
  37. Meyer EA, Castellano RK, and Diederich F. (2003). Interactions with aromatic rings in chemical and biological recognition. Angew Chem Int Ed Engl. 2003;42(11):1210-50. DOI:10.1002/anie.200390319 | PubMed ID:12645054 [Meyer2003]
  38. Jamal-Talabani S, Boraston AB, Turkenburg JP, Tarbouriech N, Ducros VM, and Davies GJ. (2004). Ab initio structure determination and functional characterization of CBM36; a new family of calcium-dependent carbohydrate binding modules. Structure. 2004;12(7):1177-87. DOI:10.1016/j.str.2004.04.022 | PubMed ID:15242594 [Jamal2004]
  39. Montanier C, Flint JE, Bolam DN, Xie H, Liu Z, Rogowski A, Weiner DP, Ratnaparkhe S, Nurizzo D, Roberts SM, Turkenburg JP, Davies GJ, and Gilbert HJ. (2010). Circular permutation provides an evolutionary link between two families of calcium-dependent carbohydrate binding modules. J Biol Chem. 2010;285(41):31742-54. DOI:10.1074/jbc.M110.142133 | PubMed ID:20659893 [Montanier2010]
  40. McLean BW, Boraston AB, Brouwer D, Sanaie N, Fyfe CA, Warren RA, Kilburn DG, and Haynes CA. (2002). Carbohydrate-binding modules recognize fine substructures of cellulose. J Biol Chem. 2002;277(52):50245-54. DOI:10.1074/jbc.M204433200 | PubMed ID:12191997 [McLean2002]
  41. Jervis EJ, Haynes CA, and Kilburn DG. (1997). Surface diffusion of cellulases and their isolated binding domains on cellulose. J Biol Chem. 1997;272(38):24016-23. DOI:10.1074/jbc.272.38.24016 | PubMed ID:9295354 [Jervis1997]
  42. Ficko-Blean E and Boraston AB. (2009). N-acetylglucosamine recognition by a family 32 carbohydrate-binding module from Clostridium perfringens NagH. J Mol Biol. 2009;390(2):208-20. DOI:10.1016/j.jmb.2009.04.066 | PubMed ID:19422833 [Ficko20092]
  43. van Bueren AL, Morland C, Gilbert HJ, and Boraston AB. (2005). Family 6 carbohydrate binding modules recognize the non-reducing end of beta-1,3-linked glucans by presenting a unique ligand binding surface. J Biol Chem. 2005;280(1):530-7. DOI:10.1074/jbc.M410113200 | PubMed ID:15501830 [Lammerts2005]
  44. van Bueren AL and Boraston AB. (2007). The structural basis of alpha-glucan recognition by a family 41 carbohydrate-binding module from Thermotoga maritima. J Mol Biol. 2007;365(3):555-60. DOI:10.1016/j.jmb.2006.10.018 | PubMed ID:17095014 [Lammerts2007]
  45. Rigden DJ, Mello LV, and Galperin MY. (2004). The PA14 domain, a conserved all-beta domain in bacterial toxins, enzymes, adhesins and signaling molecules. Trends Biochem Sci. 2004;29(7):335-9. DOI:10.1016/j.tibs.2004.05.002 | PubMed ID:15236739 [Rigden2004]
  46. Larsbrink J, Izumi A, Ibatullin FM, Nakhai A, Gilbert HJ, Davies GJ, and Brumer H. (2011). Structural and enzymatic characterization of a glycoside hydrolase family 31 α-xylosidase from Cellvibrio japonicus involved in xyloglucan saccharification. Biochem J. 2011;436(3):567-80. DOI:10.1042/BJ20110299 | PubMed ID:21426303 [Larsbrink2011]
  47. Yoshida E, Hidaka M, Fushinobu S, Koyanagi T, Minami H, Tamaki H, Kitaoka M, Katayama T, and Kumagai H. (2010). Role of a PA14 domain in determining substrate specificity of a glycoside hydrolase family 3 β-glucosidase from Kluyveromyces marxianus. Biochem J. 2010;431(1):39-49. DOI:10.1042/BJ20100351 | PubMed ID:20662765 [Yoshida2010]
  48. Tailford LE, Money VA, Smith NL, Dumon C, Davies GJ, and Gilbert HJ. (2007). Mannose foraging by Bacteroides thetaiotaomicron: structure and specificity of the beta-mannosidase, BtMan2A. J Biol Chem. 2007;282(15):11291-9. DOI:10.1074/jbc.M610964200 | PubMed ID:17287210 [Tailford2007]
  49. Weis WI, Drickamer K, and Hendrickson WA. (1992). Structure of a C-type mannose-binding protein complexed with an oligosaccharide. Nature. 1992;360(6400):127-34. DOI:10.1038/360127a0 | PubMed ID:1436090 [Weis1992]
  50. Veelders M, Brückner S, Ott D, Unverzagt C, Mösch HU, and Essen LO. (2010). Structural basis of flocculin-mediated social behavior in yeast. Proc Natl Acad Sci U S A. 2010;107(52):22511-6. DOI:10.1073/pnas.1013210108 | PubMed ID:21149680 [Veelders2010]
  51. Maestre-Reyna M, Diderrich R, Veelders MS, Eulenburg G, Kalugin V, Brückner S, Keller P, Rupp S, Mösch HU, and Essen LO. (2012). Structural basis for promiscuity and specificity during Candida glabrata invasion of host epithelia. Proc Natl Acad Sci U S A. 2012;109(42):16864-9. DOI:10.1073/pnas.1207653109 | PubMed ID:23035251 [Maestre-Reyna2012]
  52. Ielasi FS, Decanniere K, and Willaert RG. (2012). The epithelial adhesin 1 (Epa1p) from the human-pathogenic yeast Candida glabrata: structural and functional study of the carbohydrate-binding domain. Acta Crystallogr D Biol Crystallogr. 2012;68(Pt 3):210-7. DOI:10.1107/S0907444911054898 | PubMed ID:22349222 [Ielasi2012]
  53. Sharon N and Lis H. (2004). History of lectins: from hemagglutinins to biological recognition molecules. Glycobiology. 2004;14(11):53R-62R. DOI:10.1093/glycob/cwh122 | PubMed ID:15229195 [SharonLis2004]
  54. Sharon N and Lis H. (2007). Lectins. Springer Science & Business Media. [SharonLis2007]
  55. Galli C, Bernasconi R, Soldà T, Calanca V, and Molinari M. (2011). Malectin participates in a backup glycoprotein quality control pathway in the mammalian ER. PLoS One. 2011;6(1):e16304. DOI:10.1371/journal.pone.0016304 | PubMed ID:21298103 [Galli2011]
  56. Boraston AB, Healey M, Klassen J, Ficko-Blean E, Lammerts van Bueren A, and Law V. (2006). A structural and functional analysis of alpha-glucan recognition by family 25 and 26 carbohydrate-binding modules reveals a conserved mode of starch recognition. J Biol Chem. 2006;281(1):587-98. DOI:10.1074/jbc.M509958200 | PubMed ID:16230347 [Boraston2006]
  57. Boraston AB, Wang D, and Burke RD. (2006). Blood group antigen recognition by a Streptococcus pneumoniae virulence factor. J Biol Chem. 2006;281(46):35263-71. DOI:10.1074/jbc.M607620200 | PubMed ID:16987809 [Boraston2006]
  58. Freelove AC, Bolam DN, White P, Hazlewood GP, and Gilbert HJ. (2001). A novel carbohydrate-binding protein is a component of the plant cell wall-degrading complex of Piromyces equi. J Biol Chem. 2001;276(46):43010-7. DOI:10.1074/jbc.M107143200 | PubMed ID:11560933 [Freelove2001]
  59. Poole DM, Morag E, Lamed R, Bayer EA, Hazlewood GP, and Gilbert HJ. (1992). Identification of the cellulose-binding domain of the cellulosome subunit S1 from Clostridium thermocellum YS. FEMS Microbiol Lett. 1992;78(2-3):181-6. DOI:10.1016/0378-1097(92)90022-g | PubMed ID:1490597 [Poole1992]
  60. Morag E, Lapidot A, Govorko D, Lamed R, Wilchek M, Bayer EA, and Shoham Y. (1995). Expression, purification, and characterization of the cellulose-binding domain of the scaffoldin subunit from the cellulosome of Clostridium thermocellum. Appl Environ Microbiol. 1995;61(5):1980-6. DOI:10.1128/aem.61.5.1980-1986.1995 | PubMed ID:7646033 [Morag1995]
  61. Adams JJ, Gregg K, Bayer EA, Boraston AB, and Smith SP. (2008). Structural basis of Clostridium perfringens toxin complex formation. Proc Natl Acad Sci U S A. 2008;105(34):12194-9. DOI:10.1073/pnas.0803154105 | PubMed ID:18716000 [Adams2008]
  62. Lewis MS and Youle RJ. (1986). Ricin subunit association. Thermodynamics and the role of the disulfide bond in toxicity. J Biol Chem. 1986;261(25):11571-7. PubMed ID:3745156 [Lewis1986]
  63. Endo Y and Tsurugi K. (1987). RNA N-glycosidase activity of ricin A-chain. Mechanism of action of the toxic lectin ricin on eukaryotic ribosomes. J Biol Chem. 1987;262(17):8128-30. PubMed ID:3036799 [Endo1987]
  64. Abbott DW and Boraston AB. (2012). Quantitative approaches to the analysis of carbohydrate-binding module function. Methods Enzymol. 2012;510:211-31. DOI:10.1016/B978-0-12-415931-0.00011-2 | PubMed ID:22608728 [Abbott2012]
  65. McCartney L, Blake AW, Flint J, Bolam DN, Boraston AB, Gilbert HJ, and Knox JP. (2006). Differential recognition of plant cell walls by microbial xylan-specific carbohydrate-binding modules. Proc Natl Acad Sci U S A. 2006;103(12):4765-70. DOI:10.1073/pnas.0508887103 | PubMed ID:16537424 [McCartney2006]
  66. Lammerts van Bueren A and Boraston AB. (2004). Binding sub-site dissection of a carbohydrate-binding module reveals the contribution of entropy to oligosaccharide recognition at "non-primary" binding subsites. J Mol Biol. 2004;340(4):869-79. DOI:10.1016/j.jmb.2004.05.038 | PubMed ID:15223327 [Lammerts2004]
  67. Cuskin F, Flint JE, Gloster TM, Morland C, Baslé A, Henrissat B, Coutinho PM, Strazzulli A, Solovyova AS, Davies GJ, and Gilbert HJ. (2012). How nature can exploit nonspecific catalytic and carbohydrate binding modules to create enzymatic specificity. Proc Natl Acad Sci U S A. 2012;109(51):20889-94. DOI:10.1073/pnas.1212034109 | PubMed ID:23213210 [Cuskin2012]
  68. McKee LS, Peña MJ, Rogowski A, Jackson A, Lewis RJ, York WS, Krogh KB, Viksø-Nielsen A, Skjøt M, Gilbert HJ, and Marles-Wright J. (2012). Introducing endo-xylanase activity into an exo-acting arabinofuranosidase that targets side chains. Proc Natl Acad Sci U S A. 2012;109(17):6537-42. DOI:10.1073/pnas.1117686109 | PubMed ID:22492980 [McKee2012]
  69. Tang CD, Li JF, Wei XH, Min R, Gao SJ, Wang JQ, Yin X, and Wu MC. (2013). Fusing a carbohydrate-binding module into the Aspergillus usamii β-mannanase to improve its thermostability and cellulose-binding capacity by in silico design. PLoS One. 2013;8(5):e64766. DOI:10.1371/journal.pone.0064766 | PubMed ID:23741390 [Tang2013]
  70. Kavoosi M, Meijer J, Kwan E, Creagh AL, Kilburn DG, and Haynes CA. (2004). Inexpensive one-step purification of polypeptides expressed in Escherichia coli as fusions with the family 9 carbohydrate-binding module of xylanase 10A from T. maritima. J Chromatogr B Analyt Technol Biomed Life Sci. 2004;807(1):87-94. DOI:10.1016/j.jchromb.2004.03.031 | PubMed ID:15177165 [Kavoosi2004]
  71. Levy I, Paldi T, Siegel D, and Shoseyov O. (2003). Cellulose binding domain from Clostridium cellulovorans as a paper modification reagent. Nordic Pulp Paper Res J. 2003;18:421-428. [Levy2003]
  72. Yokota S, Matsuo K, Kitaoka T, and Wariishi H. (2009). Retention and paper strength characteristics of anionic polyacrylamides conjugated with carbohydrate-binding modules. BioResources. 2009;4(1):234-244. [Yokota2009]
  73. Levy I, Paldi T, and Shoseyov O. (2004). Engineering a bifunctional starch-cellulose cross-bridge protein. Biomaterials. 2004;25(10):1841-9. DOI:10.1016/j.biomaterials.2003.08.041 | PubMed ID:14738848 [Levy2004]
  74. Zhang Y, Chen S, He M, Wu J, Chen J, and Wang Q. (2011). Effects of Thermobifida fusca cutinase-carbohydrate-binding module fusion proteins on cotton bioscouring. Biotechnol Bioprocess Eng. 2011;16:645-653. DOI:10.1007/s12257-011-0036-4 [Zhang2011]
  75. Francisco JA, Stathopoulos C, Warren RA, Kilburn DG, and Georgiou G. (1993). Specific adhesion and hydrolysis of cellulose by intact Escherichia coli expressing surface anchored cellulase or cellulose binding domains. Biotechnology (N Y). 1993;11(4):491-5. DOI:10.1038/nbt0493-491 | PubMed ID:7763519 [Francisco1993]
  76. Simşek Ö, Sabanoğlu S, Çon AH, Karasu N, Akçelik M, and Saris PE. (2013). Immobilization of nisin producer Lactococcus lactis strains to chitin with surface-displayed chitin-binding domain. Appl Microbiol Biotechnol. 2013;97(10):4577-87. DOI:10.1007/s00253-013-4700-9 | PubMed ID:23354445 [Simsek2013]
  77. Wang JY and Chao YP. (2006). Immobilization of cells with surface-displayed chitin-binding domain. Appl Environ Microbiol. 2006;72(1):927-31. DOI:10.1128/AEM.72.1.927-931.2006 | PubMed ID:16391137 [Wang2006]
  78. Reyes-Ortiz V, Heins RA, Cheng G, Kim EY, Vernon BC, Elandt RB, Adams PD, Sale KL, Hadi MZ, Simmons BA, Kent MS, and Tullman-Ercek D. (2013). Addition of a carbohydrate-binding module enhances cellulase penetration into cellulose substrates. Biotechnol Biofuels. 2013;6(1):93. DOI:10.1186/1754-6834-6-93 | PubMed ID:23819686 [Reyes2013]
  79. Ravalason H, Herpoël-Gimbert I, Record E, Bertaud F, Grisel S, de Weert S, van den Hondel CA, Asther M, Petit-Conil M, and Sigoillot JC. (2009). Fusion of a family 1 carbohydrate binding module of Aspergillus niger to the Pycnoporus cinnabarinus laccase for efficient softwood kraft pulp biobleaching. J Biotechnol. 2009;142(3-4):220-6. DOI:10.1016/j.jbiotec.2009.04.013 | PubMed ID:19414054 [Ravalason2009]
  80. Han R, Li J, Shin HD, Chen RR, Du G, Liu L, and Chen J. (2013). Carbohydrate-binding module-cyclodextrin glycosyltransferase fusion enables efficient synthesis of 2-O-d-glucopyranosyl-l-ascorbic acid with soluble starch as the glycosyl donor. Appl Environ Microbiol. 2013;79(10):3234-40. DOI:10.1128/AEM.00363-13 | PubMed ID:23503312 [Han2013]
  81. Ficko-Blean E and Boraston AB. (2006). The interaction of a carbohydrate-binding module from a Clostridium perfringens N-acetyl-beta-hexosaminidase with its carbohydrate receptor. J Biol Chem. 2006;281(49):37748-57. DOI:10.1074/jbc.M606126200 | PubMed ID:16990278 [Ficko-Blean2006]
