The phenylpropanoid pathway serves as a rich source of metabolites in plants, being required for the biosynthesis of lignin, and serving as a starting point for the production of many other important compounds, such as the flavonoids, coumarins, and lignans. In spite of the fact that the phenylpropanoids and their derivatives are sometimes classified as secondary metabolites, their relevance to plant survival has been made clear via the study of Arabidopsis and other plant species. As a model system, Arabidopsis has helped to elucidate many details of the phenylpropanoid pathway, its enzymes and intermediates, and the interconnectedness of the pathway with plant metabolism as a whole. These advances in our understanding have been made possible in large part by the relative ease with which mutations can be generated, identified, and studied in Arabidopsis. Herein, we provide an overview of the research progress that has been made in recent years, emphasizing both the genes (and gene families) associated with the phenylpropanoid pathway in Arabidopsis, and the end products that have contributed to the identification of many mutants deficient in the phenylpropanoid metabolism: the sinapate esters.
Phenylpropanoids are a diverse group of compounds derived from the carbon skeleton of phenylalanine that are involved in plant defense, structural support, and survival (Vogt, 2010). Compounds with these roles, or even with no known roles, are often referred to as “secondary metabolites” because they have no apparent involvement in basal cellular processes such as photosynthesis, respiration, or protein and nucleic acid synthesis, processes that are the domain of so-called primary metabolites. As a result, secondary metabolites appear at first glance to be dispensable to the immediate survival of the organism. Instead, this frequently invoked distinction is probably simply representative of our lack of understanding of the importance of secondary metabolites in the interaction of plants with their biotic and abiotic environment, particularly those interactions that go on outside of the confines of the controlled conditions of growth chambers and greenhouses. Phenylpropanoid metabolism sits astride the boundary of primary and secondary metabolism. Although it is a source of many molecules whose functions are relatively poorly studied and understood, other pathway products are of obvious importance to plants. Specifically, the phenylpropanoid pathway is indispensable to plants because of its role in the production of the hydroxycinnamyl alcohols, also known as monolignols (Boerjan et al., 2003). Monolignols serve as the building blocks of lignin, which confers structural support, vascular integrity, and pathogen resistance to plants. Indeed, the importance of lignin is underscored by the fact that it is estimated to be the second most abundant biopolymer on earth, and the importance of the phenylpropanoid pathway in general is illustrated by what is thought to be a ubiquitous presence in terrestrial plants (Boerjan et al., 2003).
Phenylalanine is an end product of the shikimate pathway, which also gives rise to the aromatic amino acids tyrosine and tryptophan (Herrmann and Weaver, 1999; Tzin and Galili, 2010). From phenylalanine, lignin biosynthesis proceeds via a series of side-chain modifications and ring hydroxylations and O-methylations to yield the lignin monomers (Boerjan et al., 2003). The pathway also gives rise to a host of other small molecules such as the flavonoids, coumarins, hydroxycinnamic acid conjugates, and lignans (D'Auria and Gershenzon, 2005; Vogt, 2010), many of which could be, and have been, the sole focus of review articles (e.g. Winkel-Shirley, 2001; Lepiniec et al., 2006). The phenylpropanoid pathway begins with three reactions leading to the synthesis of 4-coumaroyl CoA (alternatively, p-coumaroyl CoA). This review focuses on these and the remaining steps of the pathway which lead to the synthesis of monolignols and hydroxycinnamic acids such as ferulic and sinapic acids and their corresponding esters. Arabidopsis and other members of the Brassicaceae are particularly well known for accumulating a variety of sinapate esters including sinapoylcholine, which accumulates in seeds, and sinapoylmalate, which accumulates in leaves (for review, see Milkowski et al., 2004). Disruptions in the genes of the phenylpropanoid pathway result in reductions in sinapoylmalate levels in leaves. This reduction, in turn, leads to a characteristic red fluorescence of plants under UV light, due to both a decrease in sinapoylmalate fluorescence and the enhanced chlorophyll fluorescence which occurs when the photosynthetic apparatus is not shielded from incident UV (Chapple et al., 1992). This phenotype has led to the rapid isolation of phenylpropanoid mutants and to the identification of many of the genes associated with phenylpropanoid metabolism (Humphreys and Chapple, 2002; Stout and Chapple, 2004). 
This approach, along with biochemical approaches in other species, has provided us with the view that in many cases, phenylpropanoid genes are members of diverse gene families, and that their isolation and characterization can provide important clues about enzyme evolution in plant secondary metabolism. In particular and as described in more detail later in this review, the characterization of several of these mutants demonstrated that the later steps in monolignol biosynthesis involving ring hydroxylation and methylation steps do not occur at the level of hydroxycinnamic acids and that they instead are catalyzed by enzymes that use hydroxycinnamyl aldehydes and alcohols as substrates. Further, they revealed that ferulic and sinapic acids, rather than being intermediates in monolignol biosynthesis as previously envisioned, are actually synthesized by the oxidation of hydroxycinnamaldehydes (Humphreys and Chapple, 2002; Stout and Chapple, 2004).
THE SHIKIMATE PATHWAY
The shikimate pathway is required for the biosynthesis of the aromatic amino acids in microorganisms, plants, and fungi, but is absent from animals (Fig. 1; for review, see Herrmann and Weaver, 1999; Knaggs, 2003; and Tzin and Galili, 2010). The enzymes of the shikimate pathway were initially identified in bacteria, and have since been studied in a variety of organisms. Research on the pathway has led to the development of several herbicides, including N-[phosphonomethyl]glycine (glyphosate; commercially known as Roundup), which targets 5-enolpyruvylshikimate-3-phosphate (EPSP) synthase (Fig. 1; EC 2.5.1.19). The importance of the shikimate pathway beyond primary metabolism in plants can be found both in its immediate end products (phenylalanine (Phe), tyrosine (Tyr), and tryptophan (Trp)), and in the vast array of compounds that are ultimately derived from its end products: alkaloids, indole glucosinolates, flavonoids, hydroxycinnamic acids, lignins, and lignans (Knaggs, 2003). In addition, the shikimate pathway intermediate chorismate serves as the starting point for the biosynthesis of physiologically important compounds like the folates and quinones (Tzin and Galili, 2010).
The shikimate pathway begins with the condensation of phosphoenolpyruvate (PEP) with erythrose 4-phosphate (E4P) to form 3-deoxy-D-heptulosonate 7-phosphate (DAHP) and inorganic phosphate (Fig. 1). Subsequent ring closure, dehydration, and reduction lead to the production of shikimate. Phosphorylation then yields the substrate for 5-enolpyruvylshikimate-3-phosphate (EPSP) synthase, the target of the herbicide glyphosate. Chorismate synthase then catalyzes a 1,4-trans elimination of the EPSP phosphate group, resulting in chorismate. The biosynthetic pathways of the three aromatic amino acids diverge at this point, with the metabolic steps leading to Phe and Tyr proceeding independently, via prephenate, from those leading to Trp. Prephenate biosynthesis is catalyzed by chorismate mutase and constitutes the committed step in the biosynthesis of both Phe and Tyr. Transamination then yields arogenate, at which point the Tyr and Phe pathways diverge. The subsequent removal of the ring carboxylate and hydroxyl groups by arogenate dehydratase constitutes the final step in the biosynthesis of phenylalanine in plants. Phenylpropanoid metabolism follows thereafter (Fig. 2).
Phenylalanine ammonia-lyase (PAL)
Phenylalanine ammonia-lyase (PAL; EC 4.3.1.24) catalyzes the first step in phenylpropanoid metabolism in which phenylalanine undergoes deamination to yield trans-cinnamic acid and ammonia (Fig. 2). As in many plants, the active PAL isoforms of Arabidopsis are encoded by a gene family, with genes designated PAL1–PAL4 (Raes et al., 2003). Expression studies of the PAL genes have shown that PAL3 is expressed only at basal levels in stems (Mizutani et al., 1997; Raes et al., 2003). By contrast, PAL1, PAL2 and PAL4 are expressed at relatively high levels in stems during the latter stages of development with PAL1 expression localized to vascular tissue and PAL2 and PAL4 both expressed in seeds (Ohl et al., 1990; Leyva et al., 1995; Raes et al., 2003; Rohde et al., 2004). On the basis of these data and the presence of specific promoter elements associated with the PAL1 and PAL2 genes, but not with the PAL3 and PAL4 genes, Raes and colleagues proposed that the former two genes encode the principal PAL enzymes of Arabidopsis phenylpropanoid metabolism (Raes et al., 2003).
Kinetic analysis of the proteins encoded by the Arabidopsis PAL gene family has shown that PAL1, 2, and 4 can catalyze phenylalanine deamination in vitro, whereas PAL3 shows only minimal activity (Cochrane et al., 2004). Studies of the pal1 and pal2 mutants have also shown that mutations in each of PAL1 and PAL2 alone do not produce a noticeable phenotype; PAL1 is up-regulated in the pal2 mutant, PAL2 is up-regulated in the pal1 mutant, and PAL4 is up-regulated in both mutants (Rohde et al., 2004). The hypothesis that PAL1 and PAL2 function as the primary PAL isoforms is further strengthened by analysis of the pal1 pal2 double mutant, which over-accumulates phenylalanine and exhibits reduced lignin content with an increased syringyl (S) lignin to guaiacyl (G) lignin ratio (Rohde et al., 2004). The double mutant is also deficient in tannin and anthocyanin biosynthesis (Huang et al., 2010). Thus, while PAL4 appears to partially compensate for the loss of PAL1 and PAL2, the data would suggest that PAL1 and PAL2 encode the major functional PAL enzymes in Arabidopsis (Table 1). The phenotype of the pal1 pal2 pal3 pal4 quadruple mutant is generally consistent with these results; the mutant is stunted, accumulates reduced levels of lignin, and accumulates reduced levels of salicylic acid following pathogen attack. Tissues from the quadruple mutant do, nevertheless, exhibit residual PAL activity, raising the possibility of unknown PAL-like genes in Arabidopsis (Huang et al., 2010).
Cinnamate 4-hydroxylase (C4H)
Cinnamate 4-hydroxylase (C4H; EC 1.14.13.11) is a cytochrome P450-dependent monooxygenase (P450; CYP) that catalyzes the hydroxylation of cinnamate to yield 4-coumarate (also known as p-coumarate; Fig. 2). As a group, P450s are divided into families of proteins that are at least 40% identical to one another, and subfamilies of proteins that are at least 55% identical to one another. C4H is the first of three P450s involved in lignin biosynthesis (see below), and is the only member of the CYP73A subfamily in Arabidopsis, designated CYP73A5 (Werck-Reichhart et al., 2002).
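The identity thresholds and naming scheme described above can be illustrated with a minimal sketch. The function names and the example identity values are hypothetical and introduced here only for illustration; the 40% and 55% cutoffs are taken from the text, and real P450 nomenclature assignments also take phylogeny into account, so the thresholds should be read as a rule of thumb rather than a strict classifier.

```python
import re


def p450_relationship(percent_identity: float) -> str:
    """Apply the 40% (family) and 55% (subfamily) identity rule of thumb."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 55.0:
        return "same subfamily"
    if percent_identity >= 40.0:
        return "same family"
    return "different families"


def parse_cyp_name(name: str) -> tuple[int, str, int]:
    """Split a name such as CYP73A5 into (family, subfamily, member)."""
    match = re.fullmatch(r"CYP(\d+)([A-Z]+)(\d+)", name)
    if match is None:
        raise ValueError(f"not a full CYP designation: {name!r}")
    family, subfamily, member = match.groups()
    return int(family), subfamily, int(member)


# Hypothetical pairwise identities, not measured values.
for identity in (62.0, 47.0, 30.0):
    print(identity, p450_relationship(identity))

# C4H is CYP73A5: family 73, subfamily A, member 5.
print(parse_cyp_name("CYP73A5"))
```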
The single C4H gene from Arabidopsis was cloned and characterized over a decade ago (Table 1; Bell-Lelong et al., 1997; Mizutani et al., 1997); however, the identification of viable C4H mutants was only reported recently. By using the UV-screening method described previously, Schilmiller and coworkers were able to isolate three allelic mutants harboring missense mutations in the Arabidopsis C4H gene. These reduced epidermal fluorescence 3 (ref3) mutants fail to accumulate wild-type levels of sinapoylmalate (Ruegger and Chapple, 2001; Schilmiller et al., 2009). The missense mutations found in the ref3 alleles are associated with an array of metabolic changes. In addition to having low levels of sinapoylmalate, leaves of the ref3 mutants accumulate at least two cinnamate esters that are not found in the leaves of wild-type plants. Levels of condensed tannins in ref3 seeds are also reduced, and lignin deposition in each of the ref3 mutants is lower as well. The latter phenomenon appears to be caused by a decrease in G lignin monomers, leading to an S/G ratio higher than that of wild type (Schilmiller et al., 2009). These metabolic changes are likely due to the altered stability and substrate binding of the mutant C4H proteins; the different missense mutations in the ref3-1, ref3-2 and ref3-3 genes all lead to amino acid substitutions in structural motifs that are highly conserved in the C4H proteins of land plants.
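The relationship between reduced G lignin and a higher S/G ratio can be made concrete with a small arithmetic sketch. The monomer amounts below are hypothetical values chosen only to illustrate the direction of the change; they are not measurements from the ref3 mutants.

```python
def s_to_g_ratio(s_units: float, g_units: float) -> float:
    """Return the syringyl-to-guaiacyl (S/G) monomer ratio."""
    if g_units <= 0:
        raise ValueError("G monomer amount must be positive")
    return s_units / g_units


# Hypothetical monomer amounts in arbitrary units.
wild_type = s_to_g_ratio(s_units=20.0, g_units=80.0)
mutant = s_to_g_ratio(s_units=20.0, g_units=50.0)  # only G is reduced

print(f"wild type S/G = {wild_type:.2f}")
print(f"mutant S/G = {mutant:.2f}")
```

In this sketch the S monomer amount is unchanged, yet the ratio rises because the denominator shrinks, which is why a higher S/G ratio can accompany a lower total lignin content.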
The metabolic changes found in the ref3 mutants are accompanied by developmental, structural, and reproductive phenotypes. For instance, the more pronounced reduction in lignin content found in the ref3-1 and ref3-2 mutants results in collapsed xylem, presumably impairing water transport in these plants (Fig. 3). Plants homozygous for the ref3-1 and ref3-2 alleles are dwarfed and are reduced in apical dominance as well (Fig. 4). An indication of the severity of the ref3-2 mutation is further illustrated by the fact that ref3-2 plants fail to produce mature pollen in their anthers and are male-sterile. For reasons that are unclear at this time, strong C4H alleles lead to swellings at the base of lateral branches (Fig. 4; Schilmiller et al., 2009). Taken together, the abnormalities found in the ref3 mutants demonstrate the relevance of C4H to physiological processes that are essential to plant survival.
4-coumarate:CoA ligase (4CL)
4-coumarate:CoA ligase (4CL; EC 6.2.1.12) catalyzes the ATP-dependent formation of the CoA thioester 4-coumaroyl CoA (also known as p-coumaroyl CoA; Fig. 2). Although an initial study identified a single 4CL-like protein in Arabidopsis, later denoted At4CL1 (Lee et al., 1995), subsequent studies have since revealed that the Arabidopsis genome encodes at least four such proteins: At4CL1–At4CL4 (Ehlting et al., 1999; Hamberger and Hahlbrock, 2004; Costa et al., 2005). The initial analysis of At4CL1 showed that its expression correlates with lignin deposition in young seedlings, stem bolting in mature plants, and pathogenic infiltration. The pattern of gene expression in the latter case parallels that of PAL gene expression. Antisense suppression of At4CL1 results in a pronounced decrease in G lignin (Lee et al., 1997).
Table 1. AGI codes and gene names for enzymes of the phenylpropanoid pathway.
At4CL2 is 83% identical to At4CL1 and exhibits an expression pattern in stems that is similar to that of At4CL1 (Hamberger and Hahlbrock, 2004); however the At4CL3 isozyme is only 61% identical to At4CL1, and is expressed primarily in siliques (Raes et al., 2003). Unlike At4CL1 and At4CL2, At4CL3 is not induced by pathogen attack, but is induced by UV radiation (Ehlting et al., 1999). All three proteins exhibit activity towards 4-coumarate, but their expression patterns and their similarity to 4CL homologues found in other plants suggest that At4CL1 and At4CL2 are likely to be involved in lignin biosynthesis (Table 1), while At4CL3 may have a role in flavonoid biosynthesis (Ehlting et al., 1999). In contrast, the last 4CL isozyme to be discovered in Arabidopsis, At4CL4, exhibits preferential activity not towards 4-coumarate but towards ferulate and sinapate, suggesting a metabolic function different from the other three isozymes (Hamberger and Hahlbrock, 2004).
Several studies on the substrate binding properties of the Arabidopsis 4CL proteins have identified residues important to the function of these enzymes. For example, mutagenesis experiments with the At4CL2 isozyme have identified the amino acid residues responsible for substrate specificity in the At4CL family (Ehlting et al., 2001; Stuible and Kombrink, 2001; Schneider et al., 2003). One of these studies showed that At4CL2 activity can be altered to yield gain-of-function mutants; altering 12 amino acids found in the substrate binding pocket of At4CL2 produces mutants exhibiting pronounced activities towards ferulate, sinapate, and cinnamate (Schneider et al., 2003).
The importance of the At4CL genes and the proteins they encode is especially evident when one considers the role of the activated thioester p-coumaroyl CoA in plant secondary metabolism. In higher plants, p-coumaroyl CoA represents the starting point for the biosynthesis of not only phenylpropanoid compounds, but a wide variety of other secondary metabolites as well (for review, see Vogt, 2010). Some pathways, such as those leading to the proanthocyanidins, tannins, and flavonoids diverge from phenylpropanoid metabolism immediately after the biosynthesis of 4-coumaroyl CoA (Winkel-Shirley, 2001; Dixon et al., 2005). Other pathways, such as those leading to the phenylpropenes, coumarins, lignins, and lignans share some of the intermediates of phenylpropanoid metabolism (Lewis and Davin, 1999; Boerjan et al., 2003; Dudareva et al., 2004; Vogt, 2010). Our understanding of the metabolic pathways associated with these metabolites has been greatly enhanced through the study of Arabidopsis as well (D'Auria and Gershenzon, 2005).
Hydroxycinnamoyl-coenzyme A shikimate:quinate hydroxycinnamoyl-transferase (HCT)
Hydroxycinnamoyl-coenzyme A shikimate:quinate hydroxycinnamoyl-transferase (HCT; EC 2.3.1.133) catalyzes two steps in phenylpropanoid metabolism (Fig. 2). First, following the synthesis of p-coumaroyl CoA by 4CL, HCT catalyzes the transfer of the p-coumaroyl group to shikimate (Hoffmann et al., 2003). The enzyme can also use quinate in place of shikimate, but only poorly. Second, following the 3′ hydroxylation of p-coumaroyl shikimate by p-coumaroyl shikimate 3′-hydroxylase (C3′H) to form caffeoyl shikimate (Schoch et al., 2001; Franke et al., 2002a), HCT catalyzes the transfer of the caffeoyl moiety back onto Coenzyme A (Fig. 2). The first gene found to encode an HCT was characterized in tobacco (Hoffmann et al., 2003). Silencing of the Arabidopsis HCT has been shown to result in a dwarf phenotype with reduced S lignin and over-accumulation of flavonoids relative to wild-type plants (Table 1; Hoffmann et al., 2004; Besseau et al., 2007; Li et al., 2010b). Although it has been proposed that flavonoid accumulation is responsible for the growth inhibition observed in HCT-RNAi plants (Besseau et al., 2007), another study has shown this not to be the case (Fig. 5) and suggested that the dwarf phenotype instead results from disruptions in the biosynthesis of lignin or dehydrodiconiferyl alcohol glucosides (DCGs) (Li et al., 2010b).
p-coumaroyl shikimate 3′ hydroxylase (C3′H)
p-Coumaroyl shikimate 3′ hydroxylase (C3′H; EC 1.14.13.36) catalyzes the hydroxylation of p-coumaroyl shikimate to yield caffeoyl shikimate (Fig. 2). It is the second of three P450s identified in lignin biosynthesis, and is one of three P450s in Arabidopsis belonging to the CYP98 family (CYP98A3) (Schoch et al., 2001; Franke et al., 2002a). Although C3′H has the capacity to act on the quinate ester of p-coumarate, the rate of conversion of the shikimate ester is four times higher (Schoch et al., 2001). When considered in light of the preference exhibited by HCT for shikimate over quinate, these results suggest that the shikimate ester of p-coumarate is the actual substrate for C3′H in vivo. The significance of the characterization of C3′H is underscored by the fact that it led to a revision of our understanding of phenylpropanoid metabolism in general and established a direct link between the intermediates of lignin metabolism and the intermediates and products of the shikimate pathway (Schoch et al., 2001; Franke et al., 2002a; Humphreys and Chapple, 2002).
The first CYP98A3 mutant to be identified, ref8, harbors a single transition mutation resulting in a G444D substitution that largely eliminates C3′H activity (Table 1; Franke et al., 2002a). In addition to accumulating only low levels of sinapoylmalate like the other ref mutants (Fig. 6; Franke et al., 2002a), ref8 plants also deposit p-hydroxyphenyl (H) lignin instead of the G and S lignin found in the wild type. The mutant is also dwarfed and hyperaccumulates flavonoids (Franke et al., 2002b). In addition to highlighting the importance of lignin biosynthesis to plant growth (Li et al., 2010b), studies of the CYP98A3 mutants have revealed a possible alternate, CYP98A3-independent route to G lignin production in roots (Abdulrazzak et al., 2006).
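A glycine-to-aspartate substitution arising from a single transition can be checked against the standard genetic code with a short sketch. The codon sets below come from the standard code; the helper names are hypothetical, and the text does not state which glycine codon is present at position 444 in wild-type CYP98A3, so the sketch simply enumerates the possibilities.

```python
# Standard genetic code, DNA alphabet.
GLY_CODONS = {"GGT", "GGC", "GGA", "GGG"}
ASP_CODONS = {"GAT", "GAC"}

# Transitions are purine<->purine or pyrimidine<->pyrimidine changes.
TRANSITIONS = {"A": "G", "G": "A", "C": "T", "T": "C"}


def single_transitions(codon: str):
    """Yield every codon reachable from `codon` by exactly one transition."""
    for position, base in enumerate(codon):
        yield codon[:position] + TRANSITIONS[base] + codon[position + 1:]


def gly_to_asp_by_transition():
    """List (Gly codon, Asp codon) pairs that differ by one transition."""
    return sorted(
        (gly, mutated)
        for gly in GLY_CODONS
        for mutated in single_transitions(gly)
        if mutated in ASP_CODONS
    )


print(gly_to_asp_by_transition())
```

Under the standard code, only a G-to-A change at the second codon position converts a glycine codon into an aspartate codon, and only GGT and GGC can do so in a single step; the corresponding changes in GGA and GGG give glutamate codons instead.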
A relatively recent study of the two other Arabidopsis CYP98 genes (CYP98A8/At1g74540 and CYP98A9/At1g74550) has demonstrated that while both CYP98A8 and CYP98A9 catalyze meta-hydroxylation reactions, they are required for the synthesis of specialized metabolites currently known to occur only in pollen, rather than in lignin biosynthesis like CYP98A3 (Matsuno et al., 2009). C3′H, CYP98A8, and CYP98A9 appear to have evolved from a common ancestor, with CYP98A8 and CYP98A9 being 50% identical to C3′H. Both CYP98A8 and CYP98A9 appear to be expressed in inflorescence tips, young flower buds, and stamens. UPLC-MS/MS analysis of CYP98A8/9 overexpression and knock-out mutants suggests that both genes are involved in the production of N1,N5-di(hydroxyferuloyl)-N10-sinapoylspermidine. Further, CYP98A8 and CYP98A9 can both catalyze the hydroxylation of the precursor metabolite N1,N5,N10-tricoumaroyl spermidine, and CYP98A8 may also catalyze the hydroxylation of N1,N5,N10-triferuloyl spermidine (Matsuno et al., 2009). When viewed in light of the pollen-related phenotypes of the ref3-2 mutant (Schilmiller et al., 2009), these data demonstrate an important role for phenylpropanoid metabolism during pollen development.
Caffeoyl CoA 3-O-methyltransferase (CCoAOMT)
Caffeoyl CoA 3-O-methyltransferase (CCoAOMT; EC 2.1.1.104) catalyzes the first methyl-transfer reaction in phenylpropanoid metabolism, synthesizing feruloyl CoA from caffeoyl CoA (Fig. 2). Research on this enzyme has contributed significantly to the revision of the phenylpropanoid pathway away from the traditional model in which ring and side-chain modifications take place exclusively at the level of hydroxycinnamates, to the currently accepted model involving CoA thioesters, shikimate esters, and hydroxycinnamyl aldehydes and alcohols (Humphreys and Chapple, 2002). It has now been shown that there are seven CCoAOMT-like genes encoded by the Arabidopsis genome (Raes et al., 2003; Sibout et al., 2005), but only one gene (CCoAOMT1) is a confirmed CCoAOMT gene (Table 1; Do et al., 2007). Prior to the identification of a ccomt1 T-DNA insertion mutant, the protein encoded by CCoAOMT1 had been shown to exhibit activity towards caffeoyl-CoA and three other methyl group acceptors (Ibdah et al., 2003). Consistent with the loss of this activity in ccomt1, stem lignin content is lower in ccomt1 than in wild type, and xylem vessels are also collapsed (Do et al., 2007). Additional studies of ccomt1 null mutants also suggest that coumarin biosynthesis is significantly affected by the elimination of CCoAOMT activity, pointing once again to the interconnectedness of pathways in plant secondary metabolism (Kai et al., 2008).
Cinnamoyl-CoA reductase (CCR)
Cinnamoyl-CoA reductase (CCR; EC 1.2.1.44) catalyzes the synthesis of hydroxycinnamaldehydes from the hydroxycinnamoyl-CoA thioesters (Fig. 2). The first cDNA encoding CCR was cloned from eucalyptus (Lacombe et al., 1997), and several additional studies have since identified CCR genes in tobacco and maize among others (Pichon et al., 1998; Piquemal et al., 1998). Two CCR cDNAs from Arabidopsis have also been characterized, and their encoded proteins (AtCCR1 and AtCCR2) are more than 80% identical. Kinetic studies of AtCCR1 and AtCCR2 show that the proteins both have a high affinity for feruloyl-CoA, with lower affinities for sinapoyl-CoA and caffeoyl-CoA (Lauvergeat et al., 2001; Baltas et al., 2005). AtCCR1 has been shown to convert feruloyl-CoA to coniferaldehyde with a 5-fold greater catalytic efficiency than AtCCR2. The AtCCR1 and AtCCR2 genes also exhibit different expression patterns. Under normal conditions, AtCCR1 is expressed in flowers and leaves and highly expressed in stems, whereas AtCCR2 RNA is undetectable in all tissue types. In contrast, when leaves are exposed to bacterial pathogens, AtCCR2 transcripts increase steadily over a 12-hour period while AtCCR1 transcripts decrease. These results have led to the hypothesis that AtCCR1 is involved in the normal lignification pathway and AtCCR2 is involved in pathogen-induced lignification (Lauvergeat et al., 2001).
Three separate studies of AtCCR1 mutants have further established the role of the gene in lignin production (Jones et al., 2001; Goujon et al., 2003a). Both the EMS irregular xylem4 (irx4) mutants and AtCCR1 antisense transgenic plants produce approximately 50% less lignin than wild type, have collapsed xylem, and are severely dwarfed. The vessel element cell walls of AtCCR1 mutants are also expanded and appear diffuse when visualized by electron microscopy, presumably due to a lack of compression normally provided by lignin (Fig. 7), a phenotype that is also brought about by PAL inhibition (Smart and Amrhein, 1985; Jones et al., 2001). AtCCR1 null mutants also have reduced levels of sinapoylmalate in their stems relative to wild type, and accumulate significant amounts of feruloylmalate, consistent with (1) a significant reduction in the levels of both hydroxycinnamaldehydes and hydroxycinnamoyl alcohols, and (2) the rerouting of feruloyl-CoA into what may be a thioester-based pathway to the novel ferulate esters found in these plants (Derikvand et al., 2008). Taken together, these results have established AtCCR1 as the primary in vivo CCR of Arabidopsis phenylpropanoid metabolism (Table 1).
Ferulate 5-hydroxylase (F5H)
Ferulate 5-hydroxylase (F5H) is the third P450 involved in phenylpropanoid metabolism, catalyzing the NADPH- and O2-dependent hydroxylation of both coniferaldehyde and coniferyl alcohol to yield 5-hydroxyconiferaldehyde and 5-hydroxyconiferyl alcohol, respectively (Fig. 2). The mutant corresponding to F5H (ferulic acid hydroxylase; fah1) was initially identified with a thin layer chromatography-based screen for plants that fail to accumulate sinapoylmalate (Fig. 8; Table 1). Radiotracer feeding experiments showed that phenylpropanoid metabolism in the fah1 mutant is blocked at F5H, and subsequent analysis revealed that the mutant fails to deposit S lignin (Fig. 9; Chapple et al., 1992). The low homology between F5H and any other previously characterized P450 led to its designation as the defining member of a new P450 family, CYP84 (Meyer et al., 1996).
At the time that the FAH1 gene was cloned, it was thought that the hydroxylation and O-methylation of G and S lignin biosynthetic intermediates took place at the level of free hydroxycinnamic acids, with F5H acting upon ferulate to yield 5-hydroxyferulate, which was subsequently O-methylated to sinapate (Chapple et al., 1992; Humphreys and Chapple, 2002). In contrast, in vitro kinetic studies of F5H identified coniferaldehyde and coniferyl alcohol as having much lower Km values relative to ferulate (Humphreys et al., 1999; Humphreys and Chapple, 2002). An overexpression study published shortly following the initial cloning of FAH1 showed that Arabidopsis plants in which FAH1 was expressed under the control of the C4H promoter deposit lignin composed almost exclusively of S monomers and in a non-tissue-specific fashion (Meyer et al., 1998). These findings indicate that the 5-hydroxylation of coniferaldehyde and coniferyl alcohol limits the synthesis of syringyl lignin both quantitatively and spatially.
Several other studies of F5H have further refined our understanding of its role in phenylpropanoid metabolism. Like other phenylpropanoid genes, FAH1 is highly expressed in the rachis (flowering stem) of adult plants, but it is developmentally regulated in a manner that sets it apart from PAL, C4H, OMT, and 4CL (Ruegger et al., 1999). A 3′ regulatory region of the FAH1 gene is necessary for expression in leaves, but not in siliques, pointing to a more complex regulatory scheme for FAH1 compared to other genes of the phenylpropanoid pathway (Ruegger et al., 1999).
F5H overexpression studies in Arabidopsis, tobacco, and poplar have provided a basis for metabolic engineering of lignin biosynthesis (Meyer et al., 1998; Marita et al., 1999; Franke et al., 2000). Two strategies employing different promoters were taken in these studies and led to surprisingly different results. Lignin monomer composition in tobacco and Arabidopsis plants transformed with the 35S::FAH1 expression construct is relatively similar to that of the wild type (Meyer et al., 1998; Marita et al., 1999; Franke et al., 2000). On the other hand, plants transformed with C4H::FAH1 deposit lignin highly enriched in S subunits (Meyer et al., 1998; Marita et al., 1999; Franke et al., 2000). Similar results have also been obtained for poplar transformed with this C4H::FAH1 construct, and together with the previous results, these data suggest that this strategy can be used to alter lignin monomer composition to improve pulping efficiency of wood for paper production (Huntley et al., 2003), the digestibility of forage crops (Franke et al., 2000), and the saccharification potential of plants used for biofuels production (Li et al., 2010c).
Caffeic acid/5-hydroxyferulic acid O-methyltransferase (COMT)
Originally thought to catalyze the methylation of free hydroxycinnamic acids, caffeic acid/5-hydroxyferulic acid O-methyltransferase (COMT; EC 2.1.1.68) is now thought to act at the level of the aldehyde and alcohol precursors of S lignin by methylating 5-hydroxyconiferaldehyde and 5-hydroxyconiferyl alcohol to yield sinapaldehyde and sinapyl alcohol, respectively (Fig. 2). Initial research on COMT was prompted by the finding that F5H uses coniferaldehyde and coniferyl alcohol as substrates instead of ferulic acid (Humphreys et al., 1999; Humphreys and Chapple, 2002). Initial results on COMT activity were published along with the characterization of the Arabidopsis F5H (Humphreys et al., 1999) shortly after the cloning of the Arabidopsis COMT gene (Table 1; Zhang et al., 1997), and similar results were reported contemporaneously on F5H and COMT from Liquidambar styraciflua (sweetgum) (Osakabe et al., 1999). By carrying out coupled enzyme assays in which recombinant F5H and COMT were incubated with coniferaldehyde and coniferyl alcohol as starting substrates, Humphreys and coworkers demonstrated that COMT exhibits activity towards the reaction products of F5H (5-hydroxyconiferaldehyde and 5-hydroxyconiferyl alcohol), providing some of the first evidence for the now-established route of sinapate ester and S lignin biosynthesis (Humphreys et al., 1999).
Goujon and coworkers have shown that the T-DNA knockout mutant for the Arabidopsis COMT gene (COMT1; AtOMT1) exhibits a phenotype consistent with the in vitro substrate analysis of the enzyme (Goujon et al., 2003b). Lignin in the comt1 mutant is missing S units, and instead includes the 5-hydroxyguaiacyl S unit precursors. By contrast, sinapoylmalate levels in the leaves, stems, and seedlings of the comt1 mutant are reduced rather than eliminated. The differential impact of this mutation on S lignin and sinapoylmalate accumulation indicates a redundancy for COMT activity in leaves that is lacking in lignifying cells. Goujon and coworkers also found that Atomt1 seedlings accumulate 5-hydroxyferuloylmalate and 5-hydroxyferuloylglucose, suggesting that the UDP-glucosyltransferase and corresponding acyltransferase involved in the synthesis of sinapoylmalate (see below) may both exhibit activity towards 5-hydroxyferulate and 5-hydroxyferuloylglucose, respectively (Goujon et al., 2003b). Additional studies on comt1 ccomt1 double mutants have shown that the combination of mutations results in plants that are severely dwarfed, that fail to accumulate sinapate esters, and whose lignin is composed primarily of p-hydroxyphenyl (H) units (Do et al., 2007).
Cinnamyl alcohol dehydrogenase (CAD)
Cinnamyl alcohol dehydrogenase (CAD; EC 1.1.1.195) catalyzes the final step in the biosynthesis of G and S lignin monomers: the NADPH-dependent reduction of coniferaldehyde and sinapaldehyde to the monolignols coniferyl alcohol and sinapyl alcohol, respectively (Fig. 2). Of the nine CAD-like proteins encoded by the Arabidopsis genome (Kiedrowski et al., 1992; Baucher et al., 1995; Somers et al., 1995; Costa et al., 2003; Sibout et al., 2003), experimental results suggest that two of these proteins, designated variously as CAD-C/AtCAD4/AtCAD-C and CAD-D/AtCAD5/AtCAD-D, function in lignin biosynthesis (Sibout et al., 2003; Kim et al., 2004; Sibout et al., 2005). Both CAD-C/AtCAD4 and CAD-D/AtCAD5 are highly expressed in vascular bundles, as well as several other tissue types (Sibout et al., 2003; Kim et al., 2007). Analysis of the cad-c cad-d double mutant has provided conclusive evidence about the role of both genes in lignin metabolism, and has established them as the primary CADs in Arabidopsis. In addition to an overall reduction of lignin in xylem and fiber tissues, the double mutant exhibits a 94% reduction in traditional G and S subunits relative to wild-type plants, depositing lignin composed chiefly of coniferaldehyde and sinapaldehyde subunits (Sibout et al., 2005). In addition, CAD-D/AtCAD5 may be involved in salt-induced stress responses in roots (Chen et al., 2007), and both CAD-C/AtCAD4 and CAD-D/AtCAD5 may be important to plant defense (Tronchet et al., 2010).
Peroxidases and Laccases
Following transport to the secondary cell wall, monolignols are oxidized and subsequently polymerized to form lignin. The initial step in this polymerization process involves oxidation of the hydroxyl at the para position of the monolignol to yield an intermediate radical. The traditional model holds that cross coupling of the radicals in a combinatorial fashion results in polymer extension (Boerjan et al., 2003). Although the enzymes ultimately responsible for this step in lignin biosynthesis have not yet been definitively determined, and although several different classes of oxidoreductases have been identified as likely candidates, guaiacol peroxidases (Passardi et al., 2004) and laccases (Mayer and Staples, 2002) are viewed as the most likely sources of radical formation and random coupling (Hatfield and Vermerris, 2001; Boerjan et al., 2003).
Seventy-three genes annotated as guaiacol (class III) peroxidase-like genes are found in the Arabidopsis genome (Tognolli et al., 2002; Welinder et al., 2002; Valério et al., 2004). Members of this Arabidopsis peroxidase (AtPrx) gene family are expressed in a variety of tissue types, including stems, flowers, leaves, and roots (Tognolli et al., 2002; Welinder et al., 2002; Valério et al., 2004). Studies of individual AtPrx genes suggest that they may be involved in both cell growth and lignification. For instance, abnormal root lengths of the atprx33 and atprx34 mutants point to a role for the corresponding wild-type genes in cell elongation (Passardi et al., 2006), and expression patterns for AtPrx47, AtPrx64, and AtPrx66 are consistent with roles in vessel (AtPrx47, AtPrx64) and sclerenchyma (AtPrx64) lignification (Tokunaga et al., 2009).
Perhaps the clearest link between the AtPrx gene family and lignin biosynthesis comes from research on AtPrx53 (alternatively: ATP A2). GUS staining experiments have shown that the AtPrx53 promoter is active in lignified tissues (Østergaard et al., 2000). In addition, AtPrx53 mRNA levels are elevated in the G20 mutant (Østergaard et al., 2000), which accumulates high levels of lignin relative to wild type (Sundaresen et al., 1995). These results, and an analysis of the high-resolution AtPrx53 structure, led Østergaard and coworkers to hypothesize that AtPrx53 may be involved in the covalent cross-linking of lignin monomers (Østergaard et al., 2000). A follow-up study on the activity of AtPrx53 has produced data consistent with this hypothesis (Nielsen et al., 2001). Furthermore, AtPrx53 is more than 50% identical to two peroxidases from Norway spruce (Picea abies), each of which exhibits a high affinity for monolignols (Koutaniemi et al., 2005).
Like the AtPrx genes, the Arabidopsis laccase-like multicopper oxidase (LMCO) genes are expressed in a wide variety of tissue types (Ehlting et al., 2005; McCaig et al., 2005; Cai et al., 2006). Salk T-DNA insertion mutants for twelve of the seventeen LMCO genes in Arabidopsis have been examined, with mutations in three genes (AtLAC2, AtLAC8, and AtLAC15) leading to altered phenotypes (Cai et al., 2006): lac2 mutants exhibit reduced root elongation when the plants are subjected to polyethylene-glycol-induced dehydration, lac8 mutants exhibit early flowering, and lac15 mutants exhibit altered seed color. The change in lac15 seed color is accompanied by a 60% reduction of soluble proanthocyanidin and condensed tannins relative to wild type. Extractable lignin content in lac15 seeds is also reduced by nearly 30%, and enzyme assays suggest that coniferyl alcohol polymerization activity is reduced in lac15 seeds (Liang et al., 2006). Thus, AtLAC15 (At5g48100) appears to encode the first Arabidopsis laccase to be identified as having a role in lignin biosynthesis.
SINAPATE ESTER METABOLISM
Hydroxycinnamaldehyde dehydrogenase (HCALDH)
As with the other ref mutants, ref1 fails to accumulate wild-type levels of sinapate esters in the leaf epidermis, with sinapoylmalate levels in leaves reduced to less than 10% of wild type (Table 1; Nair et al., 2004). Cloning of the REF1 gene and kinetic analysis of its encoded protein have shown that the enzyme catalyzes the NADP+-dependent oxidation of coniferaldehyde and sinapaldehyde to yield ferulate and sinapate, respectively (Fig. 2). Although the activity of the enzyme toward sinapaldehyde could be clearly related to the mutant's sinapoylmalate-deficient phenotype, the demonstrated activity of the enzyme against coniferaldehyde had a less obvious relationship to an in vivo function. Nevertheless, it was demonstrated that ref1 plants contain reduced amounts of cell wall esterified ferulic acid and that ref1/fah1 double mutants fail to accumulate feruloylmalate, which is accumulated at low levels in fah1 plants (Nair et al., 2004). These data indicate that, at least in Arabidopsis, a substantial portion of ferulate and sinapate is derived from the corresponding aldehydes by oxidation, and that their synthesis proceeds in a direction opposite to that proposed by the original model of phenylpropanoid metabolism. The residual levels of sinapate esters indicate that there is a REF1-independent route to sinapate ester biosynthesis in Arabidopsis, possibly via a redundant HCALDH isoform (Nair et al., 2004).
Sinapate:UDP-glucose glucosyltransferase (SGT)
Sinapate:UDP-glucose glucosyltransferase (SGT) catalyzes the synthesis of sinapoylglucose (Fig. 10; Fraser et al., 2007), the activated metabolic precursor to both sinapoylmalate and sinapoylcholine (Strack, 1980; Strack, 1982; Strack et al., 1983; Mock and Strack, 1993; Lehfeldt et al., 2000; Shirley et al., 2001), the dominant sinapate esters in leaves and seeds of Arabidopsis, respectively. SGT is a member of a family of UDP-glucosyltransferases (UGTs) in Arabidopsis, which contains 107 members divided into 12 major groups on the basis of sequence similarity (Li et al., 2001). Many Arabidopsis UGTs exhibit activity towards secondary metabolites (Bowles et al., 2005), including the phenylpropanoids (Milkowski et al., 2000; Lim et al., 2001; Lim et al., 2005; Lanot et al., 2006; Sinlapadech et al., 2007; Lanot et al., 2008; Meißner et al., 2008), synthesizing both glucosides and glucose esters. Initially, a single member of this UGT superfamily was identified as the likely Arabidopsis SGT: UGT84A2 (At3g21560; Lim et al., 2001). Kinetic analyses showed that UGT84A2 exhibits a much greater affinity for sinapic acid than any other UGT in Arabidopsis. Nonetheless, UGT84A2 is one of four UGTs (UGT84A1, UGT84A2, UGT84A3, and UGT84A4) with affinities for the hydroxycinnamic acids (Milkowski et al., 2000; Lim et al., 2001), and there appears to be some functional overlap between the four enzymes (Milkowski et al., 2000; Lim et al., 2001; Sinlapadech et al., 2007; Meißner et al., 2008). The close proximity of UGT84A1, UGT84A3, and UGT84A4 in the Arabidopsis genome has prevented the generation of multiple mutants that would help to elucidate the biochemical roles of each individual protein and reveal any potential genetic redundancy (Meißner et al., 2008).
Much of the information about the in planta function of the Arabidopsis SGT comes from the bright trichomes (brt1) mutant, which harbors a mutation in UGT84A2 (Table 1; Sinlapadech et al., 2007). Consistent with the role of the enzyme in sinapate ester synthesis, brt1 leaves accumulate less sinapoylmalate (Sinlapadech et al., 2007; Meißner et al., 2008), and brt1 seeds accumulate less sinapoylcholine and sinapoylglucose. Interestingly, the trichomes of the brt1 mutant fluoresce with a distinctive intensity not seen in wild type (Fig. 11), caused by the accumulation of a sinapic acid-derived polyketide. Apparently, the loss of SGT activity in brt1 results in the activation of sinapate to its CoA thioester via a 4CL, which then serves as a substrate for a CHS or a CHS-like protein which generates the polyketide found in brt1 trichomes (Sinlapadech et al., 2007).
Sinapoylglucose:malate sinapoyltransferase (SMT)
Sinapoylglucose:malate sinapoyltransferase (SMT) was initially characterized in extracts of red radish (Strack, 1982) and catalyzes the transfer of sinapate from sinapoylglucose to malate to form sinapoylmalate (Fig. 10; Fraser et al., 2007), which serves as a UV-B-protectant in leaves (Strack, 1982; Mock et al., 1992; Landry et al., 1995; Sheahan, 1996). The Arabidopsis sinapoylglucose accumulator (sng1) mutant, which lacks SMT activity, accumulates sinapoylglucose in the place of sinapoylmalate in leaves (Lorenzen et al., 1996; Lehfeldt et al., 2000). Cloning of the SNG1 gene provided in vitro confirmation that it encodes SMT, and revealed that SMT is actually a serine carboxypeptidase-like (SCPL) protein (Table 1). As such, SMT shares a number of conserved residues with known serine carboxypeptidases (SCPs), including the serine, aspartate, and histidine residues that comprise the classic catalytic triad (Lehfeldt et al., 2000). SNG1 is one of 51 SCPL genes encoded by the Arabidopsis genome (Fraser et al., 2005) and one of five SCPL genes arranged in a tandem cluster on chromosome II. These clustered SCPL genes encode proteins whose pair-wise percent identities range from 71% to 78% (Fig. 12; Lehfeldt et al., 2000; Fraser et al., 2007). Three of these neighboring SCPL genes encode acyltransferases that use sinapoylglucose as an acyl donor, but differ in their acyl acceptor specificities (Fraser et al., 2007; Stehle et al., 2008a, b; Stehle et al., 2009).
SMT is recalcitrant with respect to heterologous expression (Lehfeldt et al., 2000; Stehle et al., 2008b), with initial attempts to express it in E. coli and yeast marked by low yield and a high percentage of misfolded protein. Successful expression of SMT has recently been achieved in S. cerevisiae using a combination of (1) codon usage adaptation, (2) a high copy vector, and (3) fermenter cultivation (Stehle et al., 2008b). This expression system has allowed for detailed analyses of SMT's catalytic properties (Stehle et al., 2008a; Stehle et al., 2009). Stehle and coworkers have shown that, in addition to catalyzing the synthesis of sinapoylmalate, SMT can act as a hydrolase to generate sinapic acid and glucose, and can catalyze the disproportionation of two molecules of sinapoylglucose to yield 1,2-disinapoylglucose and glucose, depending upon the availability of substrates and pH (Fraser et al., 2007; Stehle et al., 2008a). Based on these results, and a combination of mutagenesis experiments and molecular modeling, Stehle and coworkers suggest that SMT may employ a random sequential bi-bi mechanism of catalysis (Stehle et al., 2006; Stehle et al., 2008a; Stehle et al., 2009).
Sinapoylglucose:choline sinapoyltransferase (SCT)
Sinapoylglucose:choline sinapoyltransferase (SCT) catalyzes a reaction similar to that of SMT: the transfer of the sinapate moiety of sinapoylglucose to a hydroxylated acceptor molecule, in this case choline, to form sinapoylcholine, also known as sinapine (Fig. 10). The gene encoding the Arabidopsis SCT was identified via the sng2 mutant, which accumulates sinapoylglucose in the place of sinapoylcholine in seeds (Table 1; Shirley et al., 2001). Like SMT, SCT is an SCPL protein (Shirley et al., 2001). The characterization of SCT in Arabidopsis has contributed to our understanding of phenylpropanoid metabolism, and it also holds biotechnological potential. Sinapoylcholine is found at high concentrations in the seeds of the agronomically important crop Brassica napus (oilseed rape) (Bouchereau et al., 1991). Because of the antinutritive properties of sinapoylcholine, engineering sinapate ester metabolism to reduce its biosynthesis in the seeds of B. napus could improve the nutritive quality of canola meal (Shirley et al., 2001; Milkowski and Strack, 2010). Two SCT genes (BnSCT1 and BnSCT2) have been identified in B. napus since the characterization of the Arabidopsis SCT, and metabolic engineering projects are currently underway to eliminate the biosynthesis of sinapoylcholine in B. napus seeds (Milkowski et al., 2004; Huang et al., 2008; Weier et al., 2008; Bhinu et al., 2009; Milkowski and Strack, 2010).
Sinapoylcholine esterase (SCE)
Sinapoylcholine serves as a storage reserve of choline in the seeds of Arabidopsis and other members of the Brassicaceae (Henry Fils and Garot, 1825; Gadamer, 1897; Bouchereau et al., 1991). Hydrolysis of the compound in germinating seedlings by sinapoylcholine esterase (SCE) releases choline for phospholipid synthesis and sinapate, which is re-esterified by SGT to form sinapoylglucose (Tzagoloff, 1963; Nurmann and Strack, 1979; Strack et al., 1980; Clauß et al., 2008). There are three Arabidopsis genes whose sequences are similar to the gene encoding SCE in B. napus (Clauß et al., 2008). The proteins encoded by these three genes exhibit SCE activity and are highly expressed in 3-day-old seedlings (Table 1). One of these genes (At1g28670) is also expressed in adult leaves, suggesting a possible alternative role in mature plants (Clauß et al., 2008). Interestingly, the three Arabidopsis SCEs are not SCPL proteins like SMT and SCT, but instead are similar in sequence to GDSL serine lipases/esterases, hydrolytic enzymes with broad substrate specificities (Akoh et al., 2004). The Arabidopsis SCEs appear to have the conserved Ser, Gly, Asn, and His catalytic residues characteristic of GDSL proteins, and may employ a Ser-Asp-His catalytic triad similar to the ones found in SCPL proteins (Clauß et al., 2008). Also like SCPL proteins, the Arabidopsis SCEs may be the evolutionary result of enzymes involved in primary plant metabolism being “recruited” for roles in secondary metabolism (Clauß et al., 2008).
Sinapoylglucose:sinapoylglucose sinapoyltransferase (SST)
Sinapoylglucose:sinapoylglucose sinapoyltransferase (SST) catalyzes the disproportionation reaction of two molecules of sinapoylglucose to yield 1,2-disinapoylglucose (Fig. 10; Fraser et al., 2007). In addition to the SNG1 gene, there are two Arabidopsis genes that encode enzymes with SST activity: At2g22980 and At2g23010 (Table 1). Both genes are found in the SMT-containing SCPL gene cluster on chromosome II (Fig. 12). The functions of At2g22980 and At2g23010 were identified by comparing the metabolic phenotypes associated with three sng1 alleles (sng1-1, and two fast neutron-induced deletion alleles, sng1-5 and sng1-6). Both At2g22980 and At2g23010 are present in the sng1-1 EMS mutant and the sng1-5 deletion mutant, but are missing from the sng1-6 deletion mutant. Metabolic profiling data and complementation analysis demonstrated that the proteins encoded by At2g22980 and At2g23010 can both synthesize 1,2-disinapoylglucose, and that the protein encoded by At2g23010 is exclusively responsible for the biosynthesis of an as-yet-unidentified sinapate ester (Fig. 12). Although disinapoylglucose accumulates in etiolated Arabidopsis seedlings (Fraser et al., 2007) and has also been found in etiolated Raphanus sativus (radish) seedlings (Strack et al., 1984; Dahlbender and Strack, 1984), the role of the compound remains to be determined.
Sinapoylglucose:anthocyanin sinapoyltransferase (SAT)
Sinapoylglucose:anthocyanin sinapoyltransferase (SAT) is an anthocyanin acyltransferase (AAT) that catalyzes the transfer of the sinapate moiety of sinapoylglucose to a number of different anthocyanins in Arabidopsis (Fig. 10; Fraser et al., 2007). As with the other Arabidopsis SCPL acyltransferases, sinapoylglucose functions as the activated donor molecule for SAT; however, SAT is unlike the AATs responsible for transferring the feruloyl, coumaroyl, caffeoyl, and malonyl moieties to anthocyanins in Arabidopsis. These latter AATs all belong to the BAHD family of acyltransferases, and so use activated CoA thioesters as acyl donors (D'Auria, 2006; D'Auria et al., 2007; Luo et al., 2007). While the energetics of 1-O-glucose and CoA thioester acyltransfer reactions are similar (Dahlbender and Strack, 1984; Leznicki and Bandurski, 1988; Mock and Strack, 1993; St Pierre and De Luca, 2000), it is likely that the Arabidopsis SCPL and BAHD acyltransferases employ different catalytic mechanisms, with the BAHD acyltransferases making use of a conserved HXXXD motif for deprotonation (Ma et al., 2005; D'Auria, 2006; Stehle et al., 2009).
As with SST, the gene encoding SAT was identified via the analysis of the sng1-1, sng1-5, and sng1-6 mutants. Wild-type Arabidopsis plants and the sng1-1 mutant accumulate a number of sinapoylated anthocyanins (Bloor and Abrahams, 2002; Tohge et al., 2005; Fraser et al., 2007), but the sng1-5 and sng1-6 mutants do not. Both the sng1-5 and the sng1-6 mutants lack SNG1 and At2g23000, with the deletion harbored by the sng1-5 mutant spanning only these two genes (Fig. 12; Fraser et al., 2007). Transformation of the sng1-5 mutant with the SNG1 gene failed to restore the accumulation of sinapoylated anthocyanins, thereby demonstrating that At2g23000 is the sole SAT in Arabidopsis (Table 1). Since the initial identification of SAT, a naturally occurring deletion has been found in the Pna-10 accession that removes both SNG1 and At2g23000; this accession is deficient in sinapoylated anthocyanins and sinapoylmalate (Li et al., 2010a).
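The deletion-mapping argument above can be restated as simple set logic. The following is only an illustrative sketch: the function name and data layout are hypothetical, the gene lists are restricted to the four cluster genes named in the text, and SNG1 is treated as non-functional in the sng1-1 EMS allele. It also assumes that a single gene is both necessary and sufficient for the phenotype, which ignores possible redundancy.

```python
# Functional cluster genes in each line, as described in the text.
# Assumption: only the four genes named in the text are considered, and the
# EMS allele sng1-1 is modeled as "SNG1 present but non-functional".
FUNCTIONAL_GENES = {
    "wild type": {"SNG1", "At2g22980", "At2g23000", "At2g23010"},
    "sng1-1": {"At2g22980", "At2g23000", "At2g23010"},
    "sng1-5": {"At2g22980", "At2g23010"},  # deletion spans SNG1 and At2g23000
    "sng1-6": set(),                        # larger deletion removes all four
}

# Phenotype: does the line accumulate sinapoylated anthocyanins?
ACCUMULATES = {
    "wild type": True,
    "sng1-1": True,
    "sng1-5": False,
    "sng1-6": False,
}


def candidate_genes(functional, phenotype):
    """Return genes functional in every accumulating line and in no
    non-accumulating line (hypothetical helper for illustration)."""
    with_trait = [functional[line] for line, ok in phenotype.items() if ok]
    without_trait = [functional[line] for line, ok in phenotype.items() if not ok]
    all_genes = set().union(*functional.values())
    required = set.intersection(*with_trait) if with_trait else all_genes
    excluded = set().union(*without_trait)
    return required - excluded


print(sorted(candidate_genes(FUNCTIONAL_GENES, ACCUMULATES)))
```

Under these assumptions the only remaining candidate is At2g23000, which matches the conclusion drawn from the mutant comparison and the failed complementation with SNG1.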
In addition to expanding our knowledge of sinapate ester biosynthesis, studies of SNG1, SNG2, At2g22980, At2g23000, and At2g23010 have given researchers an excellent opportunity to study evolution among the enzymes of plant secondary metabolism. The SCPL proteins encoded by these genes are highly similar, functioning as acyltransferases that share the same activated donor molecule (sinapoylglucose) but whose acyl acceptor substrates differ (Fraser et al., 2007). Although a structure of an Arabidopsis SCPL sinapoylglucose acyltransferase has yet to be produced, modeling and experimental data have identified a number of candidate amino acid residues that may be involved in protein-substrate interactions, and suggest that SMT, SCT and the other Arabidopsis SCPL sinapoylglucose acyltransferases have evolved from a common hydrolytic ancestor (Lehfeldt et al., 2000; Shirley et al., 2001; Shirley and Chapple, 2003; Fraser et al., 2005; Stehle et al., 2006; Fraser et al., 2007; Stehle et al., 2008a; Stehle et al., 2009). Research on the other 46 Arabidopsis SCPL proteins is likely to shed further light on the role of enzymatic evolution in plant secondary metabolism, both in Arabidopsis and higher plants in general.
Arabidopsis thaliana is a convenient and robust system for the study of plant physiology and biochemistry. The simple genome and short generation time of Arabidopsis, combined with the relative ease with which it can be genetically modified and experimentally analyzed, have proven essential to progress in the plant sciences. The biosynthesis of sinapate esters has served as a reliable experimental indicator for disruptions in the phenylpropanoid pathway in Arabidopsis, and has once again demonstrated the benefits of choosing an experimentally convenient biological system for biochemical research. In light of the central importance of the phenylpropanoid pathway to plant secondary metabolism as a whole, and the promise that metabolic engineering holds for crop improvement, it is likely that discoveries made in Arabidopsis will have a significant translational impact on agriculture in the years to come.
Despite the advances that have been made, many significant questions remain unanswered with regard to phenylpropanoid metabolism, even in an experimental system as well-studied as Arabidopsis. For example, the synthesis of sinapate esters involves three subcellular compartments: the chloroplast as the site of Phe synthesis, the cytoplasm for the synthesis of sinapoylglucose from Phe, and the vacuole for the final transesterification step. Neither of the required transporters has yet been identified. How phenylpropanoid flux is controlled and measured is also unclear. Arabidopsis accumulates very reproducible levels of both sinapoylmalate and lignin, and yet how the levels of these compounds are sensed and fed back to as-yet-undiscovered homeostatic mechanisms is unknown. How the levels of lignin are controlled within particular bounds is particularly interesting when one considers that lignin is an extracellular insoluble polymer, a state not particularly amenable to measurement by known biological mechanisms. The transport issue is also pertinent to lignin biosynthesis, considering that the (presumed) proteins that transport monolignols to the apoplast remain unidentified. Finally, the perennial question of the role of so-called secondary metabolites in the biology of plants remains open and, based upon the number of phenylpropanoid metabolites in Arabidopsis alone, will keep researchers busy for many years to come.
 1 A comprehensive discussion of shikimate metabolism can be found in The Arabidopsis Book chapter by Tzin and Galili (2010). This brief introduction is included for completeness.
 2 An additional discussion of P450s can be found in The Arabidopsis Book chapter by Werck-Reichhart et al. (2002).