The 55 Arabidopsis glutathione transferases (GSTs) are, with one microsomal exception, a monophyletic group of soluble enzymes that can be divided into phi, tau, theta, zeta, lambda, dehydroascorbate reductase (DHAR) and TCHQD classes. The populous phi and tau classes are often highly stress inducible and regularly crop up in proteomic and transcriptomic studies. Despite much study on their xenobiotic-detoxifying activities their natural roles are unclear, although roles in defence-related secondary metabolism are likely. The smaller DHAR and lambda classes are likely glutathione-dependent reductases, the zeta class functions in tyrosine catabolism and the theta class has a putative role in detoxifying oxidised lipids. This review describes the evidence for the functional roles of GSTs and the potential for these enzymes to perform diverse functions that in many cases are not “glutathione transferase” activities. As well as biochemical data, expression data from proteomic and transcriptomic studies are included, along with subcellular localisation experiments and the results of functional genomic studies.
Glutathione transferases (GSTs; EC 2.5.1.18) are enzymes that typically add, or substitute, the non-ribosomally synthesised tripeptide glutathione (GSH; γ-Glu-Cys-Gly) to an electrophilic centre contained within a small molecule acceptor. Arabidopsis contains 54 soluble GSTs and one membrane protein associated with this activity. The soluble GSTs have an ancient monophyletic origin shared with the respective enzymes from nearly all other eukaryotic and prokaryotic species. These GSTs share a similar structural biology based on the thioredoxin/glutaredoxin fold that can be considered as comprising the glutathione transferase superfamily. The membrane associated microsomal, or MAPEG, GST is evolutionarily distinct and will be considered separately later. The description of GSTs as enzymes solely catalysing glutathione conjugation reactions is misleading. As will be illustrated through examining the functional genomics of the GSTs in Arabidopsis, these proteins have assumed additional GSH-dependent and GSH-independent catalytic and binding functions. The diversification in function of these enzymes in plants has recently been comprehensively reviewed (Dixon et al., 2010).
To orientate the reader, the following represents a short history of the discovery and study of plant GSTs, with key references included. GSTs active in herbicide metabolism were first described in plants in 1970 and became a well established determinant of selectivity in crops and weeds (Edwards and Dixon, 2000). For reasons which are still not known, major crops such as rice, wheat, maize, sorghum and soybean all contain much higher levels of herbicide detoxifying GSTs than competing weeds, thereby providing a powerful platform for directing selective chemical weed control based on relative rates of detoxification. Early studies concentrated on the characterization of these herbicide detoxifying enzymes, first in maize and then subsequently in wheat and soybean (Edwards and Dixon, 2000). However, in the late 1980s and early 1990s GSTs began to be identified in increasing numbers as stress responsive proteins which accumulated during biotic and abiotic stress (Marrs, 1996; Dixon et al., 2002a; Frova, 2003; Moons, 2005; Frova, 2006; Dixon et al., 2010). This began a new era of attempting to assign functions for GSTs in endogenous metabolism and development, which continues to this day. In 2000, the completion of the Arabidopsis sequencing project and the large-scale development of associated genomic tools massively enhanced our capability to study the GST superfamily. This included attempts to study the multiplicity of GST genes in a systematic way for the first time (Wagner et al., 2002; Dixon et al., 2009). In addition, the publication of DNA array data showed GSTs to be among the most responsive of genes to stress and chemical signalling treatments and to have unexpected patterns of co-regulation, for example with plant secondary metabolism (Glombitza et al., 2004).
Similarly, undirected proteomic studies in Arabidopsis have identified GST family members as being both relatively abundant and associated with a number of subcellular compartments (Sappl et al., 2004; Smith et al., 2004; Edwards et al., 2005; Gruhler et al., 2005; Dixon and Edwards, 2009). These studies have most recently been complemented by attempts to test for GST functions through systematic reverse-genetic approaches (Sappl et al., 2009). Useful additional web based resources relating to this chapter can be found linked to the TAIR gene family page - http://www.arabidopsis.org/browse/genefamily/gst.jsp. Readers interested in assaying for the GST activities described in the text are referred to Methods in Enzymology (Edwards and Dixon, 2005).
Arabidopsis GST nomenclature
KNOWN CATALYTIC ROLES, XENOBIOTIC DETOXIFICATION
The study of the Arabidopsis GSTs has benefited greatly from our biochemical understanding of the activities of these enzymes in other plants. Much of the early work on plant GSTs focussed on their important role in herbicide detoxification, and as such the conjugation of xenobiotics (Edwards and Dixon, 2000). Since most herbicide conjugation assays require chromatographic sampling, a good deal of work in studying the GST-mediated conjugation of xenobiotics has concentrated on the use of simple colorimetric assays to measure activity. The best known of these is the GST-catalyzed substitution of glutathione for the chloro group of the xenobiotic 1-chloro-2,4-dinitrobenzene (CDNB), which was developed as a marker for similar detoxification activities observed with mammalian GSTs (Habig et al., 1974).
The xenobiotic-detoxifying activity implied through this CDNB assay has been applied in testing many GSTs from Arabidopsis and other plant species for their potential to metabolize herbicides and other crop protection agents (Edwards and Dixon, 2000). However, it is clear that this xenobiotic conjugating activity may have little to do with the endogenous roles of these enzymes in plants, which are not exposed to synthetic chemicals. Similarly, many GST superfamily members have no, or only very low, activity when assayed with CDNB as substrate, so this is by no means a universal assay to assess whether or not a given enzyme is catalytically active. This observation is clearly reinforced by the range of activities observed in recent surveys of the GST family in Arabidopsis (Dixon et al., 2009) and poplar (Lan et al., 2009). Other GST-associated activities have been postulated (Dixon et al., 2010), including intracellular transport of small molecules such as flavonoids, transient glutathione conjugation to protect reactive metabolites such as porphyrinogens and oxylipins, introduction of sulfur into secondary metabolites such as glucosinolates, and cis-trans isomerisation reactions. In the majority of cases, these activities have been inferred through informatic approaches or biochemical precedent. More rarely, genetic studies have further confirmed this diversification in function (Dixon et al., 2010). However, the size of the plant GST superfamily, the frequency of gene duplication and the associated likelihood of functional redundancy complicate functional genomic studies aimed at defining loss of function through reverse-genetic approaches.
Prior to 2000, an ad-hoc naming system for GSTs developed which led to some confusion in the literature. Since 2000, Arabidopsis and other plant GSTs have had a stable, unified system that, if adhered to, minimises ambiguity. The classification of the soluble GSTs into classes is based on the sequence similarity and gene structure of family members, with each class being assigned a Greek letter. The nomenclature is a logical extension of the naming system used for mammalian GSTs and takes the form XxGSTYn, where Xx is the species designation (e.g. At for Arabidopsis), Y is the single letter class code derived from the Greek letter class designation (see Table 1) and n is the isoenzyme's number within the class. Therefore AtGSTU19 is an Arabidopsis tau class GST, and the 19th to be named. The numbering is typically based on order of discovery. However, the mapping of the Arabidopsis genome has allowed the GSTs to be numbered according to their organization and position within the component chromosomes, with clustered GSTs being given contiguous numbers. Since soluble GSTs exist as dimers and since such dimers are to date only found between isoenzymes of the same class, nomenclature for proteins is of the form XxGSTYn1-n2, where n1 and n2 are the numbers of the two subunits. For example, ZmGSTF1–2 is a heterodimer between maize (Zea mays) GSTF1 and GSTF2 subunits.
Since the remainder of this chapter will concentrate on Arabidopsis GSTs only, the proteins will be referred to by their class and number alone, with the At prefix omitted.
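The naming convention described above is regular enough to be checked mechanically. The following is a minimal sketch in Python, not an official tool: the function name `parse_gst_name`, the `GSTName` record and the regular expression are illustrative assumptions, and the class-letter table covers only the letter codes that appear in this chapter (F, U, T, Z and L). DHAR and TCHQD names do not follow the GSTYn pattern and are deliberately rejected.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Single-letter class codes used in this chapter (assumed mapping, taken
# from names such as GSTF = phi, GSTU = tau, GSTT = theta, GSTZ = zeta,
# GSTL = lambda).
CLASS_CODES = {"F": "phi", "U": "tau", "T": "theta", "Z": "zeta", "L": "lambda"}

# Optional two-letter species prefix, "GST", class letter, subunit number,
# and an optional second subunit number after a hyphen or en dash.
_NAME = re.compile(
    r"^(?P<species>[A-Z][a-z])?GST(?P<cls>[A-Z])(?P<n1>\d+)"
    r"(?:[-\u2013](?P<n2>\d+))?$"
)


class GSTName(NamedTuple):
    species: Optional[str]      # e.g. "At", or None if the prefix is omitted
    gst_class: str              # Greek-letter class name, e.g. "tau"
    subunits: Tuple[int, ...]   # one number (gene/subunit) or two (dimer)


def parse_gst_name(name: str) -> GSTName:
    """Split a name of the form XxGSTYn or XxGSTYn1-n2 into its parts."""
    match = _NAME.match(name.strip())
    if match is None:
        raise ValueError(f"not a GSTYn-style name: {name!r}")
    code = match.group("cls")
    if code not in CLASS_CODES:
        raise ValueError(f"unknown class code {code!r} in {name!r}")
    first = int(match.group("n1"))
    second = match.group("n2")
    subunits = (first,) if second is None else (first, int(second))
    return GSTName(match.group("species"), CLASS_CODES[code], subunits)


if __name__ == "__main__":
    for example in ("AtGSTU19", "ZmGSTF1\u20132", "GSTZ1"):
        print(example, "->", parse_gst_name(example))
```

Under these assumptions, AtGSTU19 parses as an Arabidopsis tau class subunit numbered 19, and ZmGSTF1–2 as a maize phi class dimer of subunits 1 and 2. Names with the species prefix omitted, as used in the rest of this chapter, are accepted with the species field left empty.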
The Arabidopsis genome contains a total of 55 GST genes which can be divided into 8 classes (Table 1). Of these genes, at least 52 are transcribed, and 41 of the respective proteins have been shown to possess GSH-dependent catalytic activities (Bresell et al., 2005; Dixon et al., 2009). Most classes are small and possess 1 to 4 members. However, the phi and tau classes have undergone repeated gene duplication events, resulting in the generation of 13 and 28 members respectively. This gene duplication is evident in the genome, with GST genes often occurring in clusters (Figure 1). While the small zeta, theta and microsomal classes are widely distributed in eukaryotes, the remainder are plant specific. A phylogenetic analysis of the GST polypeptide sequences (Figure 2) clearly shows the clustering by GST class, and, apart from a link between the lambda and DHAR enzymes, there is no obvious higher order organisation.
To date, only 2 Arabidopsis GSTs have had their crystal structures solved using X-ray diffraction analysis. The structure of the phi GSTF2 has been solved in complex with the affinity ligand S-hexylglutathione and a glutathionylated herbicide reaction product (Reinemer et al., 1996; Prade et al., 1998). In addition, the structure of GSTZ1 has been solved in its apo form (Thom et al., 2001). Despite their sequence diversity, the solved structures of these soluble Arabidopsis GSTs are very similar (Dixon et al., 2002a), showing very similar overall protein folds (Figure 3). Based on studies with other GSTs, it is very likely that enzymes within a class form very similar structures. In particular, structures for phi (Prade et al., 1998) and tau class GSTs (Thom et al., 2002) from other plant species should provide good models for the corresponding classes of GST in Arabidopsis.
Each GST consists of an N-terminal, GSH-binding domain (G-site) which is structurally similar to, and has evolved from, the thioredoxin fold (Atkinson and Babbitt, 2009). Addition of a C-terminal alpha-helical domain, providing the majority of the binding site for the hydrophobic co-substrate (H-site), results in a complete GST polypeptide. Looking beyond the structures of the Arabidopsis GSTs, the large number of solved crystal structures for these proteins (the majority of which are mammalian GSTs) are all remarkably similar, despite the considerable diversity in primary sequence between GSTs. Structural differences relate to the lengths of component peptide loops and helices and the presence, or absence, of N- and C-terminal extensions (Thom et al., 2002). For each soluble GST, two monomers associate to form a dimer, which possesses a central cleft containing an active site on each side. In each case dimerisation appears to be essential in conferring enzyme activity to the GST, even though each monomer theoretically provides a more-or-less catalytically independent active site. The G-site is very specific, only accepting GSH or other closely related gamma-glutamyl linked peptides due to the multiple selective binding interactions present in this highly conserved binding pocket. The co-substrate binding site (H-site) is positioned adjacent to the G-site and is often hydrophobic in nature and sufficiently open to accommodate a diverse range of substrates and ligands. Catalysis is dependent on stabilising the reactive thiolate anion of GSH. This sulfhydryl group has a pKa of 9.4 (Tajc et al., 2004), so to lower this value and favor anion formation at physiological pH, GSTs promote proton abstraction from the thiol of GSH using an active site residue coupled to a hydrogen bonding network. In plant GSTs this residue is a serine, located near the N-terminus of the GST polypeptide.
GSTs without this serine (which is commonly substituted by a tyrosine in non-plant GSTs) should not be active. Certain GSTs instead have a cysteine residue at this position which completely changes the character of the enzyme as described later. Based on the activation of the thiolate anion, the GSH acts as a nucleophile and will readily react with available ‘soft’ electrophiles.
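The importance of lowering the thiol pKa can be illustrated with the Henderson–Hasselbalch relationship, under which the fraction of a thiol present as thiolate is 1 / (1 + 10^(pKa − pH)). The short Python sketch below is illustrative only: the function name `thiolate_fraction` is hypothetical, a physiological pH of 7.0 is assumed, and the "enzyme-bound" pKa of 6.5 is an arbitrary value chosen for the comparison rather than a measured figure for any Arabidopsis GST. Only the free GSH pKa of 9.4 comes from the text.

```python
def thiolate_fraction(pka: float, ph: float) -> float:
    """Fraction of a thiol in the deprotonated (thiolate) form.

    Rearranged Henderson-Hasselbalch equation:
    [S-] / ([SH] + [S-]) = 1 / (1 + 10 ** (pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    PH = 7.0                 # assumed physiological pH
    FREE_GSH_PKA = 9.4       # value quoted in the text (Tajc et al., 2004)
    HYPOTHETICAL_PKA = 6.5   # illustrative lowered pKa, not a measured value

    free = thiolate_fraction(FREE_GSH_PKA, PH)
    bound = thiolate_fraction(HYPOTHETICAL_PKA, PH)
    print(f"free GSH (pKa {FREE_GSH_PKA}): {free:.2%} thiolate")
    print(f"lowered pKa {HYPOTHETICAL_PKA}: {bound:.2%} thiolate")
```

With these inputs, free GSH is only about 0.4% thiolate at pH 7.0, whereas a thiol whose pKa had been shifted below the ambient pH would be predominantly ionised. This is the quantitative sense in which active site stabilisation of the thiolate favours nucleophilic attack.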
Structural and biochemical studies have additionally identified non-active site binding sites for small ligands. For example, GSTF2 crystallised with S-hexylglutathione showed two ligand molecules bound per monomer, one in the active site and one adjacent to it (Reinemer et al., 1996). Biochemical analysis of the same protein provided evidence for different binding sites for naphthalic acid and indole acetic acid, although neither hormone bound with high affinity (Smith et al., 2003). However, the presence of this additional binding site is potentially very significant, for example by allowing regulatory ligands to modulate GST activity. It is possible to conceive of functional cross-talk between different molecules, for example with flavonoids regulating auxin transport (Smith et al., 2003), or these proteins co-transporting dissimilar ligands. The physiological relevance of these multiple binding sites has not been proven, with those binding interactions described to date being relatively weak.
There is very little evidence for post-translational modification of Arabidopsis GSTs, with the respective recombinant proteins expressed in tobacco recovered as the native unmodified polypeptides (Dixon et al., 2009). Arabidopsis GSTs have not, to our knowledge, been identified in the extensive glycoproteome studies conducted, although phosphopeptides annotated in the PhosPhAt database as from GSTs U12, U19, U22, F8, F9, DHAR2 and TCHQD have been described (Durek et al., 2010). These annotations have not been verified by more directed studies. In contrast, several GSTs are readily and reversibly modified by GSH to form mixed disulfides, in particular GSTs with an active site cysteine, notably the GSTLs and DHARs (Dixon et al., 2002b). A more indirect screen for S-glutathionylated (thiolated) Arabidopsis proteins also identified GSTZ1, GSTF7 and GSTU19 as undergoing this modification (Dixon et al., 2005). Intriguingly, GSTF10, a protein lacking cysteine, was also identified in this screen, apparently due to its ability to form heterodimers with GSTF7, showing that these polypeptides do form heterodimers in planta (Dixon et al., 2005). Gel-based proteomic studies often identify multiple spots corresponding to individual GSTs (Sappl et al., 2004; Edwards et al., 2005; Gruhler et al., 2005; Dixon and Edwards, 2009). At least some of these are likely to be artefacts formed by the differential oxidation of cysteine and methionine residues, or to result from partial protein degradation. Whether any of these spots arise from post-translationally modified polypeptides remains to be determined.
Arabidopsis GSTs crop up near ubiquitously in proteomic studies due to their abundance, solubility, low molecular weight and inducibility in response to a wide range of stresses. GSTs tend to be hydrophobic proteins and this sticky nature, coupled with their abundance, means that identification of GSTs in numerous proteome studies of chloroplasts (Zybailov et al., 2008), mitochondria (Heazlewood et al., 2004) and vacuoles (Carter et al., 2004) may in some cases be due to non-specific associations with membrane rich fractions. GFP fusions of GSTs tend to accumulate in the cytosol (Dixon et al., 2009), although individual members have shown localisation to the chloroplast, nucleus, peroxisome and plasma membrane as discussed in more detail later. Proteomic studies directed at sampling GSTs have utilised affinity chromatography (glutathione and S-hexylglutathione affinity media) to capture a substantial number of isoenzymes (Sappl et al., 2004; Dixon and Edwards, 2009). However, results are strongly dependent on the affinity resin used (Dixon and Edwards, 2009), and as such can fail to purify the isoenzymes that do not tightly bind to these supports, thereby under-estimating their relative abundance. In total, at least 34 GST family members have been identified in GST-directed and total proteomic studies (Table 2), confirming the expression of the majority of isoenzymes. Those isoenzymes occurring most frequently in proteomic studies can be considered to be highly expressed isoenzymes, although this will be skewed in favour of those enzymes that are more efficiently purified, at least for the GST-directed studies.
As with proteomic studies, GST transcripts are routinely identified as being strongly upregulated in stress studies, although these correlations have generally not led to any major new insights into defining GST function. Individual studies focusing on a subset of GSTs have again shown the strong inducibility of many of the transcripts (DeRidder et al., 2002; Wagner et al., 2002; Sappl et al., 2009). It is clear that rather than showing co-ordinate regulation, each GST shows a distinct pattern of stress responsiveness. Similarly, analysis of multiple microarray experiments has allowed the overall patterns of GST expression to be estimated (Dixon et al., 2010), again showing diverse regulation of GST transcripts (Figure 4). In order to mine the microarray data for information regarding GST function, genes showing co-regulation with each GST transcript were extracted (Dixon et al., 2010) and links with flavonoid metabolism, glucosinolate and phytoalexin synthesis and defence response were identified. For example, GSTF12, already implicated in anthocyanin transport (see later), showed close co-regulation with other anthocyanin biosynthetic genes. Similarly, transcripts for GSTs U3/U4 and GSTs U11/U12 (microarray probes were unable to distinguish between the paired genes) showed co-regulation with indole glucosinolate synthetic enzymes, and GSTF10, GSTF11 and GSTU20 transcripts showed co-regulation with aliphatic glucosinolate biosynthesis.
FUNCTIONS BY CLASS
Arabidopsis has two genes encoding GSTZs, which were originally classified as “Type II” GSTs. Of the two genes only GSTZ1 (At2g02390) appears to be transcribed at any significant level. A second adjacent gene (GSTZ2; At2g02380) is presumed to be a pseudogene, although it has been identified in a single proteomic study (Table 2). GSTZs are highly conserved in eukaryotes, indicative of their important and central function in cell metabolism. The identification of a fungal zeta GST as the remaining missing step in the catabolism of tyrosine led to the discovery of the enzyme's activity in catalysing the isomerisation of maleylacetoacetate to fumarylacetoacetate (Fernández-Cañón and Peñalva, 1998). GSTZ1 also efficiently catalyses this reaction, and the presence of active upstream and downstream enzymes in the tyrosine catabolic pathway in Arabidopsis provides support that this pathway also operates in planta, even though technically it is not required (Dixon et al., 2000; Dixon and Edwards, 2006). At first glance, the GSH-dependent isomerase reaction appears very different from the canonical GST reactions. However, the reaction mechanism is in fact closely related. As with typical GST reactions, the activated GSH adds to the cis double bond of maleylacetoacetate, allowing its free rotation. Subsequent elimination of GSH then forms fumarylacetoacetate, with its trans-configured double bond. GSTZ1 can also catalyse the glutathione-dependent dehalogenation of dichloroacetic acid to glyoxylic acid, which can then enter primary metabolism (Dixon et al., 2000). This is a rare example of GST activity resulting in the complete recycling of a xenobiotic, and is also unusual in utilising GSH catalytically rather than as a substrate.
Summary of glutathione transferase genes, loci and functions.
Solution of the Arabidopsis GSTZ1 crystal structure (Thom et al., 2001) confirmed that this enzyme folded to form a typical GST dimeric structure (Figure 3) and that Ser17 in the conserved triad Ser17-Ser18-Cys19 was correctly placed to stabilise the anionic GSH. This was backed up by mutagenesis experiments, which also showed that Cys19 was involved in the catalytic mechanism, although it was not essential (Thom et al., 2001). Reflecting the polar nature of the endogenous substrates, the H-site of GSTZ1 was more polar than typical for GSTs and contained basic residues to promote binding of the acid groups on maleylacetoacetate.
The dehydroascorbate reductases (DHARs) are unique in being a plant-specific GST class with defined endogenous functions. Through biochemical assays, these enzymes have been shown to efficiently catalyse the reduction of dehydroascorbate to ascorbate, with the concomitant oxidation of reduced GSH to glutathione disulfide (Shimaoka et al., 2000; Urano et al., 2000; Dixon et al., 2002b). Unlike most other GSTs, DHARs have an active site cysteine instead of serine/tyrosine, so rather than stabilising the thiolate anion of GSH, this residue instead forms a mixed disulfide with GSH as part of the catalytic mechanism (Dixon et al., 2002b). As such, DHARs are unable to catalyse typical GST conjugating reactions using GSH. Low level DHAR activity has also been determined in unrelated enzymes, due, it seems, to the presence of reactive cysteine residues. This in turn has led to a debate as to whether or not this reductase activity is non-specific, and over the relative importance of DHARs in maintaining the reduced ascorbate pool (Morell et al., 1997; Foyer and Mullineaux, 1998). More recent studies have shown that true GST-like DHARs have activities orders of magnitude higher than these non-specific reductases, and it is now clear through knock-out and over-expression studies that DHARs have an important role in plants, particularly when they are subjected to oxidative stress (Chen and Gallie, 2006; Yoshida et al., 2006).
Arabidopsis has 5 DHAR-like genes of which 3 are transcribed and encode functional proteins, namely DHAR1 (At1g19570), DHAR2 (At1g75270) and DHAR3 (At5g16710). Additionally, DHAR4 (At5g36270) appears to be a pseudogene encoding a full-length but inactive enzyme, while At1g19950 corresponds to an untranscribed region encoding an N-terminally truncated, hence inactive enzyme. Earlier genome annotations showed At5g16705 as a protein with a DHAR domain, but this erroneous gene model has since been corrected. Confusingly, DHAR1 has also been described as DHAR5 in a recent paper (Vadassery et al., 2009). DHAR activity is an important part of the ascorbate-glutathione cycle (Noctor and Foyer, 1998) and as such should be co-localised with compartments where these coupled redox reactions are needed to maintain pools of reductants. Consistent with such a role, there is good evidence that DHARs are found in multiple subcellular organelles. DHAR1 peptides have been reported in mitochondrial preparations (Chew et al., 2003), though this is surprising based on the absence of a respective N-terminal signal peptide. Interestingly, later studies have clearly identified this enzyme as being peroxisome-targeted (Reumann et al., 2009). DHAR3 has a clear N-terminal signal peptide that is presumed to provide chloroplast/mitochondrial targeting (Dixon et al., 2002b; Chew et al., 2003). Such a localisation has been confirmed by identifying DHAR3 in the chloroplast proteome (Zybailov et al., 2008), though in vitro transport assays failed to show such import into either chloroplasts or mitochondria (Chew et al., 2003). Perhaps surprisingly for genes encoding proteins which counteract oxidising conditions, within the class only DHAR2 transcripts have been shown to accumulate in response to stress (Dixon et al., 2002b; Yoshida et al., 2006).
At the protein level, DHARs behave differently from most other GSTs in being expressed as monomers rather than as dimers. Thus, gel filtration experiments showed that DHAR1 and DHAR3 both eluted with a retention time consistent with being monomers (Dixon et al., 2002b), though this unusual behaviour has not yet been validated by other techniques. Intriguingly, DHARs are the closest plant homologue to intracellular chloride channels (CLICs), which are mammalian GST-like proteins that are peculiar in existing as either soluble forms or as membrane-associated ion channels (Cromer et al., 2002). When DHAR1 was expressed with a C-terminal GFP fusion in mammalian cells, a small proportion localised to the microsomal fraction and led to an increase in non-specific membrane conductance (Elter et al., 2007). Although this increased ion conductance could not be definitively linked to a direct effect conferred by DHAR1, these results do raise the possibility of an unusual additional function for these proteins in Arabidopsis.
Arabidopsis contains three lambda GSTs, namely GSTL1 (At5g02780), GSTL2 (At3g55040) and GSTL3 (At5g02790), which in many ways resemble the DHAR class (Dixon et al., 2002b). Thus, like DHARs, all the GSTLs contain a conserved active site cysteine residue and so cannot catalyse typical GST reactions. Instead, this cysteine residue is sufficiently reactive to readily form mixed disulfides with GSH (Dixon et al., 2002b). While their true substrates are unknown, based on this redox chemistry it is likely that GSTLs catalyse the GSH-dependent reduction of small molecules. To date, the only activity observed is the glutathione-dependent reduction of a mercaptoethanol-GSH mixed disulfide (Dixon et al., 2002b). GSTL1 and GSTL3 are presumed to be cytosolic, while GSTL2 has a clear N-terminal transit peptide (Dixon et al., 2002b) and has been identified in chloroplast proteome studies (Zybailov et al., 2008). When this N-terminal extension is replaced by GFP, the resulting fusion localises to both the cytosol and peroxisomes, raising the possibility of dual targeting (Dixon et al., 2009). This multiplicity of targeting is similar to that observed for the DHARs and suggests that the endogenous substrate(s) for GSTLs occur in a range of cellular compartments. As found for DHARs, GSTL1 and GSTL2 behaved as monomers by gel filtration (Dixon et al., 2002b). From public microarray data and PCR data (Dixon et al., 2002b), GSTL1 shows low transcript levels in unstressed tissue but transcripts are strongly stress inducible. In contrast, GSTL2 and GSTL3 show modest constitutive expression only. Again, this stress-inducibility of a single class member is very similar to the situation with the DHARs.
Along with the zeta class, the theta class of GSTs is conserved between plants and animals. Based on the pre-2000 GST classification system, which tended to focus on the respective mammalian proteins, many non-mammalian GSTs, including plant enzymes now reclassified as phi enzymes, were collectively termed theta class enzymes. This point is important to note when referring to the older literature, as it is now clear that the true theta class enzymes form a discrete group of enzymes with conserved properties and perhaps conserved functions. Like many plant GSTs, but unlike most mammalian GSTs, theta class enzymes have an active site serine residue. In both Arabidopsis (Dixon et al., 2001; Dixon et al., 2009) and man (Tan and Board, 1996), GSTTs show some activity toward typical xenobiotic substrates, but are particularly efficient as glutathione-dependent peroxidases (GPOX), using GSH to reduce organic hydroperoxides to alcohols.
Arabidopsis GSTTs were first named GST 10, 10b and 10c and placed in a new “Type IV” grouping (Dixon et al., 1999). On publication of the genome sequence, it could be seen that Arabidopsis had three clustered GSTT genes, GSTT1 (At5g41210), GSTT2 (At5g41240) and GSTT3 (At5g41220) (Wagner et al., 2002). Each encodes a GST with high GPOX activity toward both artificial substrates, such as cumene hydroperoxide, and more likely endogenous fatty acid oxidation products, such as linoleic acid hydroperoxide and linolenic acid hydroperoxide (Dixon et al., 2009). Each GSTT has a high pI (pH 8.9 – 9.5) and contains a C-terminal peroxisome targeting signal. Targeting to the peroxisome has been confirmed using the respective N-terminal GFP-GSTT fusion proteins (Reumann et al., 2007; Dixon et al., 2009) and through peroxisome proteomic studies (Reumann et al., 2007; Eubel et al., 2008). Peroxisomal metabolism generates large amounts of hydrogen peroxide which will oxidatively damage components within the organelle if allowed to accumulate. It is highly likely that the lipid hydroperoxides formed would act as the substrates for peroxisomal GSTTs.
Arabidopsis GSTT2 and GSTT3 genes are unusual in also encoding much larger proteins arising through alternative splicing. In both cases, transcripts are produced encoding fusion proteins with an N-terminal GSTT domain and a C-terminal domain resembling myb-like transcription factors. This fusion masks the peroxisome targeting signal and in the case of GSTT3, the resulting fusion protein is localised exclusively to the nucleus where it accumulates with a punctate distribution (Dixon et al., 2009). No other plant or mammalian GSTT appears to form similar fusion proteins and the significance of this localisation in Arabidopsis is unclear. Roles in modulating gene transcription under oxidative stress conditions, or detoxifying oxidatively damaged DNA have been postulated (Dixon et al., 2009). The three GSTTs have very similar sequences precluding differentiation in microarray studies, but the expression of this class of GSTs is not strongly altered between tissues or by stress.
The phi GSTFs are a large, plant-specific class of proteins and have been the subject of detailed studies. GSTFs have in the past been placed in the theta class, or called “Type I” GSTs. With one exception, very little is known about the function of phi GSTs in Arabidopsis, with individual enzymes appearing to be non-essential for normal growth, based on the lack of phenotype observed in knock-out lines (Sappl et al., 2009). Even when multiple phi GSTs (GSTs F6, F7, F9 and F10) were knocked down using RNAi, no overt phenotype could be shown even when the plants were stressed (Sappl et al., 2009). However, these studies did reveal subtle changes in metabolite levels that could be linked to decreased tolerance to oxidative stress. This potential functional redundancy, compounded by the size of this class and the non-essential nature of their roles has hampered the functional characterisation of GSTFs.
The first GSTF sequence isolated from Arabidopsis (Bartling et al., 1993), now named GSTF1, is actually not present in this plant's reference genome. The enzyme polypeptide sequence is most similar to enzymes from fungi and amoebae, suggesting that the original sequence was derived from a co-cultivated pathogen. Thus, “AtGSTF1” is unlikely to be a true Arabidopsis GST, but its name has nevertheless been retained until the source of the sequence is clarified. After disregarding GSTF1, the Arabidopsis genome can be seen to contain 13 GSTFs, numbered GSTF2 to GSTF14. GSTF9 and GSTF10 form a tandem array on chromosome 2 while GSTs F4, F5, F6 and F7 form a tight cluster on chromosome 1. The remaining 7 GSTF genes are present as singletons.
GSTF2 (At4g02520) is the best studied of the Arabidopsis GSTs, with respect to its biochemical properties (Wagner et al., 2002; Dixon et al., 2009), localisation and interaction with flavonoids (Smith et al., 2003), and crystal structure (Reinemer et al., 1996; Prade et al., 1997; Prade et al., 1998). This GST has been reported to be localised to the plasma membrane (Zettl et al., 1994; Smith et al., 2003), cytosol (Dixon et al., 2009) and, despite the lack of a signal peptide, the chloroplast (Armbruster et al., 2009), depending on the methodology used. GSTF2 transcripts are strongly inducible, for example by oxidative stress and following treatment with phytohormones (Zhou and Goldsbrough, 1993; Smith et al., 2003; Mang et al., 2004). GSTF2 expression and localisation are also disrupted in flavonoid-deficient mutants (Smith et al., 2003). These studies also demonstrated that GSTF2 can bind flavonoids as ligands. GSTF2 was originally identified due to its ability to be labelled by azido-indole acetic acid (Zettl et al., 1994). These studies showed that GSTF2 was able to bind to indole acetic acid, 1-naphthaleneacetic acid and the auxin transport inhibitor 1-N-naphthylphthalamic acid, albeit with low affinity in each case. Studies with the Brassica juncea orthologue, BjGSTF2, showed that the over-expression of this gene in Arabidopsis, or the use of an antisense construct to knock down AtGSTF2 expression, led to alterations in flowering time, stress resistance and shoot regeneration (Gong et al., 2005).
GSTF3 (At2g02930) is almost identical to GSTF2, but based on transcript abundance is expressed at much lower levels than GSTF2 (Lieberherr et al., 2003; Smith et al., 2003), and has not been studied in detail.
GSTF4 (At1g02950) and GSTF5 (At1g02940) are very similar to one another and clustered on the same chromosome, suggesting an origin in gene duplication. Both proteins are unusual in containing an N-terminal extension as compared to other GSTs. GSTF4 has a 23 residue extension while GSTF5 has a 35 residue extension that contains 4 cysteine residues, perhaps providing a metal ion binding function. These extensions may act as targeting sequences, although this has not been tested directly.
GSTF6 (At1g02930) and GSTF7 (At1g02920) are very similar in sequence and are clustered with GSTF4 and GSTF5 on chromosome 1. They are not distinguishable from each other in microarray experiments, but PCR experiments have shown that the two GSTs are similarly responsive to stress treatments (Lieberherr et al., 2003), being strongly and rapidly induced by treatment with avirulent Pseudomonas syringae. Other PCR experiments, together with proteomic data, have shown GSTF7 to be less abundant, but more strongly SA-inducible, than GSTF6 (Sappl et al., 2004). GSTF6, along with GSTF5 and GSTF12, was up-regulated when anthocyanin pigment synthesis was up-regulated (see below for GSTF12). Unlike GSTF12, GSTF6 does not have a direct role in anthocyanin biosynthesis, as a validated gstf6 knock-out line failed to show any disruption in anthocyanin content (Wangwattana et al., 2008).
GSTF8 (At2g47730) is a major Arabidopsis GST and while containing a clear N-terminal chloroplast targeting peptide, it has been shown that the majority of GSTF8 transcripts are spliced such that this targeting peptide is removed, with the resulting protein remaining in the cytosol (Thatcher et al., 2007). GSTF8 is strongly induced following exposure to H2O2 (Chen et al., 1996), pathogen infection (Jones et al., 2004; Perl-Treves et al., 2004) and salicylic acid (Chen and Singh, 1999; Uquillas et al., 2004), the latter being independent of NPR1 signalling. This inducibility can be partly ascribed to an ocs element in the promoter of the GSTF8 gene, which exerts its effect mainly in root tissue (Chen and Singh, 1999) and can be suppressed by prior chemical treatments (Foley et al., 2006). As an enzyme, GSTF8 is by far the most active Arabidopsis enzyme in its class when assayed with CDNB (Dixon et al., 2009). With respect to potential natural substrates, GSTF8 has also been shown to catalyse the reversible glutathione conjugation of the oxylipin (15Z)-12-oxophyto-10,15-dienoic acid (OPDA), which is a signalling agent released on wounding (Dueckershoff et al., 2008). GSTF8 has also been assayed with a range of other GST substrates (Wagner et al., 2002). These features make GSTF8 one of the major contributors, along with GSTU19, to glutathione conjugating activity toward CDNB as determined in crude Arabidopsis extracts.
The closely related GSTF10 (At2g30870) was originally isolated as a dehydration-induced transcript and termed ERD13 (Kiyosue et al., 1993). More recently GSTF10 was identified through a yeast two-hybrid screen as a binding partner of Brassinosteroid-Insensitive 1-Associated Kinase 1 (BAK1) (Ryu et al., 2009), which is a leucine-rich repeat receptor-like kinase that appears to have a role in perception of hormone, biotic and abiotic signals (Chinchilla et al., 2009). The functional significance of this interaction is unclear. However, RNAi-based knock down of GSTF9 and GSTF10 gene expression resulted in a more compact rosette, increased anthocyanin levels and lower tolerance to exposure to salt and chemical treatments such as N-acetylcysteine. While GSTF10 transcripts were generally unresponsive to stress treatments, they were upregulated in response to drought (Ryu et al., 2009).
GSTF11 (At3g03190), GSTF12 (At5g17220) and GSTF14 (At1g49860) do not have a serine at the expected active site residue and so should not be able to catalyse typical GST reactions. Remarkably, GSTF12 is one of the best functionally characterised GSTs in any plant, since the mutation leading to its loss of expression results in the TRANSPARENT TESTA 19 phenotype (Kitamura et al., 2004; Kitamura, 2007). Although the associated biochemistry is unclear, this GST appears to promote transport of anthocyanins and proanthocyanidins from the cytosol into the vacuole, with GSTF12 transcription being closely co-regulated with other anthocyanin synthetic genes (Wangwattana et al., 2008). Since any involvement of GSTF12 in conjugation reactions is biochemically very unlikely, by analogy with similarly functioning GSTs in other plants (Mueller et al., 2000) it would seem more probable that this protein binds flavonoids, presumably also binding GSH, and delivers them to an ABC transporter which transports secondary metabolites into the vacuole. It has been proposed that the binding of GSTF12 is specific for anthocyanins esterified with coumaric acid (http://arabidopsis.org/servlets/TairObject?type=publication&id=501727762). Whether GSTF11 and GSTF14 have similar roles remains to be determined, with GSTF11 implicated, through co-regulation studies, in glucosinolate metabolism (Hirai, 2009). Little is known about the function of GSTF14, though this protein has an unusual 30 amino acid residue C-terminal extension.
GSTF13 (At3g62760) is one of the few GSTs that could not be cloned from cDNA prepared from a range of Arabidopsis tissues (Dixon et al., 2009), suggesting very low transcript levels. Microarray data suggest that the gene shows very specific expression near the root tip. Unusually, the polypeptide sequence of GSTF13 is much more similar to monocot (and other dicot) GST sequences than it is to any other Arabidopsis GSTF, suggesting an ancient origin and a conserved function. GSTF13, along with GSTF12, has two cysteine residues near the N-terminus that may be functionally significant.
The plant-specific tau GSTUs are the most numerous GST class in Arabidopsis and also in other plants examined, with early classifications referring to these proteins as “Type III” GSTs (Droog et al., 1995). Gene clustering of tau GSTs is even more pronounced than for the phi GSTs, with all but 4 (GSTs U8, U9, U27, U28) of the 28 Arabidopsis genes found in clusters (Figure 1). GSTs U1 to U7 form the largest, 7-member cluster, found on chromosome 2. GSTUs group into three distinct clades (Figure 2); however, the functional significance of this remains unclear. Uniquely among the GST classes tested, almost all the Arabidopsis GSTUs were found to selectively bind fatty acid derivatives. The ligands identified corresponded to S-(fatty acyl) glutathione thioesters when the GSTUs were expressed in bacteria, and to various glutathionylated and oxidised fatty acids when they were expressed in plants (Dixon and Edwards, 2009). The specificity of binding observed strongly suggested a physiological role, perhaps in intracellular transport of fatty acid derived reactive molecules such as oxylipins (Dixon and Edwards, 2009). Many tau GSTs have been identified as auxin-responsive genes, and this is true for some, but by no means all, Arabidopsis GSTUs. Induction by auxin means that this subset of GSTs is particularly abundant in actively growing tissue, for reasons that remain unknown.
The first clade comprises GSTs U1 to U10. All these GSTs are mainly root-expressed, although GSTs U3 and U4 transcripts have a wider distribution (Dixon et al., 2010). GFP fusions of GSTs U2, U7 and U9 all localised to the cytosol (Dixon et al., 2009) and in the absence of any overt signal peptides, it is assumed the remaining GSTs in this clade are also found in the cytosol. Little is known about the clustered GSTs U1 to U7 (At2g29490 to At2g29420). All show some conjugating activity with CDNB, with GSTU6 and GSTU7 showing strong binding to C12-C16 chain length fatty acid-glutathione thioesters, with the dissociation constant for GSTU7 binding to S-myristoylglutathione measured at 900 nM (Dixon et al., 2009). GSTU5 and GSTU7 are expressed as active enzymes in planta, having been identified in some of the proteomic studies referenced above. GSTU5 was originally identified as AT103-1a, an auxin-induced gene (van der Kop et al., 1996), while GSTU7 transcripts are responsive to a variety of stress treatments, as judged from the publicly available microarray data. From microarray data, GSTU9 (At5g62480) is present in mature seeds, and its expression in roots is induced following exposure to saline conditions or the plant hormone abscisic acid. GSTU10 (At1g74590) is also salt-inducible, with its transcripts accumulating in senescing tissues. Both GSTU9 and GSTU10 bind high levels of long chain fatty acyl glutathione in E. coli, and both bind free C18 fatty acid derivatives such as divinylethers in plants (Dixon and Edwards, 2009).
The second tau clade comprises GSTs U11 to U18. GSTU11 (At1g69930) was inactive towards all GST substrates tested (Dixon et al., 2009), despite retaining the active-site serine. GSTU12 (At1g69920) is unusual in being targeted to the nucleus, presumably due to its N-terminal extension, containing the putative nuclear localisation signal oligopeptide KKRKK (Dixon et al., 2009). When expressed in E. coli, GSTU12 was co-purified with ribosomal proteins and so was likely to bind RNA, presumably as a consequence of its basic N-terminal extension (Dixon et al., 2009). GSTU14 (At1g27140) is very similar to GSTU13 (At1g27130) but possesses only very low GST activity (Dixon et al., 2009). Despite this sequence similarity, GSTU14 lacks an active site serine (although there are two serine residues close by) and this explains the poor observed activity. GSTs U15 to U18 (At1g59670, At1g59700, At1g10370 and At1g10360) are uncharacterised to date.
The third tau clade comprises GSTs U19 to U28, of which GSTU19 (At1g78380) is by far the best studied. First cloned as the protein GST8 induced by drought (Bianchi et al., 2002), GSTU19 was subsequently characterised as a major protein whose expression was enhanced in Arabidopsis cultures following exposure to herbicide safeners (DeRidder et al., 2002). GSTU19 has subsequently been identified as an abundant transcript and protein in numerous stress studies, with its major contribution to CDNB conjugating activity readily determined in Arabidopsis plants (Dixon et al., 2009). Using GSTU19 promoter:GFP fusions, its induction by chemicals was shown to be far more pronounced in roots than in shoots (DeRidder and Goldsbrough, 2006). Partial and complete GSTU19 insertional knock-out lines have since been generated and validated at the proteome level (Ülker et al., 2008), but despite the abundance of this GST in wild-type tissues, no overt phenotype has been reported in knock-out plants. At the biochemical level when expressed in plants, GSTU19 binds biologically active ligands including 2-S-glutathionylchlorogenic acid, and a range of glutathionylated fatty acid derivatives (Dixon and Edwards, 2009).
Using yeast two-hybrid screening, the N-terminal domain of GSTU20 (At1g78370) was shown to interact with the far-red insensitive 219 protein (Chen et al., 2007), which is an auxin-induced, jasmonate-conjugating enzyme linked to phytochrome signalling. Both over-expression and knock-down of GSTU20 resulted in a similar hyposensitivity to continuous far-red light. In young plants, the GSTU20 promoter directed expression in the vasculature of cotyledons while in older plants expression was found in auxin-producing areas (Chen et al., 2007). Microarray experiments show generally low levels of GSTU20 transcript, which is in contrast to the high levels of the respective polypeptides determined in proteomic experiments (Dixon and Edwards, 2009). Despite these interesting associations, the biochemical function of GSTU20 remains elusive. Given that GSTUs have an affinity for fatty acid derived compounds, it is feasible that GSTU20 binds jasmonic acid or a derivative, stabilising it and/or transporting it within the cell. One possibility is that GSTU20 prevents epimerisation of the active (+)-7-iso-jasmonoyl-L-isoleucine to the inactive (-)-jasmonoyl-L-isoleucine. GSTU20 is also co-regulated with aliphatic glucosinolate metabolic genes (Hirai, 2009), suggesting an alternative functionality. Little is known about GSTs U21–U23 (At1g78360, At1g78340 and At1g78320). GSTU24 (At1g17170) transcripts were strongly induced by treatment with a variety of xenobiotics including the explosive TNT (Mezzari et al., 2005), although plants lacking this enzyme showed similar responses to wild-type on TNT treatment (Yoon et al., 2007). GSTs U25 (At1g17180) and U28 (At1g53680) both have very high activity towards CDNB. GSTU25 also has very high GPOX activity towards the synthetic substrate cumene hydroperoxide, although unusually this activity does not extend to lipid hydroperoxides (Dixon et al., 2009).
Both GSTU25 and GSTU28 bound shorter chain length fatty acid-glutathione adducts, but interestingly GSTU25 specifically bound hydroxylated fatty acids when expressed in E. coli, while GSTU28 only bound non-hydroxylated fatty acids. In tobacco, both proteins bound a range of glutathione-modified fatty acid-derived compounds (Dixon and Edwards, 2009). GSTU26 (At1g17190) was shown to be inducible by chemical treatment, and has also been assayed for activity towards xenobiotic substrates (Nutricati et al., 2006). Similarly, GSTU27 (At3g43800) is also responsive to chemical treatments, with its expression induced by salicylic acid treatments as determined in proteomic studies (Gruhler et al., 2005).
Arabidopsis has a single enigmatic member (At1g77290) of an unusual family of GSTs, named tetrachlorohydroquinone dehalogenases (TCHQDs) on the basis of their sequence homology to the prokaryotic enzymes that show this unusual ability to metabolise chlorinated xenobiotics. Closely related sequences are also present in other plants, both monocots and dicots. Apart from a recent study showing localisation of GFP-tagged TCHQD to the plasma membrane (Dixon et al., 2009), very little is known about this protein. Its active site bears the conserved serine residue, suggesting that it could catalyse standard GST reactions. The protein in Arabidopsis does differ from other related plant sequences in being significantly larger, due mainly to an insertion of approximately 25 amino acid residues in the middle of the protein and a further C-terminal extension.
In addition to the soluble GSTs, plants have, like other eukaryotes and prokaryotes, members of a second phylogenetically unrelated GST family known as the membrane associated proteins in eicosanoid and glutathione metabolism (MAPEG) (Jakobsson et al., 1999). In mammals these enzymes form membrane-bound trimers that catalyse glutathione-dependent transferase and peroxidase reactions on hydrophobic substrates such as leukotrienes. Arabidopsis has a single MAPEG-like protein (At1g65820), which has low activity toward CDNB and based on sequence similarity clusters with mammalian MGST3 proteins (Bresell et al., 2005).
Despite the numerous associations of GSTs with stress responses, plant development and metabolism, the functions of these enzymes in Arabidopsis and other plants remain elusive. That being said, more progress has been made in studying GSTs in the last 10 years in Arabidopsis than had been reported in the previous 30 years using a range of crops and weeds. The adoption of Arabidopsis as a model system to study this super-family, with all its associated genomic resources, has proven to be a major step in accelerating our knowledge of the organization of these proteins. For example, the propensity of GSTs to undergo gene duplication and mask each other's functions due to redundancy has only been made apparent through the work in Arabidopsis. With respect to studying these GSTs, the continuing challenge will be their functional characterization in planta. As our phenomic and metabolomic screens improve in their coverage and sophistication it will be increasingly likely that subtle differences will be identified in knock-out plants, as shown by the recent RNAi studies targeting multiple GSTs (Sappl et al., 2009). Similarly, as our ability to interrogate large data sets improves, bioinformatic analysis of DNA and proteomic data will provide us with new useful associations between specific GSTs and co-regulated cellular events (Dixon et al., 2010). Once ‘systems based’ hypotheses as to GST function have been established, these in turn can be usefully tested by focussing on specific groups of metabolic pathways and metabolites, for example by using tagged GSTs to capture and pull down ligands in planta for identification by mass spectrometry (Dixon and Edwards, 2009). There is therefore increasing scope to address the enigmatic functions of GSTs by making better use of resources that we already have available to us.
Another interesting area of study in which Arabidopsis may be very useful is determining how the GSTs have diversified in function while retaining a conserved structure and active site chemistry. In terms of the gene family size and knowledge of their protein chemistry, GSTs are well suited to a study of their evolutionary biochemistry, and of how a versatile protein platform can be adapted to support multiple useful activities which can then undergo selective fine tuning over generations. In turn, such studies could also help answer major questions in crop science relating to the very high titre and diversity of GSTs in our major domesticated crops. What is the advantage to plants, over many generations, of having multiple copies of proteins which apparently have duplicated functions? In this regard there is some real scope to use our knowledge of GSTs developed in Arabidopsis in understanding the roles of GSTs in crop domestication.