Polycomb group (PcG) and trithorax group (trxG) proteins are key regulators of homeotic genes and have crucial roles in cell proliferation, growth and development. PcG and trxG proteins form higher order protein complexes that contain SET domain proteins, with a histone methyltransferase (HMTase) activity, responsible for the different types of lysine methylation at the N-terminal tails of the core histone proteins. In recent years, genetic studies along with biochemical and cell biological analyses in Arabidopsis have enabled researchers to begin to understand how PcG and trxG proteins are recruited to chromatin and how they regulate their target genes and to elucidate their functions. This review focuses on the advances in our understanding of the biological roles of PcG and trxG proteins, their molecular mechanisms of action and further examines the role of histone marks in PcG and trxG regulation in Arabidopsis.
INTRODUCTION
In eukaryotic organisms the genetic information encoded by the DNA is compacted into chromatin. The basic unit of chromatin, the nucleosome, is formed by the wrapping of 147 bp of DNA around a histone octamer (two copies each of histones H2A, H2B, H3, and H4). The position of the nucleosomes along the DNA depends on factors like the DNA sequence and the presence or nature of other proteins bound to the DNA. Nucleosomes thus present a barrier for proteins that need to contact the DNA. This arrangement of nucleosomes on DNA is dynamic, and changes occur rapidly according to the needs of the cell and in response to endogenous and exogenous signals. Within the context of chromatin, it has been postulated that processes involved in regulating gene expression include histone post-translational modifications, variation of nucleosome composition (histone replacement and histone variants), nucleosome positioning by ATP-dependent chromatin-remodeling complexes, and cytosine methylation of DNA (Pfluger and Wagner 2007). Accordingly, covalent modifications of histone proteins are fundamental for regulation of gene activity. For example, histone modifications regulate cellular events like gene expression (Muramoto et al, 2010; Karlic et al, 2010), gene silencing (Jackson et al, 2002; Tamaru et al, 2003; Jackson et al, 2004), DNA repair (Fernandez-Capetillo et al, 2004; Sanders et al, 2004), and chromatin condensation (Houben et al, 2005). The N-terminal tails of histones are subject to different combinations of modifications, for example methylation, acetylation, phosphorylation, ubiquitination, and sumoylation. These modifications in turn change the histone-DNA interactions and create or block protein-binding sites. For example, acetyl-lysine has been shown to associate with bromodomains: acetylated H3 stabilizes binding of the histone acetyltransferase GCN5 through its bromodomain (Dhalluin et al, 1999).
Similarly, lysine methylation provides an important switch for the binding of proteins containing chromodomains and Tudor domains. The initial observation was that methyl H3K9 associates with the chromodomain of heterochromatin-like protein 1 (HP1) to promote its binding to heterochromatin (Bannister et al, 2001; Lachner et al, 2001). Whereas DNA methylation, particularly in 5′ control regions, is generally associated with transcriptional repression and/or silencing of genes, histone modifications are associated with both activation and silencing of gene expression (Saze, 2008). Importantly, histone modifications can be reversed by specific enzymes such as histone deacetylases (HDAC; Chen and Tian, 2007), deubiquitinases (Sridhar et al, 2007), phosphatases, and histone demethylases, for example those of the lysine-specific demethylase 1 (KDM1/LSD1) family (Shi et al, 2004) and the Jumonji C domain-containing protein family (Tsukada et al, 2006). On the other hand, ATP-dependent chromatin-remodelling complexes use the energy derived from ATP hydrolysis to modify the position or composition of nucleosomes (Lu et al, 2009; Mousson et al., 2007). Finally, methylation of cytosine residues in the DNA can alter gene expression profiles by influencing the binding affinities of transcription factors or other proteins to DNA (Johnson et al, 2007; Zilberman et al, 2007; Cheng and Blumenthal, 2010 and ref. therein).
In this chapter I will discuss the different covalent histone modifications and their role in regulating gene expression. The type and degree of these modifications affect the configuration of the chromatin by creating “open” or “closed” conformations, which in turn activate or repress gene transcription, respectively. I will focus on the available information related to Polycomb-Group (PcG) and trithorax-Group (trxG) protein complexes and their targets in plants, with emphasis on Arabidopsis thaliana. Finally, I will examine the regulation of gene expression by PcG and trxG proteins and their counteracting activities in plants and metazoans.
Epigenetic Marks: Core Histone Modifications and Epigenetic Memory
In 1994, Robin Holliday broadly re-defined epigenetics as “the study of the changes in gene expression which occur in organisms with differentiated cells, and the mitotic inheritance of given patterns of gene expression” (Holliday, 1994; Holliday, 2006). However, it has been suggested that the term ‘heritable’ be omitted from the definition (Bird, 2007), and epigenetics has also been referred to as “the information carried by the genome that is not coded by the DNA” (Kouzarides, 2007), or the “sum of the alterations to the chromatin template that collectively establish and propagate different patterns of gene expression (transcription) and silencing from the same genome” (Allis et al., 2007). Nevertheless, the most popular definition of epigenetics refers to changes in gene expression that do not involve changes in the DNA sequence but that are inherited even in the absence of the signal that initiated the change (Berger et al, 2009). Epigenetic mechanisms regulate a wide range of processes including development, cell differentiation, DNA repair, senescence, disease and cancer (Kouzarides 2007; Surani et al., 2007). Epigenetic phenomena like X-chromosome inactivation, genomic imprinting, centromere function and gene silencing depend on the establishment and maintenance of specific chromatin structures, which are defined by DNA methylation, histone post-translational modifications (PTMs), histone variants, and chromatin-binding proteins (Groth et al, 2007; Bernstein et al., 2007; Kouzarides, 2007; Schuettengruber et al., 2007). In particular, the functional consequences of histone PTMs can be direct, causing structural changes to chromatin, or indirect, acting through the recruitment of effector proteins.
Histone residues are key substrates in histone biochemistry, undergoing modifications including acetylation, methylation, ubiquitylation, phosphorylation and SUMOylation. Histone lysine residues methylated in vivo include H3K4, H3K9, H3K27, H3K36, H3K79, H4K20 and H1K26. The first H3K4 (histone H3 lysine 4) methylase, Set1/COMPASS, was isolated from Saccharomyces cerevisiae and was demonstrated to be capable of mono-, di-, and trimethylating H3K4 (Miller et al., 2001; Roguev et al., 2001). Thus, methylation can occur several times on one lysine side chain and each level of modification may have different biological outcomes. In yeast, histone H2B mono-ubiquitination by Rad6/Bre1 is required for the trimethylation of both histone H3K4 by COMPASS and H3K79 by Dot1 methyltransferases, a process known as histone cross talk (Dover et al., 2002; Sun and Allis, 2002; Shilatifard, 2006; Nakanishi et al, 2006). In Arabidopsis, relatives of yeast Rad6 and Bre1 (UBIQUITIN-CONJUGATING ENZYME 1 and 2, UBC1 and UBC2; HISTONE MONOUBIQUITINATION 1 and 2, HUB1 and HUB2, respectively) mediate histone H2B monoubiquitination and up-regulate the expression of FLOWERING LOCUS C, and in this way repress the developmental transition from the vegetative to the reproductive phase, or flowering (Gu et al, 2009). Lysine acetylation activates gene expression, whereas SUMOylation seems to be repressive, and these two types of modification may mutually interfere (Iñiguez-Lluhí, 2006). By contrast, methylation and ubiquitylation have variable effects, depending on the residues being modified and their contexts.
For example, trimethylation of lysine 4 in histone H3 (H3K4me3) accumulates predominantly when genes become induced or on genes poised for transcription later in development (Saleh et al, 2007; Zhang et al, 2009), whereas H3K9me3, in mouse embryonic fibroblasts, is almost exclusively enriched at pericentric heterochromatin, while H3K9 mono- and dimethylation are enriched at more euchromatic regions (Rice et al, 2003). In contrast, in Arabidopsis the H3K9me3 mark is either almost undetectable (Jackson et al, 2004) or is found throughout euchromatin (Bernatavichute et al, 2008; Fuchs et al, 2006), while the H3K9me2 mark is highly enriched in the pericentromeric heterochromatin where transposons and other repeats cluster (Bernatavichute et al, 2008). On the other hand, two ubiquitylation sites in the C-termini of H2B and H2A correlate with active and repressed transcription, respectively (Berger, 2007 and ref. therein). Arginine residues on histones H3 and H4 can also be methylated (mono- or di-methylated), and arginine dimethylation can be symmetrical or asymmetrical. Arginine methylation seems to be involved primarily in activating transcription (Litt et al, 2009 and ref. therein). However, asymmetric dimethylation of histone H3 arginine 2 (H3R2me2a) in yeast, for example, is associated with both heterochromatin and euchromatin, is necessary for heterochromatic silencing, and acts as a negative regulator of H3K4 trimethylation (Kirmizis et al., 2007). Histone phosphorylation (ph) on serine and threonine residues is also a PTM involved in transcription. For example, histone H3 phosphorylation at serine 10 (H3S10ph) is required for chromatin condensation during mitosis and is also associated with the transcriptional activation that follows external stimuli like mitogens and stress. It is still not clear what determines whether H3S10ph is associated with “active” or “inactive” chromatin.
It is assumed that, in part, other histone modifications together with H3S10ph elicit the correct biological outcome (Mahadevan et al, 1991; Dong and Bode 2006). Moreover, it is likely that similar mechanisms will be operational at other sites of histone phosphorylation. On the other hand, phosphorylation of histone H3 at threonine 45 (H3T45ph) has been shown to increase dramatically in apoptotic HL60 cells and purified human neutrophils. It has been suggested that phosphorylation of this residue, with the resulting introduction of negative charge close to the DNA, would affect the binding energies of the nucleosome, and that the altered nucleosomal/chromatin structure possibly favors DNA processing in late apoptosis, such as DNA nicking and/or fragmentation (Hurd et al., 2009).
All histone PTMs are removable. Histone deacetylases (HDACs) catalyze the removal of acetyl groups from the ε-amine of acetyl-lysine residues within histones, and Ser/Thr phosphatases remove phosphate groups. Eukaryotic HDACs have been grouped into three classes based on their homology to three yeast HDACs: REDUCED POTASSIUM DEPENDENCY3 (RPD3), HISTONE DEACETYLASE1 (HDA1), and SIRTUIN2 (Pandey et al., 2002; Yang and Seto, 2003). In addition, plants contain another class of HDACs, the HD2 class (Lusser et al., 1997; Aravind and Koonin, 1998; Wu et al., 2000a, 2003; Dangl et al., 2001; Zhou et al., 2004). Ubiquitin proteases remove mono-ubiquitin from H2B. Arginine methylation is altered by deiminases, which convert the side chain to citrulline. Two families of lysine demethylases have recently been identified: the LSD1/KDM1 family, which removes H3K4me1 and H3K4me2 with concomitant formation of formaldehyde, and the Jumonji histone demethylase (JHDM) family, which removes H3K4me2 and H3K4me3, H3K9me2 and H3K9me3, and H3K36me2 and H3K36me3 (Smith and Denu 2009, and ref. therein; Berger, 2007).
Additionally, inheritance of epigenetic memory is accompanied by the replacement of histone variants. The H2A, H2B, H1 and H3 histone families contain variants, and each of these is thought to have specific properties and functions. These variants are expressed during the whole cell cycle and can be assembled into nucleosomes in differentiated cells, expanding in this way the cell's PTM profile. The synthesis and deposition of some of these histone variants, e.g., H2A.Z and H3.3, are not restricted to the S phase of the cell cycle, in contrast to the canonical histones, and consequently they go through a replication-independent assembly. In the case of the histone H3 family, five histone H3 variants have been described in mammals: H3.1, H3.2, H3.3, H3.1t and CENP-A (Bernstein and Hake, 2006). In Arabidopsis thaliana there are two predominant histone H3 variants in the genome in addition to the centromeric histone CenH3: H3.1 (five copies) and H3.2 (three copies) (Johnson et al, 2004; Waterborg, 1992), in addition to the five H3.3-like genes reported (Okada et al, 2005). H3.1 is similar to the S phase-dependent variant found in animals and H3.2 is similar to the replacement histone H3.3 expressed throughout the cell cycle (Johnson et al, 2004). Additionally, the replication-dependent histone H3 (H3.1) has been shown to be enriched in modifications associated with gene silencing, while the replication-independent histone H3 (H3.2) has a lower abundance of the silencing modifications and a higher abundance of methylation at K36 (Johnson et al, 2004). The amino acid substitutions that distinguish the plant histone variants H3.1 and H3.2 are at positions 31, 41, 87, and 90, and this differential amino acid composition has suggested that H3 variants arose independently in plants and animals (Wu et al, 2009; Ingouff and Berger, 2010 and ref. therein).
Hake and Allis (2006) have proposed that histone H3 variants may serve as epigenetic labels throughout the genome to mark different functional domains (e.g. euchromatin, facultative heterochromatin, and constitutive heterochromatin).
Epigenetic Memory
During chromatin replication key steps must be coordinated to transmit faithfully both genetic and epigenetic information to cell progeny. Failure to synchronize DNA replication with maintenance of chromatin organization can lead to developmental defects by putting at risk genetic and/or epigenetic integrity (Jasencakova and Groth, 2009). The chemical modifications to histone proteins (PTMs) and DNA (cytosine methylation) provide heritable epigenetic information not coded by the nucleotide sequence. Specifically, the particular states that define cell identity are achieved through heritable instructions: epigenetic marks that determine whether, when and how particular genetic information will be read. Once established, these marks are transmitted from mother to daughter cell and likely from generation to generation (Probst et al, 2009; Muramoto et al, 2010).
Histone PTMs and DNA methylation are systems able to activate or repress transcription in a heritable manner, and they appear to be involved in maintaining established states rather than in fully silencing expressed genes or activating completely silent genes (Bird, 2002). In eukaryotic genomes, patterns of cytosine methylation are inherited from cell to cell through the action of maintenance methyltransferase enzymes on symmetrical CG dinucleotide pairs or CNG regions (Bird, 2002; Goll and Bestor, 2005). Furthermore, a subset of histone modifications also appears to show epigenetic inheritance. For example, in yeast (which lack DNA methylation), interactions between hypoacetylated histones and SIR proteins (S. cerevisiae) or between H3K9 methylated histones and the Swi6 chromodomain (S. pombe) maintain the heterochromatic state through cell division (Grewal and Moazed, 2003; Bernstein et al., 2007). In Arabidopsis, DNA methylation is present in three DNA sequence contexts: CG, CHG and the asymmetrical context CHH (where H = A, T, or C) (Bernatavichute et al, 2009). This provides an increased combinatorial power to the ‘DNA methylation code’ (Vaillant and Paszkowski, 2007). Although not all histone-based information must necessarily be maintained during replication, certain modifications and histone variants should be transmitted to both DNA daughter strands and may serve as a template for newly synthesized histones. This is a prerequisite if histones are to have an epigenetic function, and it indicates that histone recycling is also a key process for genome function (Groth, 2009). Thus, the fundamental question in restoring the epigenetic framework is how nucleosomes are formed on nascent DNA from old and newly synthesized histones.
In each cell cycle, the genetic and epigenetic information is challenged during DNA replication. Because of the genome-wide alterations in chromatin structure that occur during replication, the S phase has also been considered a distinctive phase in which cells can modify their chromatin structures and influence gene expression patterns and, as a result, cell fate (Corpet and Almouzni, 2009). Furthermore, when DNA replicates, chromatin goes through a series of disruptions and subsequent restorations in the wake of the replication fork. While lineage preservation requires the maintenance of epigenetic marks, DNA replication also provides the opportunity for changes in epigenetic states to occur during cell differentiation and development. Thus, complicated mechanisms have evolved to ensure stability through the transmission of genetic and epigenetic information at the replication fork, and to ensure the plasticity that allows the desired switches during development. When taking into consideration epigenetic marks, in addition to DNA duplication, it is important to assess how DNA methylation, histone deposition and histone marks are associated with the replication machinery (Probst et al, 2009).
Most of the known processes involved in the duplication of epigenetic marks relate to the silent modifications that establish heterochromatic structures. DNA replication occurs in an asymmetric manner: only one of the two templates can be replicated continuously as the replication fork moves (the leading strand), while replication of the other (the lagging strand) is discontinuous. To coordinate the replication of both DNA strands, multiple DNA polymerases function at the replication fork, assisted by proliferating cell nuclear antigen (PCNA), a ring-shaped homotrimeric protein that serves as a processivity factor for the DNA polymerases (Shibahara and Stillman, 1999). PCNA is loaded around DNA by the conserved chaperone-like complex Replication Factor C (RFC). The formation of a stable PCNA-RFC complex, and its loading onto primed DNA, requires ATP binding. DNA binding in turn activates the ATP hydrolysis activity of RFC, leading to its dissociation from the loaded clamp. Then, the PCNA ring, which encircles DNA, tethers polymerases to DNA, making the sliding clamp an essential cofactor for DNA synthesis (see Figure 1; Moldovan et al., 2007). Thus, PCNA provides a link between the two DNA strands and might also link DNA synthesis and the inheritance of epigenetic marks. Establishment of PTMs on new histones in a replication-coupled fashion would ensure the rapid restoration of domains. However, as PCNA is present at all replication forks, there must be additional levels of regulation to guarantee specificity. Perhaps PCNA, through highly dynamic interactions (e.g. with factors involved in nucleosome assembly, histone deacetylation, DNA methylation, nucleosome remodelling, and histone methylation), creates a high local concentration of factors, making them available for recruitment to specific binding sites in nascent chromatin (e.g. PTMs on parental histones) (Jasencakova and Groth, 2009).
However, it is not known how active chromatin can be inherited during replication. It remains unclear whether active marks are duplicated on new nucleosomes or simply kept on parental ones, where they would provide sufficient active marks to maintain a permissive state for transcription (Corpet and Almouzni, 2009 and ref. therein).
Nucleosome assembly and disassembly are carried out by partially redundant pathways involving the conserved H3/H4 histone chaperones CAF-1 or the HIR complex and their common cofactor Asf1, in addition to chromatin remodeling complexes, which modify chromatin structure during transcription, and various enzymes that add and remove histone modifications (Mousson et al., 2007). Bulk histones are incorporated into chromatin in a replication-coupled (RC) manner by the CAF-1 complex (H3/H4) and NUCLEOSOME ASSEMBLY PROTEIN-1 (NAP-1) (H2A/H2B) proteins. The heterotrimeric complex CAF-1 (Cac1, Cac2, and Cac3) delivers histones H3 and H4 to replicating DNA during the S phase but also takes part in chromosomal replication-independent chromatin assembly (Smith and Stillman, 1989), whereas HIR (a complex of Hir1, Hir2, Hir3, and Hpc2 in yeast) is involved in H3/H4 deposition in the replication-independent pathway (Green et al, 2005). Targeting of CAF-1 to sites of DNA synthesis requires its direct interaction with PCNA, thereby physically linking histone deposition activity to the replication fork (Moldovan et al, 2007). Together with CAF-1, a key chaperone in this process is anti-silencing function 1 (Asf1), which facilitates chromatin assembly that is linked to DNA synthesis in vitro. Asf1 in complex with H3/H4 can act as a histone donor and synergize with CAF-1. Formation of such a ternary complex (CAF-1/Asf1/H3/H4) seems to be an intermediate step, enabling histones to be handed over from one chaperone to the next. Such histone transfer from Asf1 to CAF-1 would ensure efficient histone deposition coupled to DNA replication (Corpet and Almouzni, 2009; see Figure 1). On the other hand, histone variants (e.g. H2A.Z, H3.2) are placed into chromatin by specific histone deposition complexes, which have been identified from yeast to humans and are also present in plants.
For example, the histone H2A.Z is incorporated into chromatin by a multi-subunit complex termed SWR1 in yeast and SWR1-like complex in plants (March-Diaz et al., 2007). Studies of genetic and physical interactions suggest that the Arabidopsis PHOTOPERIOD-INDEPENDENT EARLY FLOWERING 1 (PIE1), ACTIN-RELATED PROTEIN 6 (ARP6) and SERRATED LEAVES AND EARLY FLOWERING (SEF) proteins form a complex (SWR1-like complex) involved in histone variant deposition that is related to the yeast SWR1 complex (March-Diaz et al., 2008). The H3.2 histone variant, in turn, is placed into chromatin by the histone chaperone Histone Regulation A (HIRA) (Loyola and Almouzni, 2007).
New histones carry distinct PTMs and are involved in the establishment of specific domains during chromatin restoration. So far, the process by which new histones acquire the appropriate PTM profile of the loci where they will be incorporated is not entirely understood. However, several mechanisms have been proposed, for example: (i) cross-talk with other marks, i.e. DNA methylation or PTMs on other histone subtypes, (ii) spatio-temporal regulation during replication, and (iii) the use of marks on parental histones as a blueprint (Jasencakova and Groth, 2009).
During cell division histones segregate randomly and each daughter chromosome inherits some modified histones. This modification state could spread locally to newly deposited histones. In fact, various protein complexes in chromatin have complementary binding and modifying activities and may consequently contribute to the epigenetic maintenance of histone modification patterns (Groth et al., 2007). In addition, recent evidence supports the heritability of specific histone modifications in multicellular organisms (Muramoto et al, 2010). In particular, H3K27 and H3K4 methylation are catalyzed by Polycomb-group (PcG) and trithorax-group (trxG) protein complexes, which mediate mitotic inheritance of lineage-specific gene expression patterns (Ringrose and Paro, 2004; Schuettengruber et al., 2007; Muramoto et al, 2010). It has been shown that PcG protein complexes of the PRC1 class remain bound to chromatin and DNA during replication, and that this retention of Polycomb proteins through DNA replication may contribute to maintenance of transcriptional silencing through cell division (Francis et al, 2009). Thus, as has been proposed, a physical interaction between PcG complexes and methylated histones retained within the chromatin may direct them back to their target sites after cell division (Bernstein et al., 2007).
A common theme is that PcG complex binding and the associated H3K27 methylation often extend over extensive genomic regions (Lee et al., 2006). These repressive domains are comparable in size to the activating domains of H3K4 methylation in animal HOX clusters (Bernstein et al., 2005). Most notably, these activating domains are also occupied by the trxG protein MLL1 (Guenther et al., 2005). Thus, chromatin domains could theoretically provide a robust epigenetic memory to maintain expression or repression of critical cell type-specific genes. While dispersed modification sites of just a few adjacent histones could be lost during mitosis, when histones segregate randomly to the daughter strands, large domains with significant numbers of modified histones would be inherited by both daughter strands and could promote similar modification of newly deposited histones. All this supports a central role for chromatin domains with PcG or trxG proteins in the epigenetic control of developmental regulator genes (Henikoff et al., 2004; Bernstein et al., 2007).
PcG AND trxG PROTEIN COMPLEXES
Two main epigenetic systems studied in a variety of organisms are involved in the molecular basis of “heritable” epigenetics, because alterations in these systems are often inherited by subsequent generations of cells and occasionally organisms: the DNA methylation system and the Polycomb/Trithorax systems (Bird, 2007). DNA methylation typically occurs in a CpG dinucleotide context in adult somatic tissue, and is associated with stable gene silencing, through interference with transcription-factor binding or through the recruitment of repressors that specifically bind methylated CG. In plants, cytosines can be methylated both symmetrically (CpG or CpNpG) and asymmetrically (CpNpN) (Cao and Jacobsen, 2002).
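The sequence contexts for plant cytosine methylation described above (CG, CHG and CHH, where H = A, T, or C) can be made concrete with a short illustrative sketch. The function names `cytosine_context` and `list_contexts` are hypothetical and introduced only for this example; the sketch inspects a single strand read 5′ to 3′ and is not taken from any published analysis pipeline.

```python
from typing import List, Optional, Tuple

H_BASES = "ACT"  # H = A, C or T (any base except G)


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at index i.

    Returns None if position i is not a cytosine, or if the downstream
    bases are missing or ambiguous (e.g. 'N'). Only the given strand
    is inspected; the complementary strand is ignored (assumption).
    """
    seq = seq.upper()
    if i < 0 or i >= len(seq) or seq[i] != "C":
        return None
    downstream = seq[i + 1 : i + 3]  # up to two bases 3' of the cytosine
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"  # symmetrical context
    if len(downstream) == 2 and downstream[0] in H_BASES:
        if downstream[1] == "G":
            return "CHG"  # symmetrical context
        if downstream[1] in H_BASES:
            return "CHH"  # asymmetrical context
    return None


def list_contexts(seq: str) -> List[Tuple[int, str]]:
    """List (index, context) for every classifiable cytosine in seq."""
    result = []
    for i in range(len(seq)):
        ctx = cytosine_context(seq, i)
        if ctx is not None:
            result.append((i, ctx))
    return result


if __name__ == "__main__":
    print(list_contexts("ACGTCAGCTTC"))
```

In the toy sequence used here, the final cytosine has no downstream bases and is therefore left unclassified, which reflects the fact that the context depends on the two nucleotides that follow the cytosine.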
On the other hand, in Drosophila, the Polycomb group (PcG) proteins maintain a repressive state of homeotic gene (HOX gene) expression, while the trithorax group (trxG) proteins maintain HOX gene activity (Simon and Tamkun, 2002). Specifically, the PcG has been defined as a group of genes whose individual mutation results in phenotypes similar to those of Polycomb (Pc) mutants (mutations that lead to the global transformation of embryonic segments into the posterior-most segment), or which can enhance the phenotypes of Pc mutant alleles, whereas trxG genes were defined by their ability to counteract the activity of PcG genes in homeotic gene regulation (trxG mutations lead to the ectopic repression of homeotic genes) (Grimaud et al, 2006). The molecular analysis of PcG and trxG genes has revealed that their products act as large multimeric complexes at the level of chromatin structure. Biochemical evidence for the existence of PcG complexes was obtained by Franke et al. (1992), who showed (i) that PcG proteins in Drosophila bind polytene chromosomes in an overlapping pattern, suggesting that they interact at PcG target sites, and (ii) by co-immunoprecipitation experiments, an interaction between three proteins: Polycomb (PC), Polyhomeotic (PH) and Posterior sex combs (PSC). This complex was later purified from Drosophila embryos and named “POLYCOMB REPRESSIVE COMPLEX 1” (PRC1; Shao et al., 1999). In addition to a minimal core containing the proteins PC, PH, PSC, and dRING (also referred to as PCC, Polycomb core complex), PRC1 contains additional proteins such as Sex combs on midleg (SCM), ZESTE, and general transcription factors (GTFs). The Drosophila E(z) protein is the catalytic subunit of a second PcG complex, POLYCOMB REPRESSIVE COMPLEX 2 (PRC2), which mediates H3K27me3 (histone H3 lysine 27 tri-methylation) and also contains EXTRA SEX COMBS (ESC), p55 and SUPPRESSOR OF ZESTE 12 (Su[z]12) (Lafos et al., 2009).
In mammals, two related complexes, PRC3 and PRC4, have been purified that differ mainly in the isoform of EED (the mammalian ESC homolog) (Kuzmichev et al., 2002, 2004). The characterization of their biochemical functions has shown that PcG and trxG genes are significantly involved in epigenetic phenomena, in particular the acquisition of specific histone marks on their target genes (Grimaud et al, 2006). They are not required for initiation of gene repression but maintain repression/activation states during development (see Figure 2).
Genetic data suggest that trxG proteins also physically interact. At least three trxG multiprotein complexes have been identified from Drosophila embryonic extracts. The analysis of one of these complexes, called BRM, revealed the presence of the trxG proteins brahma (BRM; Dingwall et al., 1995) and moira (MOR) (Crosby et al., 1999), whereas the majority of BRM-associated proteins are not encoded by trxG genes. The two additional trxG complexes were characterized by the presence of other trxG proteins, such as absent, small or homeotic discs-1 (ASH1) and ASH2 (Papoulas et al., 1998; reviewed in Breiling et al, 2007). The brahma gene shows similarities to yeast SWI2/SNF2, which functions as the ATPase subunit of the chromatin-remodeling complex SWI/SNF (Peterson and Tamkun, 1995). This class of complexes shifts nucleosomes along the DNA and helps activators and transcription factors reach their target sites. A “trithorax acetylation complex” (TAC1), containing trithorax (TRX) and the histone acetyltransferase (HAT) CBP/p300, has been identified in Drosophila and is required for maintenance of the homeotic Ubx gene (Petruk et al., 2001). Thus, trxG complexes are involved in the formation of an open chromatin structure more accessible to the transcription machinery by promoting active epigenetic modifications (e.g., methylation/acetylation of histone tails) at specific cis-regulatory elements and target promoters, and they facilitate transcription through their involvement in chromatin remodeling (see Figure 2; Breiling et al, 2007).
Several multimeric PcG and trxG protein complexes contain SET-domain proteins responsible for different types of lysine methylation at the N-terminal tails of the core histone proteins. In Drosophila these post-translational histone modifications control chromatin state and, as a result, regulate the accessibility of the transcription machinery to the HOX gene clusters and other target genes. The histone methyltransferase (HMTase) activity is conferred by the SET domain (for Su(var)3-9, E(z), Trithorax), encoded by the Drosophila melanogaster Su(var)3-9-, E(z)-, and Trithorax-related genes, which can catalyze histone lysine mono-, di-, or trimethylation of several lysines in histones H3 and H4 (Lachner et al., 2004). These lysine methylation states have been experimentally classified into repressive and activating marks, depending on their effect on gene expression. In general, methylated H3K9, H3K27 and H4K20 are considered marks for repressive chromatin structures, while methylated H3K4 and H3K36 are classified as activating marks (Fuchs et al., 2006). However, the degree of lysine methylation (mono-, di-, or tri-methylated lysine amino groups) has a distinct impact on the expression of diverse genes and may be interpreted differently among eukaryotes (Bernstein et al., 2002; Santos-Rosa et al., 2002; Ng et al., 2003; van Dijk et al., 2005; Kouzarides, 2007).
The Arabidopsis genome encodes 43 SET-domain proteins, named SDG1 to SDG43 in the Plant Chromatin Database (www.chromdb.org). Up until now, no trxG complexes have been isolated in plants. However, the discovery of activating histone marks associated with active states of gene expression, together with the regulation of floral homeotic genes by trxG homologues, provides arguments supporting the conservation of the trxG mechanism in plants.
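The general classification of lysine methylation marks given above can be summarized as a small lookup table. The following is a minimal sketch, assuming mark names written in the form used in this chapter (e.g. H3K27me3); the names `GENERAL_CLASS`, `parse_mark` and `general_class` are hypothetical. It deliberately ignores the degree of methylation and the organism, both of which, as noted above, can change how a mark is interpreted.

```python
import re
from typing import Optional, Tuple

# General classification by modified residue, as stated in the text
# (Fuchs et al., 2006). Residues not listed there are left unclassified.
GENERAL_CLASS = {
    "H3K9": "repressive",
    "H3K27": "repressive",
    "H4K20": "repressive",
    "H3K4": "activating",
    "H3K36": "activating",
}

# Histone name, lysine position, methylation degree (1, 2 or 3).
MARK_PATTERN = re.compile(r"(H[1-4][AB]?)(K\d+)me([123])")


def parse_mark(mark: str) -> Tuple[str, str, int]:
    """Split a name such as 'H3K27me3' into ('H3', 'K27', 3)."""
    match = MARK_PATTERN.fullmatch(mark)
    if match is None:
        raise ValueError(f"not a lysine methylation mark: {mark!r}")
    return match.group(1), match.group(2), int(match.group(3))


def general_class(mark: str) -> Optional[str]:
    """Return 'repressive', 'activating', or None if unclassified."""
    histone, residue, _degree = parse_mark(mark)  # degree is ignored here
    return GENERAL_CLASS.get(histone + residue)


if __name__ == "__main__":
    for name in ("H3K27me3", "H3K4me3", "H3K9me2", "H3K79me2"):
        print(name, general_class(name))
```

A table keyed only by residue is the simplest possible representation; a more faithful one would also key on methylation degree and species, since, for example, H3K9me3 behaves differently in mouse and in Arabidopsis.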
PcG Proteins and Their Targets in Plants.
In contrast to animals, organ development in plants is not restricted to the embryonic stage: the lateral organs (leaves), the reproductive organs (flowers), and the seeds originate from the same undifferentiated meristem that is active throughout the life cycle. Because differentiation and organogenesis in plants are not fixed during embryogenesis, it was not evident that PcG/TrxG functions would participate in plant developmental processes. However, the discovery that genes encoding PcG/TrxG homologs play roles in the development and survival strategies of Arabidopsis changed this view (Alvarez-Venegas et al, 2003). In plants, as in animals, development of the wrong organ in the wrong place (homeosis) is a consequence of a mutation in a homeotic gene. Unlike their animal counterparts (e.g., the HOX genes), however, plant homeotic genes are not clustered, and they belong to the MADS-box family of transcription factors (Avramova, 2009).
Several plant PcG genes, in particular PRC2 components, have been identified in forward genetic screens for mutations affecting flowering time, flower and seed development and the vernalization response (see Figure 3). The products of these genes show high sequence identity to animal PRC2 proteins, in particular to E(Z) and its orthologs and to Su(z)12 (Breiling et al., 2007). It is possible that genes for PcG proteins were present in the last common ancestor of plants and animals and were subsequently lost in unicellular lineages. This could explain the presence of only FERTILIZATION INDEPENDENT ENDOSPERM (FIE) but no CURLY LEAF (CLF, or SDG1) or EMBRYONIC FLOWER 2 (EMF2) homologs in the green alga Chlamydomonas reinhardtii. There is a tendency for the complexity of genes encoding PcG proteins to increase during evolution. For example, the moss and fern genomes mostly have single copies of the genes encoding PcG proteins, whereas seed plants usually have small gene families (Hennig and Derkacheva, 2009).
Arabidopsis thaliana has 12 homologs of Drosophila PRC2 subunits: the three E(z) homologs CLF, MEDEA (MEA, or SDG5) and SWINGER (SWN, or SDG10); the three Suppressor of zeste 12 (Su(z)12) homologs EMF2, FERTILIZATION INDEPENDENT SEED2 (FIS2) and VERNALIZATION2 (VRN2); the single Extra sex combs (Esc) homolog FERTILIZATION INDEPENDENT ENDOSPERM (FIE); and the five p55 homologs MULTICOPY SUPPRESSOR OF IRA 1–5 (MSI1–5). Experimental evidence suggests that these proteins form at least three similar PRC2-like complexes in Arabidopsis: the FIS (FERTILIZATION INDEPENDENT SEED) complex, which controls seed development (also known as the FIS-PRC2 or MEA-FIE complex, of ancient origin and evolutionarily conserved between plants and animals); the VRN (VERNALIZATION) complex, which mediates the vernalization response; and the EMF (EMBRYONIC FLOWER) complex, which represses early flowering and flower development (reviewed in Schatlowski et al, 2008; Hennig and Derkacheva, 2009). Whereas the Arabidopsis homologues of ESC and p55 (FIE and MSI1, respectively) are likely common components of all three complexes, the three different homologues of E(Z) (CLF, SWN and MEA) and of SU(Z)12 (EMF2, VRN2 and FIS2) show differences in expression and target gene specificity, suggesting that they are specific to the different complexes, which have partially discrete functions (see Figure 3; Schatlowski et al, 2008).
Köhler et al (2003) have shown that the FIE, MSI1 and MEA proteins are part of a 600 kDa complex involved in the correct initiation and progress of seed development, a size suggesting that additional proteins are present in the complex. One candidate is FIS2, a C2H2 zinc-finger protein homologue of SU(Z)12 that has been shown to interact in vivo with MEA (Wang et al., 2006). Like MEA and FIE, MSI1 is a gametophytic maternal effect gene, because the paternal copy of MSI1 has no effect on the fate of the offspring. In msi1 mutants, endosperm development of mutant seeds initiates independently of fertilization even in pollinated gynoecia, leading to the formation of embryos surrounded by diploid endosperm. This most likely leads to an earlier arrest of embryo development compared with mea and fie mutants (Köhler et al, 2003). Thus, the FIS complex silences target genes during gametogenesis and early seed development, whereas the EMF complex, which most likely contains CLF/SWN, EMF2, FIE and MSI1, silences some of the same target genes during subsequent sporophytic development, because CLF and SWN take over MEA function at that stage (Makarevich et al., 2006).
Recently, an intact PRC2 complex involved in the vernalization response was biochemically purified from vernalized Arabidopsis seedlings. The purification revealed a vernalization-specific complex consisting of core PRC2 components (VRN2, SWINGER, FIE, MSI1) and three PHD finger proteins, VERNALIZATION5 (VRN5), VERNALIZATION INSENSITIVE3 (VIN3), and VEL1 (De Lucia et al., 2008). This PHD-VRN complex increases H3K27me3 levels in FLOWERING LOCUS C (FLC) chromatin, leading to its stable silencing during vernalization. VRN5 and the histone mark H3K27me3 are initially restricted to a small region extending from the transcriptional start to the beginning of the first intron. Only after the return to warm conditions does VRN5 associate more broadly over FLC, coincident with the spread of H3K27me3 across the entire locus. Therefore, it was proposed that vernalization-induced epigenetic silencing of FLC involves differential association and changed composition of distinct Polycomb complexes, a mechanism that shows many parallels with Polycomb silencing in mammals (De Lucia et al., 2008; Hennig and Derkacheva, 2009).
The EMF (EMBRYONIC FLOWER) complex represses precocious flowering and flower development by repressing the transcription of flowering activators such as FLOWERING LOCUS T (FT), the main flowering time integrator, and AGAMOUS-LIKE 19 (AGL19) (Hennig and Derkacheva, 2009). Jiang et al (2008) recently reported that the Arabidopsis PRC2-like complex subunits CLF, EMF2 and FIE repress the expression of FLC and FLC relatives, including MADS AFFECTING FLOWERING 4 (MAF4) and MAF5, and that CLF directly binds to and mediates the deposition of H3K27me3 in FLC, MAF4 and MAF5 chromatin. Additionally, they showed that during vegetative development CLF and FIE strongly repress FT expression, and that CLF also directly interacts with and mediates the deposition of H3K27me3 in FT chromatin. Their results imply that PRC2-like complexes containing CLF, EMF2 and FIE deposit repressive H3K27me3 in these flowering genes and directly repress their expression, thereby controlling the flowering program in Arabidopsis (Jiang et al, 2008).
In Drosophila, PRC1 is composed of five core subunits: Polycomb (PC), Polyhomeotic (PH), Posterior sex combs (PSC), dRING, and Sex combs on midleg (SCM). Genome sequence analysis indicates that PRC1 genes originated early in animal evolution. The PRC1 gene set is complete in several insect and vertebrate species, but a varying number of PRC1 genes are missing in species from other phyla. For instance, all PRC1 core genes (except Scm) are absent in two Caenorhabditis species (C. elegans and C. briggsae), and at least three PRC1 subunits are not found in the urochordate Oikopleura dioica (reviewed in Schuettengruber et al, 2007). Thus, it has been proposed that PRC1 genes can be lost as a consequence of the disintegration of the Hox gene cluster, which has occurred repeatedly during evolution. The argument for this is that most PRC1 genes are absent in Oikopleura dioica (a small pelagic chordate belonging to the class of larvaceans and derived from the most basal branch of urochordates), which has nine unlinked Hox genes (Seo et al., 2004), and in the two Caenorhabditis species, which have rearranged Hox clusters. However, the integrity of Hox gene clusters does not strictly correlate with the presence of a full set of PRC1 genes, given that most or all PRC1 genes are found in several species with degenerated clusters (reviewed in Aboobaker and Blaxter, 2003; Schuettengruber et al, 2007). In Drosophila and mammals, PRC1 can bind H3K27me3 via the chromodomain protein POLYCOMB (PC) and is considered to confer stable, long-term silencing. However, since the PRC1 components are not conserved in plants, an alternative plant-specific mechanism for reading the H3K27me3 mark may have evolved.
Consistent with this, several studies indicate that the Arabidopsis chromodomain protein LIKE HETEROCHROMATIN PROTEIN1 (LHP1) [also known as TERMINAL FLOWER2 (TFL2)] might carry out a role equivalent to that of the PC protein (Zhang, X. et al, 2007b). It was thought that LHP1 functions in plants, as HP1 does in animals, to silence heterochromatic loci (Jackson et al, 2002). However, LHP1 (unlike HP1) was usually found in euchromatin and was needed for the silencing of euchromatic genes, including many PcG protein targets, but not for the silencing of genes in heterochromatin (Turck et al, 2007; Schatlowski et al, 2008; Hennig and Derkacheva, 2009). LHP1 might form a complex comparable in domain composition and function to animal PRC1, but it remains to be determined whether it fulfills the role of Pc, which proteins it interacts with, and whether it functions together with, for example, the plant-specific EMF1 and VRN1 proteins. On the other hand, Sanchez-Pulido et al (2008) recently characterized several Ring finger proteins present in vertebrate PRC1 complexes and identified a set of proteins in Arabidopsis, Oryza sativa, Vitis vinifera and worms that share the PRC1 Ring finger domain architecture: an N-terminal Ring finger domain and a C-terminal ubiquitin-like domain (RAWUL domain). This indicates that these plant and worm proteins are potential orthologs of animal PRC1 Ring finger proteins. Finally, it has been shown that EMBRYONIC FLOWER1 (EMF1) participates in the PcG-mediated silencing of the flower homeotic gene AGAMOUS (AG) during vegetative development in Arabidopsis thaliana and that, despite the lack of homology at the protein level, EMF1 plays a PRC1-like role and could have a function in part analogous to that of Drosophila Psc (Calonje et al, 2008).
Thus, the possibility that a PRC1-like complex is also involved in PcG-mediated gene silencing in plants is fascinating and opens new opportunities in plant PcG research.
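The subunit composition of the three PRC2-like complexes discussed above can be summarized as a simple data structure. The Python sketch below is a hedged illustration only: the dictionary name, the helper functions and the choice to list both CLF and SWN in the EMF complex are assumptions made here for illustration, and the EMF composition in particular is the "most likely" one proposed in the text rather than a biochemically confirmed set.

```python
# Illustrative summary of the PRC2-like complexes as described in the text.
# Membership of the EMF complex is the proposed ("most likely") composition.
PRC2_LIKE_COMPLEXES = {
    "FIS": {"MEA", "FIS2", "FIE", "MSI1"},
    "VRN": {"SWN", "VRN2", "FIE", "MSI1"},
    "EMF": {"CLF", "SWN", "EMF2", "FIE", "MSI1"},
}


def shared_subunits(complexes):
    """Return the subunits present in every complex."""
    return set.intersection(*complexes.values())


def complexes_containing(subunit, complexes):
    """Return the sorted names of all complexes that include the given subunit."""
    return sorted(name for name, members in complexes.items() if subunit in members)


if __name__ == "__main__":
    print(sorted(shared_subunits(PRC2_LIKE_COMPLEXES)))
    print(complexes_containing("SWN", PRC2_LIKE_COMPLEXES))
```

Written this way, the common core (the Esc and p55 homologs) and the complex-specific E(z) and Su(z)12 homologs can be read off directly from the set operations.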
Trx Homologues in Plants and Target Recognition.
Given the tightly balanced PcG/TrxG interaction in the control of homeotic genes, it is logical to expect that counteracting H3K27/H3K9- and H3K36/H3K4-modifying activities regulate plant genes as well (Avramova, 2009). In contrast to PcG complexes, the plant TRITHORAX (TRX) homologs have been little studied. Nevertheless, trxG orthologs have been identified in plants (Alvarez-Venegas and Avramova, 2001; Alvarez-Venegas et al., 2003), which shows that this memory system has been conserved through evolution. One of the main activities of the TrxG is H3K4 methylation, particularly H3K4me3, a mark associated with gene expression.
The yeast homolog of trithorax, Set1 (the sole histone H3K4 methyltransferase in yeast), is found in a ∼400 kDa protein complex (COMPASS or Set1C) that consists of seven polypeptides (Set1, Cps25, Cps30, Cps35, Cps40, Cps50, and Cps60) (Miller et al, 2001). The only known biochemical activity of this complex is methylation of H3K4. In humans, there are three TRX homologs, called MLL1, MLL2, and hSET1. Of these, the MLL1 protein has been found in a large complex (more than 10 polypeptides) with proteins shared between the yeast and human SET1 histone methyltransferase complexes, including a homolog of Ash2 (Yokoyama et al., 2004).
Phylogenetic analyses performed only with the SET domain have shown that the TRITHORAX family (those genes involved in the covalent modification of the amino-terminal tails of the core histones) contains two distinct sub-families: SET1 and TRX (Alvarez-Venegas and Avramova, 2002; Baumbusch et al 2001; Springer et al 2003). SET1-related proteins have been found in unicellular organisms as well as in animals and plants. It has been hypothesized that in the genomes of unicellular organisms, filamentous fungi, and higher eukaryotes, the SET1-related genes are orthologs involved in core cellular activities not connected with functions required for multicellularity (Aravind and Subramanian, 1999; Alvarez-Venegas and Avramova, 2002; Avramova, 2009). In contrast, members of the TRX subfamily are not represented in the genomes of unicellular and filamentous fungi, but they carry SET-postSET regions highly related to those of proteins from the SET1 subfamily. Presumably, the ancestral SET1-related gene multiplied and diversified in structure and function after separation from the lineages carrying only the SET1 gene (Avramova, 2009).
In Saccharomyces cerevisiae, the Set1 protein catalyses di- and tri-methylation of H3K4 and stimulates the activity of many genes (Santos-Rosa et al, 2002). In Chlamydomonas, a SET1 homolog deposits a K4-monomethyl mark (van Dijk et al., 2005). SET1 orthologs are present as single copies in the genomes of filamentous fungi, suggesting that these organisms use H3K4me mechanisms similar to those of yeasts (Veerappan et al, 2008). By contrast, known plant Trithorax proteins, like ARABIDOPSIS HOMOLOG OF TRITHORAX-1 (ATX1), modify only a limited fraction of target nucleosomes (Alvarez-Venegas and Avramova, 2005), implying the involvement of multiple K4 methyltransferases. ATX1 (SDG27) is a close homolog of TRX, the mouse mixed-lineage leukemia (MLL) protein, and the yeast SET1 protein. It contains a SET domain and additional domains characteristic of trxG proteins (Alvarez-Venegas and Avramova, 2001). Disruption of ATX1 causes pleiotropic phenotypes, including homeotic, stem growth, root, and leaf defects. ATX1 is required during flower development to maintain normal expression levels of the homeotic genes APETALA 1 (AP1), AP2 and AGAMOUS (AG) and, to a lesser extent, of PISTILLATA (PI) and AP3. Recombinant ATX1 SET-domain peptides displayed H3K4 methyltransferase activity in vitro; H3K4 methylation is a histone modification associated with an active chromatin state and gene expression (Alvarez-Venegas et al, 2003). Recently, ATX1 was shown to bind AG chromatin and to be required for H3K4me3 deposition at this locus (Saleh et al., 2007). Furthermore, ATX1 is directly involved in ‘writing’ the H3K4me3 marks on FLC nucleosomes, but not on those of the homeotic gene AP1, indicating that its effect on AP1 is indirect (Saleh et al., 2008a). In addition, it has been shown that ATX1 directly regulates the floral regulator FLC by mediating the H3K4me3 modification, and that H3K4me3 deposition is accompanied by a decrease in H3K27me2 levels at the FLC locus (Pien et al, 2008).
Accordingly, ATX1 directly binds the active FLC locus before flowering, and this interaction is released upon the transition to flowering. This dynamic process stands in contrast with the stable maintenance of homeotic gene expression mediated by trithorax group proteins in animals but resembles the dynamics of plant Polycomb group function (see Figure 4; Pien et al, 2008). Also, WDR5a, an Arabidopsis homolog of human WDR5 (a conserved core component of the human H3K4 methyltransferase complexes called COMPASS-like), interacts with the ATX1 methyltransferase. Both may act in a complex that is enriched at the FLC locus by a functional FRI and methylates H3K4, leading to FLC activation (Jiang et al, 2009). Moreover, ATX1 activates the expression of the WRKY70 gene (a gene positioned at the convergence node of the salicylic acid (SA) and jasmonic acid (JA) signaling pathways) and is involved in establishing the H3K4me3 pattern of its nucleosomes. Anti-ATX1-specific antibodies showed that ATX1 was bound to WRKY70 nucleosomes, defining WRKY70 as a ‘primary’ target (Alvarez-Venegas et al, 2007a).
Multiplication of an ancestral TRX gene in Arabidopsis has produced five copies clustered in two sister groups: (a) ATX1 and ATX2 (SDG30), which originated from a segmental chromosomal duplication and belong to the same clade as sister paralogs, form one group, and (b) ATX3 (SDG14), ATX4 (SDG16), and ATX5 (SDG29) form the second (Baumbusch et al., 2001; Alvarez-Venegas and Avramova, 2002; Alvarez-Venegas et al, 2007b). In rice, the protein XP_450166 (SDG723) is a putative ortholog of both ATX1 and ATX2, while the rice NP_913370 clusters with the ATX3/ATX4/ATX5 sister group (Avramova, 2009; Ng et al., 2007). Apparently, the divergence of the two sister groups took place before the separation of the monocots and dicots. The respective maize homologs (Springer et al., 2003) are available only as short peptides and cannot be clustered with confidence in these groups. ATX1 and ATX2 are 65% identical and 75% similar at the amino acid level, and the two proteins have similar architectural motifs. However, according to genome-wide expression analyses of mutant plants, ATX2 has a more restricted role in Arabidopsis and, in contrast to ATX1, possesses H3K4me2 activity, although its potential ability to carry out trimethylation has not been ruled out (Saleh et al, 2008b). On the other hand, the Trithorax group gene ATX3 (At3g61740) was shown to be predominantly expressed in the Arabidopsis egg and central cell (Johnston et al, 2007). However, nothing is known regarding ATX3 target genes or its enzymatic activity, and nothing is known so far regarding ATX4 and ATX5. Isolation and characterization of specific complexes assembled by ATX1, ATX2, or ATX3, as well as structural analysis of the ATX SET-domain peptides, will be critical steps toward overcoming the obstacles to direct biochemical assessment of Trithorax function in plants.
In addition to the ATX family, seven Arabidopsis proteins have been classified as Trithorax-Related, ATXR (Baumbusch et al., 2001). Phylogenetic analysis, however, identified only ATXR7 (SDG25) as a Trithorax family member representing the Arabidopsis ortholog of SET1, while the AAN01115 protein (encoded by the Os 12g41900 gene) is the SET1 ortholog in rice. The other ATXR proteins cluster in separate groups distantly related to Trithorax (Avramova, 2009). This is in contrast to an earlier report indicating that ATXR5 (SDG15) and ATXR6 (SDG34) belong to the SET3/SET4 group of S. cerevisiae (Springer et al., 2003). Furthermore, comparative analyses revealed that the ATXR5/6 SET-domain sequences do not carry the hallmark amino acid substitutions defining the SET3 subfamily (Veerappan et al., 2008). The two proteins differ in their subcellular localization: ATXR5 has a dual localization in plastids and in the nucleus, whereas ATXR6 is solely nuclear. The two paralogs interact with the proliferating cell nuclear antigen (PCNA) and seem to play a role in cell-cycle regulation or progression (Raynaud et al., 2006). Recently, Jacob et al (2009) showed that the divergent SET-domain proteins ATXR5 and ATXR6 are H3K27 monomethyltransferases that are not homologous to the Drosophila protein E(Z). They are the only enzymes that have been shown biochemically to catalyze the methylation of H3K27 in Arabidopsis, and they are involved in chromatin condensation and gene silencing. ATXR5 and ATXR6 thus form a new class of H3K27 methyltransferases (Jacob et al, 2009). In a recent article, Berr et al (2009) provided evidence that recombinant ATXR7 proteins can methylate histone H3 from oligonucleosomes and that the loss-of-function mutant sdg25-1 has an early-flowering phenotype associated with suppression of FLC expression and reduced levels of H3K36 di-methylation at FLC chromatin (Berr et al, 2009).
Moreover, Tamada et al (2009) have established that ATXR7, a putative Set1 class H3K4 methylase, is required for proper FLC expression. The rapid flowering of atxr7 mutants is associated with reduced FLC expression and is accompanied by decreased H3K4 methylation and increased H3K27 methylation at FLC. These researchers also reported that the flowering phenotype of atx1 atxr7 double mutants is additive relative to those of the single mutants. Therefore, both classes of H3K4 methylases (ATX1, ATXR7) appear to be required for proper regulation of FLC expression (Tamada et al, 2009).
Another Drosophila SET domain gene, absent, small, or homeotic discs 1 (Ash1), has also been classified as a trxG gene. Accordingly, sequence analysis of the TRX family in Arabidopsis has also identified homologs of the Drosophila Ash1 gene. Four proteins group closely together with ASH1 and its yeast homolog SET2 and are designated ASH1 homologs (ASHH1/SDG26, ASHH2/SDG8, ASHH3/SDG7 and ASHH4/SDG24); three further genes are ASH1-related (ASHR1/SDG37, ASHR2/SDG39 and ASHR3/SDG4) (Baumbusch et al., 2001). Only a few of these genes have been partially characterized.
ASHH2 or SDG8, a 1759-amino-acid protein encoded by a gene containing 15 exons (At1g77300), shows the highest homology with SET2, the sole H3K36 histone lysine methyltransferase (HKMT) of S. cerevisiae. The sequence homology between the two proteins is limited to the region spanning the SET domain and its surrounding cysteine-rich AWS (Associated With SET) and C (Cysteine-rich) domains (Zhao et al, 2005). ASHH2, also known as EFS, was originally isolated as a novel early-flowering mutant, early flowering in short days (efs), involved in controlling an inhibitor of flowering (Soppe et al, 1999). It has been shown that loss of function of ASHH2 results in reduced dimethylation of histone H3K36, particularly in chromatin associated with the FLC promoter and the first intron, regions that contain essential cis-elements for transcription. ashh2 mutants display reduced FLC expression and flower early, establishing SDG8-mediated H3K36 methylation as a novel epigenetic memory code required for FLC expression in preventing early flowering (Zhao et al, 2005). In addition, efs (ashh2) mutations suppress FLC expression in FRI-containing or autonomous pathway mutant backgrounds. Lesions in EFS also reduce the level of histone H3K4 trimethylation in FLC chromatin (Kim et al, 2005). These results indicate that ASHH2 is a multifunctional enzyme with H3K4 and H3K36 methylation activity. A knock-out mutation of ASHH2 has pleiotropic effects: ashh2 mutants also exhibit increased shoot branching, repression of the SPS (SUPERSHOOT) transcript, and a significant increase in the UGT74E2 transcript (encoding a UDP-glucosyltransferase). The altered expression of SPS and UGT74E2 correlates with changed H3 methylation patterns at both loci, suggesting that SDG8 also plays an important role in regulating the expression of genes controlling shoot branching in Arabidopsis (Dong et al, 2008).
Recently, Cazzonelli et al (2009) reported that ASHH2 is also involved in regulating carotenoid biosynthesis: loss of ASHH2 function alters the histone methylation status of chromatin surrounding the CAROTENOID ISOMERASE (CRTISO) gene and reduces CRTISO transcript levels.
In contrast to the early-flowering phenotype of the sdg8 mutants, ashh1/sdg26 mutants show a late-flowering phenotype associated with up-regulation of the FLC gene, suggesting that ASHH1/SDG26 contributes to maintaining transcriptional repression (Xu et al, 2008). However, no specific histone methyltransferase activity has yet been reported for ASHH1.
Yeast two-hybrid data revealed that ASHR3 interacts with the putative MYC basic helix-loop-helix (bHLH) transcription factor ABORTED MICROSPORES (AMS), a key regulator of both anther development and stamen filament length (Sorensen et al., 2003). This suggests a role for ASHR3 in the regulation of genes involved in stamen and anther development and function. Loss- or gain-of-function of ASHR3/SDG4 causes male sterility (Cartagena et al., 2008; Thorstensen et al., 2008), indicating that SDG4 is capable of regulating pollen tube growth in Arabidopsis by altering the expression of pollen-specific genes via histone methylation (Cartagena et al., 2008).
REGULATION OF GENE EXPRESSION BY PcG AND trxG PROTEINS
Histone Modifications
Studies in mammals, yeast and Drosophila have found conserved modifications at some residues of histones as well as non-conserved modifications at other sites. Mass spectrometry, combined with chromatographic separation, has been used to analyze modifications of all core histones in Arabidopsis. This kind of analysis has confirmed acetylation and methylation at some conserved lysine residues in the four core histones (H2A, H2B, H3 and H4) and has also identified unique modifications. These unique modifications include acetylation at K20 of H4; acetylation at K6, K11, K27 and K32, phosphorylation at S15 and ubiquitination at K143 of H2B; and acetylation at K144 and phosphorylation at S129, S141 and S145 of H2A (reviewed in Zhang, K. et al 2007).
Histone modifications represent additional epigenetic information on chromatin that alters the functional properties of the underlying genetic information. Moreover, different histone methylation states, or combinations of several methylation marks, could additionally discriminate between different chromatin regions or entire chromosomes. Together with DNA methylation, chromatin remodeling and a variety of non-histone factors, the histone code forms a complex epigenetic code, and different biological systems have evolved different ways of implementing the histone marks, suggesting that the ‘language’ is species-specific (Loidl, 2004). For example, H4K20 trimethylation in metazoans is a repressive mark in gene silencing mechanisms, and it has been suggested that the sequential induction of H3K9 and H4K20 trimethylation by distinct histone lysine methylation systems can index repressive chromatin domains (Schotta et al., 2004). However, in Schizosaccharomyces pombe, H4K20 methylation mediated by the single methyltransferase Set9 (with mono-, di-, and trimethylation activity) is involved in the DNA damage response and is required for the recruitment of Crb2, a protein involved in DNA-damage checkpoint signaling, to DNA double-strand breaks (Du et al, 2006; Sanders et al, 2004). In Arabidopsis, by contrast, mass spectrometry has shown that lysine 20 of H4 is free of methylation marks but is acetylated. This suggests that acetylation at lysine 20, alone or together with the nearby lysine 16 acetylation, may play a role in activating transcription in Arabidopsis (Zhang, K. et al, 2007).
In soybean leaves, for example, mono-, di- and tri-methylation at lysine 4, lysine 27 and lysine 36, and acetylation at lysine 14, 18 and 23, were detected in histone H3. Lysine 27 tended to be mono-methylated, while tri-methylation was predominant at lysine 36. Lysine 27 methylation and lysine 36 methylation usually excluded each other in soybean histone H3 (Wu et al, 2009). Although methylation at histone H3K79, a highly conserved modification in non-plant systems, has not been reported in A. thaliana, mono- and di-methylated H3K79 were detected in soybean (Zhang, K. et al 2007; Wu et al, 2009). In addition, two variants of histone H3 (H3.1 and H3.2) were detected in soybean, and their methylation patterns differed: lysine 4 and lysine 36 methylation were present only in H3.2, suggesting that this variant might be associated with actively transcribed genes. Moreover, two variants of histone H4 (H4.1 and H4.2), which have not been found in other organisms, were also detected in soybean by mass spectrometry (Wu et al, 2009). These results illustrate that although the amino acid sequences of histones have been conserved in evolution, their modification patterns are rather different.
Histone modification patterns are critical for establishing and maintaining stable epigenetic states in Arabidopsis. For example, maintenance of the appropriate expression patterns of the AGAMOUS gene in reproductive and non-reproductive tissue involves the opposing activities of Polycomb group (PcG) and trithorax group (trxG) proteins. The PcG gene CURLY LEAF, which encodes a component of PRC2 (Chanvivattana et al. 2004), acts as a direct transcriptional repressor of AG expression in leaves, inflorescences, and the outer whorls of flowers (Goodrich et al. 1997) and mediates H3K27me3. In contrast, the trxG gene ATX1, which encodes an H3K4 methyltransferase (Alvarez-Venegas et al. 2003; Alvarez-Venegas and Avramova 2005; Saleh et al. 2007), maintains high-level AG transcription in flowers, suggesting that ATX1 is required to maintain the normal expression level of AG and antagonizes the repressive activity of CLF (Alvarez-Venegas et al. 2003). It is important to point out that nucleosomes at silent AG loci carry both the activating H3K4me3 and the repressive H3K27me3 marks (Saleh et al. 2007), a bivalent chromatin state of silent genes poised for transcription later in life (Bernstein et al, 2006). Simultaneous loss of ATX1 and CLF restored AG repression and normalized leaf phenotypes (Saleh et al. 2007). Thus, CLF and ATX1 maintain the AG locus in either a repressed or an active state in a tissue-specific manner. Recently, Carles and Fletcher (2009) showed that in ult1 (ultrapetala1) clf-2 double mutants, the different ult1 mutant alleles independently rescued all clf-2 vegetative and reproductive defects, indicating that ULT1 (a SAND domain protein) and CLF have opposite effects on plant development. Because ult1 mutations suppressed all PcG clf-2 mutant phenotypes, ULT1 was considered a trxG gene.
Specifically, ULT1 binds to AG regulatory sequences during flower development, physically interacts with ATX1 inside the nucleus, and regulates the deposition of the epigenetic marks, preventing inappropriate PcG silencing of the AG locus in the center of the flower. During the switch of the AG locus from a repressed to an active state, ULT1 seems to function as a co-activator that recruits additional trxG proteins, such as ATX1, involved in subsequent local H3K4 methylation and/or reading of the chromatin marks for transcription initiation and elongation (Carles and Fletcher, 2009). Results of this kind expand the list of epigenetic regulators involved in plant development.
Another example in plants is vernalization, which increases H3K9 and H3K27 dimethylation and decreases H3K4 trimethylation and histone acetylation at the FLC locus, causing a stable repression of FLC that is maintained through mitosis even at warm temperatures (Bastow et al, 2004; Sung and Amasino 2004). Loss of function of VRN2 or VIN3 leads to a loss of the vernalization response and a failure to down-regulate FLC after vernalization. VIN3 protein binds to regions of the promoter and first intron of FLC. VIN3 mRNA is present at very low abundance during growth at warm temperatures, with expression increasing progressively during a vernalization treatment and returning to pre-vernalized levels when the plant is returned to normal temperatures (Sung and Amasino, 2004). This cold-driven accumulation of VIN3 mRNA seems to be part of a mechanism to time the duration of vernalization and ensure that short cold periods do not promote flowering. Meanwhile, VRN2 is associated with the PcG protein homologues FIE, SWINGER and CURLY LEAF (CLF) in a PRC2-like complex (that might include VIN3) and these proteins are required for the repression of FLC by vernalization (Wood et al, 2006). The repression of FLC expression after vernalization is accompanied by modifications to histones associated with the FLC locus. After vernalization, H3Ac and H3K4me3 are reduced and H3K9me2 and H3K27me2 are increased. These changes suggest that the formation of a repressed chromatin state at FLC after vernalization is the basis of the epigenetic regulation of FLC (Wood et al, 2006). On the other hand, it has been shown that levels of H3K4me3 are increased in actively transcribed FLC chromatin (He et al., 2004; Pien et al., 2008). An ELF7-containing complex known as PAF1c is required for FLC upregulation and for the associated H3K4me3 increase in FLC in the FRI background or AP mutants (He et al., 2004; Oh et al., 2004). 
Furthermore, ARABIDOPSIS HOMOLOG OF TRITHORAX-1 (ATX1), an H3K4 methyltransferase (Alvarez-Venegas et al., 2003), is also required for H3K4me3 at FLC, and the atx1 mutation moderately suppresses FLC expression in the FRI background (Pien et al., 2008). In addition, ARABIDOPSIS HOMOLOG OF TRITHORAX-2 (ATX2) is also involved in FLC regulation, because the atx1 atx2 double mutation strongly suppresses FLC expression in the FRI background (see Figure 5; Pien et al., 2008).
Regardless of the specific usage of the histone-tail marks, acetylated histones and methylated histone H3 lysines 4 and 36 are generally associated with transcribed genes, while deacetylated histones and methylated lysines 9 and 27 mark silent loci (Kouzarides, 2007). However, new evidence points to correlations more complex than simple activating/silencing tags. For example, histone deacetylation of the coding regions in transcribed genes has been linked directly with active transcription (elongating RNA polymerase II transcription complexes) and with histone H3K36me2, a mark of actively transcribed genes (Keogh et al., 2005). In addition, the simultaneous presence of H3K4me3 and H3K27me3 marks at silent genes in embryonic stem cells suggested that the co-existence of activating and silencing nucleosomal modifications establishes a bivalent chromatin state at loci ‘poised’ for transcription later in development (Bernstein et al., 2006). It is significant that the chromatin at the flower homeotic gene locus AG is similarly tagged by H3K4me3 and H3K27me3 in its silent state (Saleh et al., 2007), suggesting that dual methylation might also be a chromatin mark for genes involved in plant developmental processes. Furthermore, Arabidopsis genes may carry methylated H3K9, H3K27 and H3K4 in various combinations, in gene-, tissue- or developmentally controlled patterns. Absence of H3K4me3 tags does not necessarily correlate with low expression levels (Alvarez-Venegas and Avramova, 2005; Saleh et al., 2008b). Thus, correlations between histone methylation profiles and gene activity appear to be much more complex. Whether histone H3 lysine methylation precedes or trails the establishment of transcriptionally active states in plants remains to be determined (Avramova, 2009).
Recent work has combined chromatin immunoprecipitation and high-resolution whole-genome tiling microarrays (ChIP-chip) to characterize the genome-wide distribution patterns of H3K4me1, H3K4me2 and H3K4me3, and to identify H3K27me3-associated regions across the entire genome of Arabidopsis thaliana at high resolution (Zhang et al, 2007a; Zhang et al, 2009). These results have shown that, unlike in mammalian cells, H3K4me3 and H3K27me3 do not preferentially co-localize on a genome-wide level in Arabidopsis, and also that in plants the presence of H3K4me3 is usually correlated with active transcription (unlike in mammals, where H3K4me3 is present at active promoters as well as at ‘poised’ promoters) (Zhang et al, 2009). This kind of research indicates that although the histone modifications are conserved, fundamental differences exist between plants and animals in the mechanisms by which the different marks are established or maintained.
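As an illustration only, the following minimal Python sketch shows one simple way in which the genome-wide co-localization of two marks could be quantified once enriched regions have been called. It is not the method used in the studies cited above. The function names, the representation of enriched regions as half-open (start, end) intervals on a single chromosome, and the example coordinates are all assumptions made for this sketch.

```python
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) regions on one chromosome.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort regions and fuse any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bp(intervals: List[Interval]) -> int:
    """Number of base pairs covered by a set of regions."""
    return sum(end - start for start, end in merge(intervals))


def shared_bp(a: List[Interval], b: List[Interval]) -> int:
    """Number of base pairs covered by both sets of regions."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever region ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard(a: List[Interval], b: List[Interval]) -> float:
    """Shared base pairs divided by base pairs covered by either set."""
    inter = shared_bp(a, b)
    union = covered_bp(a) + covered_bp(b) - inter
    return inter / union if union else 0.0


# Made-up coordinates, for illustration only (not real ChIP-chip data).
h3k4me3_regions = [(100, 200), (500, 700)]
h3k27me3_regions = [(150, 250), (1000, 1200)]
overlap_fraction = jaccard(h3k4me3_regions, h3k27me3_regions)
```

A value near 0 would indicate that the two marks rarely occupy the same regions, and a value near 1 that they almost always do. In practice, an observed value would have to be compared against a suitable null expectation, for example one obtained by shuffling region positions, before concluding that two marks do or do not preferentially co-localize.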
PcG/Trx-G Counteracting Activities in Plants vs. Metazoans
As mentioned before, the patterns of growth and development differ dramatically between plants and animals. In animals, the pattern of the body plan is established early in embryonic development, and the activity of genes established at the initial stage is faithfully propagated during subsequent cellular divisions through the activity of genes from the trithorax group and the Polycomb group (Simons and Tamkun, 2002). When animals have developed into adult organisms, growth and morphogenesis cease and cell division mainly replaces dead cells or specialized cells that undergo continuous turnover. In contrast to animals, organ development and growth in plants are not restricted to the embryonic stage: the lateral organs (leaves), the reproductive organs (flowers), and the seeds originate from the same undifferentiated meristem, which remains active throughout the life cycle. However, in plants, as in animals, development of the wrong organ in the wrong place (homeosis) is a consequence of a mutation in a homeotic gene. Unlike their animal counterparts, plant homeotic genes are not clustered and belong to the MADS-box family of transcription factors. In the absence of cell migration, plant morphogenesis is determined by cell division and expansion. Because cell division occurs preferentially in meristematic regions, the identity of a cell that leaves the meristematic region is determined mostly by the position of the cell relative to its neighbors (Loidl, 2004). In addition, plant cells are much more exposed to environmental stress because plants are immobile. Altogether, these basic differences indicate that plants developed mechanisms of gene regulation that are distinct from those of animals. For example, plants differ from animals in the histone code they use and in the enzymes involved (members of PcG and TrxG).
Some of the main differences discovered relate to the sites of modification, the presence of an entire plant-specific HDAC family, as well as distinct ways of regulating the activity of histone-modifying enzymes (Loidl, 2004).
One illustration of PcG/Trx-G counteracting activities relates to the conserved Polycomb Repressive Complex 2 (PRC2). PRC2 functions to maintain patterns of gene repression in both plants and animals, using H3K27 methylation. However, in plants there are several PRC2 complexes, with overlapping subunit compositions, that are specialized for distinct developmental roles. For example, PcG proteins are involved in the regulation of imprinted gene expression. In Arabidopsis, MEA shows maternally imprinted expression, and this process has been found to involve MEA autoregulation, using H3K27 trimethylation (Baroux et al, 2006). Similarly, the mammalian PcG protein EED (embryonic ectoderm development) has also been shown to have a role in the control of imprinted gene expression. Additionally, PcG-mediated regulation in Arabidopsis involves silencing of the FLOWERING LOCUS C gene during the vernalization response by a distinct PRC2 (Sung and Amasino, 2004).
On the other hand, unique modifications in plants include H2B K6, K11, K27 and K32 acetylation, S15 phosphorylation and K143 ubiquitination; H2A K144 acetylation and S129, S141 and S145 phosphorylation; and H2A.X (histone variant) S138 phosphorylation. Also, H3K79, which is highly conserved, is modified by methylation and plays important roles in telomeric silencing in non-plant systems, is not modified in Arabidopsis (Zhang, K. et al, 2007). In Arabidopsis thaliana, Glycine max, and mammals, three lysine residues of H3 (lysines 14, 18 and 23) can be acetylated, but only in Arabidopsis has lysine 56 been shown to be acetylated, in contrast to metazoans and soybean, where modification of this residue was not detected, raising the possibility that more combinatorial modifications regulate gene expression in Arabidopsis (Wu et al, 2009). Similarly, methylation of H3K64 has only been detected in mammals, while acetylation of H4K20 has been detected in Arabidopsis, but not in soybean or mammals; this same residue is methylated in mammals and soybean, but not in Arabidopsis (Wu et al, 2009). These unique modifications reveal distinctive histone modification patterns in plants when compared to metazoans.
In yeast, SET1 is responsible for the overall chromatin modification and for establishing mono-, di-, and trimethyl-H3K4 marks (Bernstein et al., 2002; Santos-Rosa et al., 2002), whereas in Chlamydomonas, a SET1 homolog deposits a K4-monomethyl mark (van Dijk et al., 2005). In contrast, animal and plant Trithorax enzymes modify only a limited fraction of the target nucleosomes, implying the involvement of multiple K4 methyltransferases (Alvarez-Venegas and Avramova, 2005; Wysocka et al., 2003). However, unlike the patterns reported in animals or yeast (Schneider et al, 2004), in Arabidopsis it has been shown that in certain genes H3K4me3 always co-localized with H3K4me2 at both the 5′-end and downstream gene regions, and that the absence of H3K4me3 did not necessarily correlate with low levels of gene expression (Alvarez-Venegas and Avramova, 2005).
Finally, in Drosophila, the PcG and TrxG proteins bind to cis-regulatory elements known as Polycomb/Trithorax Response Elements (PRE/TREs), which are considered bi-stable, switchable elements that can act as activators or silencers (Hekimoglu and Ringrose, 2009). In mammals, PRE/TRE sequences are less well defined, but the PcG and TrxG proteins also constitute a switchable system that acts on common target genes. In plants, however, no epigenetic DNA elements equivalent to PRE/TREs have been found. Thus, the identification of PRE/TREs in plants, if they are present, would allow us to determine PcG and TrxG target genes and could help establish the effect that epigenetic complexes have on plant development.
CONCLUSION
Recent advances in biological research, such as high-throughput sequencing and the combination of chromatin immunoprecipitation (“ChIP”) with microarray technology (“chip”), known as ChIP-on-chip or ChIP-chip, are allowing the generation of large-scale data sets for epigenetic modifications that are widening our analysis of the mechanisms of epigenetic regulation to a genome-wide scale. For example, the ChIP-on-chip assay can be used to study gene regulation through the distribution and localization of epigenetic modifications, such as histone and/or DNA modifications. This will allow us to determine how epigenetic information and regulation are propagated. Alternatively, PTMs and histone variants can also be determined at a whole-genome level by using mass spectrometry (matrix-assisted laser desorption/ionization-time-of-flight mass spectrometry, MALDI-TOF) in combination with nano-liquid chromatography (nano-LC). However, more sensitive and higher resolution MS machinery is needed, since the individual PTMs of every histone, which vary among tissues and developmental stages, cannot be detected at the present time. These and other approaches should provide important insight into epigenetics in plant systems, for example: how the plant epigenome changes in response to developmental or environmental stimuli, how chromatin modifications are established and maintained, to what degree they are used throughout the genome, how chromatin modifications influence one another, and how epigenetically distinct chromatin compartments are established and maintained.
On the other hand, more research is needed to determine whether plant PRE/TREs exist and how they are organized, whether plant PRC1 complexes “truly” exist and what their composition is, and how many distinct PRC2 complexes exist in plants. It will also be important to identify and characterize plant trxG complexes and to determine whether more PcG and TrxG proteins function as specific pairs in generating bivalent chromatin marks, as has been shown for the interaction between ATX1 and CLF.