Epigenetic aspects of differentiation

A major challenge in biology is to understand how genetic information is interpreted to direct the formation of specialized tissues within a multicellular organism. During differentiation, changes in chromatin structure and nuclear organization establish heritable patterns of gene expression in response to signals. Epigenetic states can be broadly divided into three categories: euchromatin, constitutive heterochromatin and facultative heterochromatin. Although the static epigenetic profiles of expressed and silent loci are relatively well characterized, less is known about the transition between active and repressed states. Furthermore, it is important to expand on localized models of chromatin structure at specific genetic addresses to examine the entire nucleus. Changes in nuclear organization, replication timing and global chromatin modifications should be integrated when attempting to describe the epigenetic signature of a given cell type. It is also crucial to examine the temporal aspect of these changes. In this context, the capacity for cellular differentiation reflects both the repertoire of available transcription factors and the accessibility of cis-regulatory elements, which is governed by chromatin structure. Understanding this interplay between epigenetics and transcription will help us to understand differentiation pathways and, ultimately, to manipulate or reverse them.

of H3, and Arg4 of H4, is associated with transcriptional activation, and there is evidence of cross-talk between this pathway and histone acetylation.
A relatively new area of research is the study of modified residues within the globular domains of histone proteins, in particular methylation of H3 Lys79. At the mammalian β-globin locus, Lys79 methylation is associated with gene expression (Im et al., 2003). Ng and colleagues have also shown that H3 Lys79 hypomethylation is associated with gene silencing in both yeast and mammals, suggesting that methylation of this residue is a hallmark of euchromatin (Ng et al., 2003). Conversely, deletion of the Saccharomyces cerevisiae histone methyltransferase (HMTase) Dot1p, which is specific for H3 Lys79, leads to a loss of silencing within telomeric regions mediated by Sir proteins (van Leeuwen et al., 2002). These apparently contradictory results can be explained by proposing a role for H3 Lys79 methylation as an 'antisilencer', acting to exclude silencing factors such as Sir proteins from active genes and establish domains of transcription. In DOT1 mutants, Sir proteins are proposed to relocate from the telomeres to newly hypomethylated regions elsewhere in the genome, leading to a loss of telomeric silencing (Ng et al., 2003;van Leeuwen et al., 2002).
Histone octamers generally contain H2A, H2B, H3 and H4, but there are several additional variant forms of some of these proteins. For example, the variant H2A.Z is found at euchromatic regions and may have an important role in rendering regions 'poised' for transcription, although it is dispensable for maintaining gene activation (Santisteban et al., 2000). By contrast, the variant macroH2A is enriched on the inactive X chromosome (Costanzi and Pehrson, 1998), which indicates a correlation with facultative heterochromatin formation (see below). CENP-A and Drosophila Cid are variant forms of histone H3 found within nucleosomes at centromeric regions (Henikoff et al., 2000;Malik and Henikoff, 2001). Recently, much interest has been focused on the H3 variant H3.3. This variant is deposited within chromatin independently of DNA replication; this is unlike H3.1, which is synthesized and deposited during S phase. H3.3 is enriched at sites of transcription (Ahmad and Henikoff, 2002a;Ahmad and Henikoff, 2002b) and also accumulates in non-cycling cells, such as neurons (Bosch and Suau, 1995), myotubes (Wunsch et al., 1987) and human lymphocytes (Wu et al., 1983).
Many 'euchromatic' regions of the genome probably contain a mixture of transcribed or potentially active loci interspersed with transiently silenced genes and areas of established facultative heterochromatin. Actively transcribed regions are often described as having an 'open' chromatin structure, comprising nucleosomes that are loosely or irregularly packed compared with the tightly packed, regular nucleosomal arrays seen at silenced regions. The link between open chromatin conformation and gene activity was described as long ago as 1976 (Weintraub and Groudine, 1976). In the intervening decades, this concept has been widely explored and we now know that chromatin-remodelling factors, which can slide or reposition nucleosomes on DNA templates, have an important role in establishing and maintaining euchromatic domains (Havas et al., 2001). There is also evidence for non-genic transcription playing a role in opening up chromatin for gene expression, presumably with the aid of transcription-linked remodelling complexes, or even in gene silencing (reviewed by Cook, 2003). Such an activity has been described at the β-globin locus (Gribnau et al., 2000) and at the bithorax gene complex of Drosophila (Drewell et al., 2002).
Journal of Cell Science 117 (19)

Constitutive heterochromatin
Constitutive heterochromatin describes the highly condensed regions of the genome that are visible as bright nuclear areas following staining with DNA dyes such as 4′,6-diamidino-2-phenylindole (DAPI) and Hoechst. These regions often comprise repetitive DNA (such as satellite sequences surrounding centromeres) and are generally thought to be 'gene poor'. Such areas can exert a strong repressive effect on gene transcription, for example when a gene is inserted within or close to satellite repeats (Schotta et al., 2003). The hallmarks of constitutive heterochromatin (reviewed by Lachner et al., 2003) include trimethylation at Lys9 of histone H3, a paucity of methylation at H3 Lys4 and trimethylation of H4 Lys20 (Kourmouli et al., 2004;Schotta et al., 2004) (Table 1).
Heterochromatin-associated proteins such as HP1α and HP1β are highly enriched at centromeric heterochromatin in mouse cells (Minc et al., 1999), by virtue of interactions between the HP1-family chromodomain and methylated histone H3 Lys9 (Lachner et al., 2001). In human cells, HP1α is significantly enriched at centromeres, and HP1 interactions are generally thought to contribute to the stable formation of condensed chromatin structures. These structures might be required to maintain genomic stability and contribute to chromosome compaction during mitosis. Several HMTases, including SuVar3-9, have been found at constitutive heterochromatin domains in cells, which is consistent with their presumed role in maintaining H3 Lys9 methylation levels in these regions (Aagaard et al., 1999).
Additional characteristics of constitutive heterochromatin domains include generalized hypoacetylation of histones, DNA methylation and replication late in S-phase. Emerging evidence also suggests that histone H3 methylation recruits DNA methyltransferase activity to these regions, thereby providing a 'belt and braces' mechanism for heterochromatin formation (Lehnertz et al., 2003). Furthermore, experiments with fission yeast indicate that small RNA species are transcribed from centromeric repeats (Volpe et al., 2002). This suggests that the RNA interference (RNAi) machinery might play a role in the establishment and maintenance of heterochromatic structure in these domains, through the recruitment of HMTases and heterochromatin proteins. There are also hints that a similar mechanism might be at work in higher organisms. Maison et al. documented the importance of an RNA component in maintaining pericentromeric heterochromatin in mammalian cells (Maison et al., 2002), whereas Lehnertz and co-workers recently described the transcription of centromeric satellite sequences in murine embryonic stem cells (Lehnertz et al., 2003).

Facultative heterochromatin
The term facultative heterochromatin describes a previously permissive chromatin environment that is subject to transcriptional silencing. Such an environment is often confused with constitutive heterochromatin, but the two are distinct. The precise nature of the chromatin structures found within facultative heterochromatin has not been fully defined, and different genes might employ a variety of silencing mechanisms. Unlike constitutive heterochromatin, regions within the genome that are silenced in this way often cannot be visualized, although they might associate together (or with constitutive heterochromatin) to form silent domains within the nucleus. One notable exception is the formation of the Barr body on the inactivated X chromosome in female mammalian cells, which is visible as a discrete DNA-dense structure often located close to the nuclear periphery in female somatic cells (Barr and Bertram, 1949).
DNA methylation and characteristic histone modifications typify the repressive environment of facultative heterochromatin. DNA methylation is associated with silenced genes, particularly at promoters and some other regulatory elements, such as those found at imprinted loci (Table 1) (Attwood et al., 2002). Methylation marks can also be lost upon transcriptional activation (e.g. Lefevre et al., 2003). Much of the evidence for changes in DNA methylation upon differentiation has been obtained from studies of viral and transgenic loci, although changes in DNA methylation at endogenous genes are commonly associated with cancer (Claus and Lubbert, 2003). Histone modifications that occur within regions of facultative heterochromatin include dimethylation at H3 Lys9 and hypoacetylation of lysine residues within H3 and H4, commonly around promoter regions. Methylation of H3 Lys27 has also been proposed to be a feature of facultative heterochromatin and is primarily associated with transcriptional silencing by Polycomb group proteins (PcG) as well as with X inactivation (see later).

Silencing mechanisms
Methylation of H3 Lys36 in S. cerevisiae has been correlated with transcriptional repression (Strahl et al., 2002). However, further work has shown an association of this modification with transcriptional elongation (Schaft et al., 2003), which presumably places H3 Lys36 methylation in the euchromatic category. Mono- and dimethylated H4 Lys20 are distributed within the euchromatic compartment of mammalian nuclei (Schotta et al., 2004), although the speckled distribution might relate to regions of facultative heterochromatin rather than to active genes. Indeed, monomethylated H4 Lys20 is enriched on the inactive X chromosome, indicating a link between this modification and gene silencing.
The enzymes responsible for many of these modifications have been characterized (Lachner et al., 2003). In total, 32 residues within the four mammalian core histones are documented or potential sites of covalent modification (acetylation, phosphorylation, methylation or ubiquitylation) and yet it is not clear precisely how these interact or correlate with transcriptional states. The concept of a histone code, in which specific histone modifications dictate transcriptional outcomes, has been postulated (e.g. Jenuwein and Allis, 2001;Strahl and Allis, 2000). This has been taken a step further with the proposal that adjacent sites of modification act as binary switches to control gene activity (Fischle et al., 2003). However, much work remains to be done before we can hope to understand fully how the proposed histone code directs or reflects gene activity during differentiation.
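As a rough illustration of why a complete histone code is hard to decipher, the following Python sketch counts the combinatorial modification patterns available across the 32 sites mentioned above. It assumes, purely for illustration, that all sites are independent and that each has the same fixed number of states; neither assumption comes from the studies cited, and the function name is hypothetical.

```python
# Toy calculation, not a biological model: assumes all sites are
# independent and that each has the same number of possible states.
N_SITES = 32  # documented or potential modification sites (from the text)


def combinatorial_states(n_sites: int, states_per_site: int = 2) -> int:
    """Return the number of distinct modification patterns across n_sites."""
    if n_sites < 0 or states_per_site < 1:
        raise ValueError("n_sites must be >= 0 and states_per_site >= 1")
    return states_per_site ** n_sites


if __name__ == "__main__":
    # Each site treated as simply 'modified' or 'unmodified' (an assumption).
    print(combinatorial_states(N_SITES))
```

Even under this binary simplification the count exceeds four billion patterns, and allowing more than two states per site (for example, distinct degrees of lysine methylation) would raise it further. This is one reason why correlating individual marks with transcriptional states remains difficult.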
Stable gene silencing often involves the recruitment of specific protein factors. Although the presence of HP1 proteins at centromeric heterochromatin, bound to methylated histone H3, is well characterized (Lachner et al., 2001), the role of HP1 in facultative heterochromatin is less well defined. Artificial recruitment of HP1 to a euchromatic promoter can induce gene silencing (Ayyanathan et al., 2003), although whether this silencing mechanism is used during normal differentiation has yet to be established. Some evidence suggests HP1 can be recruited to aid silencing mediated by the retinoblastoma tumour suppressor protein (Rb) (Nielsen et al., 2001), as well as silencing in triplet repeat expansions (Saveliev et al., 2003). However, there is also some evidence that HP1 might be dispensable for the formation of facultative heterochromatin, for example during erythroid differentiation.
Other proteins that participate in gene silencing during differentiation include chromatin-remodelling complexes such as SWI/SNF (Langst and Becker, 2001) and tissue-specific factors such as the DNA-binding protein Ikaros in lymphocytes (Sabbattini et al., 2001;Trinh et al., 2001). In addition, the complementary PcG and Trithorax (Trx) group proteins (Orlando, 2003), which have opposing roles in maintaining the inactive or active state of genes through cell division, have recently been shown to affect chromatin. For example, mammalian Eed (a homologue of the Drosophila PcG protein Extra sex combs) associates with a histone deacetylase (van der Vlag and Otte, 1999) and also with the PcG protein Enhancer of Zeste (Ezh2 or Enx1 in mammals). Ezh2 has HMTase activity, targeting Lys27 of histone H3, and this modification is thought to recruit additional PcG members to stabilize silencing at genes regulated by Polycomb response elements (Cao et al., 2002;Czermin et al., 2002). By contrast, the Trx member ASH1 functions as an H3 Lys4 (activating) HMTase. The discovery that PcG and Trx proteins can exert chromatin-modifying effects provides a means to understand, at a mechanistic level, how patterns of gene activity can be changed, maintained and inherited.
Inactivation of one X chromosome in female mammals ensures gene dosage compensation between males and females. The establishment and maintenance of this silent state has often been used as a classic example of facultative heterochromatin formation (reviewed by Plath et al., 2002;Heard, 2004). Certainly, the process has many features that might be common to other examples of gene silencing. Random X inactivation occurs in the early embryo, initially in cells derived from the inner cell mass of the blastocyst. Conveniently, aspects of the process can be easily studied in ES cells as they undergo differentiation.
One of the earliest recognized events is expression of the noncoding RNA Xist initially from both X chromosomes. Ultimately, transcription is stabilized only from the prospective inactive X and the Xist RNA coats the silenced chromosome. Establishment of Xist expression is regulated by the transcription of an antisense RNA, Tsix (Lee et al., 1999), which blocks Xist transcription on the future active X chromosome. Xist expression may also be affected by other noncoding RNAs, such as the newly discovered Xite (Ogawa and Lee, 2003). However, recent evidence suggests that certain chromatin modifications may precede stable monoallelic Xist transcription, at least in female ES cells. O'Neill and coworkers have described core histone hyperacetylation, H3 Lys4 dimethylation and H3 Lys9 hypomethylation on both of the active X chromosomes prior to the onset of inactivation in undifferentiated female ES cells (O'Neill et al., 2003). This cocktail of modifications is distinct from the marks on the X chromosome in male cells or those at autosomal loci, and might signal the presence of more than one X chromosome.
The transient recruitment of the Eed/Ezh2 polycomb complex, and associated H3 Lys27 methylation, to the future inactive X chromosome is another early event in the inactivation process (Plath et al., 2003;Silva et al., 2003). Subsequently, further heterochromatic proteins and chromatin-modifying enzymes are recruited. The fully inactivated X chromosome is characterized by hypoacetylated histones H2A, H2B, H3 and H4 (Boggs et al., 1996;Jeppesen and Turner, 1993), dimethylated H3 Lys9 and trimethylated H3 Lys27 with hypomethylation at H3 Lys4 (Chadwick and Willard, 2003), enrichment of the histone variant macroH2A1 (Costanzi and Pehrson, 1998) and DNA methylation at the promoters of silenced genes (Pfeifer et al., 1990), and it replicates late in S phase (Morishima et al., 1962). Intriguingly, promoter-specific dimethylation of H3 Lys4 is also found at autosomal imprinted loci as well as genes subject to X inactivation, which suggests that it might be a conserved epigenetic mark for monoallelic expression (Rougeulle et al., 2003). Despite the majority of the X chromosome entering a silent state, several genes in the murine distal pseudoautosomal region as well as Xist escape inactivation. In humans, an increasing number of genes are believed to be expressed from both X chromosomes (Carrel et al., 1999), although the mechanism by which this is achieved is unknown.
Although parallels can be drawn between the permanent silencing of virtually an entire chromosome (as in the case of X inactivation) and differentiation-induced silencing of genes in discrete autosomal locations, it is likely that slightly different mechanisms operate. In particular, little is known about dynamic changes in chromatin structure upon gene silencing during development, although the molecular aspects of gene activation have been studied in various systems (reviewed by Cosma, 2002), notably during haematopoiesis (e.g. Bottardi et al., 2003;Kontaraki et al., 2000;Lefevre et al., 2003;Tagoh et al., 2004). Here, for example, the activation of a transgenic chicken lysozyme locus in mouse bone marrow first requires DNA demethylation and chromatin remodelling at key regulatory elements, followed by transcription factor binding and histone modification (Lefevre et al., 2003;Tagoh et al., 2004).
In contrast to descriptions of gene activation, there are few well-characterized examples of genomic regions that are progressively silenced during development, a fact that might reflect the paucity of good model systems. Recent work has examined the murine terminal transferase Dntt locus, a gene that becomes silenced during thymocyte maturation. Su et al. discovered that Dntt silencing is nucleated by the deacetylation of H3 Lys9 at the promoter within 2-6 hours of the onset of differentiation, along with repositioning of the gene to the pericentromeric heterochromatin compartment and a loss of chromatin accessibility (Su et al., 2004). Loss of methylation at H3 Lys4 follows within 4-12 hours, along with methylation of H3 Lys9 shortly after. These modifications then spread bidirectionally from the promoter at a predictable rate (2 kb/hour) to ensure permanent, heritable silencing of the gene.
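The spreading rate quoted for the Dntt locus lends itself to a simple back-of-the-envelope calculation. The Python sketch below estimates how much sequence would be covered a given time after spreading begins, and how long a site at a given distance from the promoter would take to be reached. It assumes that the 2 kb/hour figure applies to each direction separately and that the rate is constant; both are assumptions made here for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope sketch of bidirectional spreading from a promoter.
# Assumption: the quoted rate applies per direction and stays constant.
SPREAD_RATE_KB_PER_HOUR = 2.0  # rate quoted in the text (Su et al., 2004)


def silenced_span_kb(hours: float, rate: float = SPREAD_RATE_KB_PER_HOUR) -> float:
    """Total width (kb) covered after `hours` of spreading in both directions."""
    if hours < 0 or rate < 0:
        raise ValueError("hours and rate must be non-negative")
    return 2 * rate * hours


def hours_to_reach(distance_kb: float, rate: float = SPREAD_RATE_KB_PER_HOUR) -> float:
    """Time (hours) for the modified domain to reach a site `distance_kb` from the promoter."""
    if distance_kb < 0 or rate <= 0:
        raise ValueError("distance_kb must be non-negative and rate positive")
    return distance_kb / rate


if __name__ == "__main__":
    print(silenced_span_kb(5))   # width covered 5 hours after spreading starts
    print(hours_to_reach(10))    # time to reach a site 10 kb from the promoter
```

Under these assumptions a site 10 kb from the promoter would be reached about 5 hours after spreading begins, which is on the same timescale as the initial deacetylation and demethylation events described above.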
An increase in CpG methylation can only be detected at the locus much later during differentiation, when the mature thymocytes migrate to the spleen (Su et al., 2004). Studies of the IL4 gene, which is also downregulated during thymocyte differentiation, indicate that a concerted effort between DNA methyltransferases, HMTases, transcription factors and repressors is required to establish and maintain gene silencing (Makar et al., 2003). A further discussion of the role of DNA methylation and demethylation during differentiation within the immune system can be found elsewhere.
It is too early to tell from current studies of differentiating systems if the mechanisms of gene activation and repression are truly mirror images of each other. Certainly it appears that DNA methylation is a late event in gene silencing, and one of the first epigenetic marks to be removed upon activation. One of the real challenges of future work is to discover the temporal relationship between epigenetic modifications, describing the establishment of chromatin states in 'real time'. It is important to bear in mind that the difficulties of such analyses may be compounded by changes in epigenetic states throughout the cell cycle (Bailis and Forsburg, 2003;McNairn and Gilbert, 2003). A further aspect for research is examination of how these different modifications are mechanistically inter-related. In this respect, recent evidence suggests that the pathways mediating histone methylation and DNA methylation might be linked. Firm evidence for this has been established in organisms such as Neurospora and Arabidopsis (Soppe et al., 2002;Tamaru et al., 2003) and there are exciting indications that similar mechanisms allow cross-talk in mammalian cells (Fuks et al., 2003;Lehnertz et al., 2003).

Going global
Linear DNA sequence, arranged within chromosomes, has a three-dimensional structure within the nucleus, and there is emerging evidence that the genome is functionally compartmentalized (reviewed by van Driel et al., 2003). Although much attention has focused on chromatin changes at specific loci during differentiation, it is also important to consider changes that might take place on a genome-wide scale as cells reorganize their gene expression profiles. Changes in nuclear organization might reflect differentiation and indeed could contribute to cell memory by 'indexing' the genome. One highly visible example of nuclear reorganization is the clustering of constitutive heterochromatin into distinct foci within the nucleus of mammalian cells. These clusters are not static but can change during differentiation. Examples of this include the reorganization and reduction in number of centromeric clusters in differentiating mouse ES cells and C2C12 myoblasts upon differentiation into myotubes (R. Williams, R. Terranova and A.F., unpublished), as well as during retinoic-acid-induced differentiation of promyelocytic leukaemia cells (Beil et al., 2002). A reduced number of centromeric clusters is also seen in Sertoli cells upon activation of ribosomal (r)RNA synthesis, presumably reflecting the close proximity of the rRNA genes to centromeres (Haaf et al., 1990). A particularly striking example of centromeric reorganization can be seen following fertilization in the mouse, in which centromeres reorganize from large spherical structures into typical clusters from the one-cell to two-cell stage (Arney et al., 2002). The transient clustering of centromeres could reflect key differentiation events occurring within specific time windows (Martou and De Boni, 2000).
Analysis using chromosome-specific centromeric probes indicates that there are non-random centromere associations within these heterochromatic clusters and that these combinations might vary according to cell type (Alcobia et al., 2000;Alcobia et al., 2003).
A striking example of global changes in epigenetic modifications during differentiation can be seen during the preimplantation stages of mouse embryonic development. Although dynamic changes in both histone modification and DNA methylation occur in the newly fertilized zygote (Arney et al., 2002;Santos et al., 2002), it is difficult to correlate these with differentiation at this stage. The first true differentiation step in early development is the distinction between the inner cell mass, which goes on to form the embryo, and trophectodermal cells, which form extraembryonic tissues. During the cleavage stages of pre-implantation development there is a global reduction in DNA methylation. However, at the blastocyst stage, an increase in both DNA methylation and histone methylation can clearly be seen in the cells of the inner cell mass compared with the surrounding trophectodermal cells, indicating a global epigenetic difference between these two cell types (Reik et al., 2003;Santos et al., 2002). Whether this difference merely reflects an underlying differentiation process or is in fact a key driving force remains to be determined.
Global organization can also be seen at the level of protein localization. For example, the co-repressor protein TIF1β is diffusely distributed throughout the nucleus in undifferentiated murine embryonal carcinoma cells. Following retinoic acid treatment, the cells differentiate towards a primitive endoderm fate, and the protein relocates to centromeric heterochromatic foci (Cammas et al., 2002). This relocation depends upon interaction of TIF1β with HP1. Conversely, activator proteins might also relocalize to establish active transcriptional domains. In undifferentiated murine erythroleukaemia (MEL) cells, the small subunit of the transactivator NF-E2 localizes to centromeric heterochromatic foci. Upon differentiation of MEL cells towards an erythroid fate, this subunit relocates to the euchromatic compartment, and this correlates with the relocation of the β-globin gene from a pericentromeric location to a euchromatic one (Francastel et al., 2001). Again, this suggests that global genomic reorganization upon differentiation indexes genes by placing them in active or inactive domains and thus takes us from a two-dimensional model of gene regulation to three dimensions.
Another aspect of nuclear organization is the timing of replication of different genomic regions during S phase. Constitutive heterochromatin replicates late in S phase, whereas transcriptionally active chromatin replicates earlier (reviewed by Gilbert, 2002). Analysis using fluorescence in situ hybridization (FISH) techniques suggests that genes subject to heterochromatin-mediated repression replicate late in S phase. Moreover, in yeast (Raghuraman et al., 2001) and Drosophila (Schubeler et al., 2002), microarray analyses reveal a broad global correlation between early replication timing and transcriptional activity. However, analysis of the replication timing of specific genes in mammalian cells, using a sensitive BrdU labelling technique, has shown that many leukocyte-specific genes replicate early in S phase in lymphocytes, where they are expressed, and fibroblasts, where they are silent (Azuara et al., 2003).

Implications of epigenetic signatures for development and differentiation
We are increasingly familiar with the concept that transcriptional profiles, determined by microarray or proteomic techniques, can be used to define the collective gene expression output of a cell. Genome-wide chromatin profiling to define a cellular 'epigenetic signature' may, in the longer term, provide an alternative way of defining the differentiation potential of a cell. The idea of an epigenetic signature can be applied at a local as well as a global level to describe chromatin modifications at specific loci, groups of genes or the genome as a whole. Differentiation requires changes in transcriptional output, usually in response to external signals. Yet, in many situations, similar or identical signalling pathways are used to achieve very different outcomes. One example is Wnt signalling, which is utilized in many stages of embryonic development as well as in various circumstances in adult tissues. There are several divergent downstream signalling cascades that can be triggered by Wnt receptor activation (Veeman et al., 2003) but it is the canonical β-catenin pathway that is most closely linked to transcriptional regulation of a large number of target genes (Huelsken and Behrens, 2002;Miller et al., 1999). Although there are many different Wnt molecules, the downstream β-catenin pathway is virtually identical in all cells. So how are many different cell fates and gene expression patterns specified by such similar mechanisms?
Two different models can account for this. In the first case (Fig. 1, model A) transcription is controlled by regulating the availability of various factors that specifically bind to cis-regulatory elements or promoters and stimulate gene expression. These factors include co-activators, such as components of signalling pathways, which are made available in response to extracellular signals for differentiation. In the second model (Fig. 1, model B), transcription is controlled by regulating the accessibility of promoters, and this is governed by local chromatin structure rather than by the presence of dedicated transcriptional activators. These two models are not mutually exclusive, and aspects of both are probably required if we are to explain the generation of tissue-specific gene expression patterns. In different cell types where similar transcription co-factors are expressed but distinct transcriptional outcomes are required, the silencing of particular genes by chromatin-based mechanisms may be of paramount importance. Heritable epigenetic marks are probably crucial for fixing patterns of gene expression, acting as a cellular memory, as well as setting the stage for future differentiation steps.
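The logic of the two models can be made concrete with a toy Boolean sketch in Python. Here a gene is expressed only if all of the factors it requires are available (model A) and its promoter is accessible (model B). The gene names, the factor sets and the two cell types are hypothetical placeholders, and real regulation is of course quantitative rather than all-or-nothing.

```python
# Toy Boolean illustration of models A and B; all names are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    required_factors: frozenset  # factors that must be available (model A)
    promoter_accessible: bool    # set by local chromatin structure (model B)


def is_expressed(gene: Gene, available_factors: set) -> bool:
    """A gene is expressed only if both conditions are met."""
    factors_available = gene.required_factors <= available_factors  # model A
    return factors_available and gene.promoter_accessible           # model B


def expression_profile(genes: list, available_factors: set) -> set:
    """Names of all genes expressed for a given set of available factors."""
    return {g.name for g in genes if is_expressed(g, available_factors)}


# The same signal (here, beta-catenin) reaches two hypothetical cell types
# that differ only in which promoters are accessible.
signal = {"beta-catenin"}
cell_type_1 = [
    Gene("target_1", frozenset({"beta-catenin"}), promoter_accessible=True),
    Gene("target_2", frozenset({"beta-catenin"}), promoter_accessible=False),
]
cell_type_2 = [
    Gene("target_1", frozenset({"beta-catenin"}), promoter_accessible=False),
    Gene("target_2", frozenset({"beta-catenin"}), promoter_accessible=True),
]

if __name__ == "__main__":
    print(expression_profile(cell_type_1, signal))
    print(expression_profile(cell_type_2, signal))
```

In this sketch an identical signalling input produces different outputs in the two cell types solely because chromatin accessibility differs, which is the situation the Wnt example above is meant to capture.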

Conclusion
Although we can make broad generalizations about changes in chromatin status during differentiation at the level of individual loci, it is much harder to extrapolate these to the level of the whole nucleus or to other genes in different cell types. It is likely that there is a much wider range of chromatin 'flavours' than the trinity of constitutive heterochromatin, facultative heterochromatin and euchromatin. Only when we can define both the local and global nuclear changes that take place will we fully understand how cell fate is determined, remembered and inherited. This information will also show us how differentiation might be reversed. Reprogramming of cell fate can be achieved by cloning (Mullins et al., 2003), cell fusion (e.g. Pells et al., 2002) and even by incubation in protein extracts (Landsverk et al., 2002;Sullivan et al., 2003). However, we only know how the epigenetic effects of these processes manifest themselves at a few genetic loci. From DNA sequence to local chromatin organization to nuclear architecture, the future requires that we integrate these levels of analysis within the temporal context of development. We must take chromatin from the two-dimensional linear world into four dimensions to understand truly how differentiation takes place.