Chromatin acts as an organizer and indexer of genomic DNA and is a highly dynamic and regulated structure with properties directly related to its constituent parts. Histone variants are abundant components of chromatin that replace canonical histones in a subset of nucleosomes, thereby altering nucleosomal characteristics. The present review focuses on the H2A variant histones, summarizing current knowledge of how H2A variants can introduce chemical and functional heterogeneity into chromatin, the positions that nucleosomes containing H2A variants occupy in eukaryotic genomes, and the regulation of these localization patterns.
- histone variant
Canonical and variant histones: what's the difference?
Packaging the eukaryotic genome relies on the organization of DNA into nucleosomes, an octameric core of histone proteins around which the genetic material is wound. Nucleosomes are the simplest subunits of chromosomes, and the properties of chromatin fibres are influenced by the chemical identity of the proteins within each nucleosome particle, which usually includes two molecules of each of the canonical core histones: H2A, H2B, H3 and H4. These highly conserved proteins are essential in all eukaryotes and are encoded by multiple genes, often physically located in clusters. Their expression is timed to coincide with the major task of packaging the newly synthesized genome during S-phase (see  for a review of canonical histone gene organization and regulation). In addition to these canonical histones, separately encoded variant histones are present in eukaryotic cells. Their cognate genes are evolutionarily distinct from the canonical histone genes and they are physically located outside the replication-dependent histone clusters. Histone variant expression is not restricted to S-phase and protein levels are much lower than the canonical histones (~5–10% ). Like the canonical histones, variants are generally highly conserved between species, although some variants have a restricted species distribution (see the section below). Importantly, canonical and variant histone proteins differ in their amino acid sequences (Figure 1A). This sequence divergence can be significant, with over half of all residues differing between H2A and its variant H2A.B (previously called H2A.Bbd), and is particularly notable in the light of the overall high conservation of histone proteins. Variant histones are generally denoted by a prefix or suffix to the histone name, e.g. mH2A (macroH2A) and H2A.B (see  for recommendations on variant histone nomenclature).
Some sequence differences also exist between proteins encoded by different copies of the canonical histone genes. However, these are generally conservative changes restricted to a small number of residues. For example, approximately half of the canonical H2A proteins in humans have a threonine residue at position 16, whereas the other half have a serine residue, and this is one of between three and nine residues that differ among the canonical H2As. It is not known whether these small sequence differences functionally distinguish classes of canonical H2As, but they are not regarded as histone variants because their cognate genes are so closely related.
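Residue counts of this kind come from comparing aligned protein sequences position by position. A minimal Python sketch of that comparison follows. The function name and the two short sequences are hypothetical placeholders invented for illustration, not real histone sequences, and a real comparison would start from a proper alignment of the full-length proteins.

```python
def count_differences(seq_a: str, seq_b: str, gap: str = "-") -> tuple[int, int]:
    """Return (differing positions, compared positions) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # a column that is a gap in both sequences is not compared
        compared += 1
        if a != b:
            differing += 1  # substitutions and single-sided gaps both count
    return differing, compared


# Hypothetical aligned fragments (assumption: made up for this example).
toy_canonical = "SGRGKQGGKTRAKAK"
toy_isoform = "SGRGKQGGKARAKSK"
diff, total = count_differences(toy_canonical, toy_isoform)
print(f"{diff} of {total} aligned positions differ")
```

Counting a gap opposite a residue as a difference is a design choice. It suits comparisons in which one protein has a shorter tail, but it would need revisiting if terminal extensions were to be excluded.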
Variants of H2A: types and species distribution
Of the four core histones, variants of H3 and H2A are the most common, with the H2A family containing the highest number of variant forms. All eukaryotic genomes contain genes encoding both canonical and variant H2As, with the number of variants generally increasing with evolutionary complexity (Figure 1B). Budding and fission yeasts each have one separately encoded H2A variant, H2A.Z, and H2A.X function is provided by the canonical H2A in these species. Mammalian genomes have a separate gene encoding H2A.X, as well as multiple genes encoding H2A.Z, mH2A and H2A.B. mH2A and H2A.B, although not ubiquitous in eukaryotes, are found in vertebrates and may have been present in the ancestor of all animals . Some exceptions to this general distribution are known, with H2A.X absent from the nematode lineage and present as a fusion with the canonical H2A in yeasts or with H2A.Z in Drosophila.
In addition to these four near-ubiquitous H2A variants, a handful of lineage-specific H2A proteins have been identified. Examples include Tetrahymena thermophila H2A.Y, which has a long non-histone N-terminus that regulates H3 S10ph during mitosis , and plant H2A.W proteins containing C-terminal SPKK motifs . Although these variants are likely to represent interesting evolutionary solutions to particular environmental niches, they have not been studied in detail and only the more generally distributed and conserved H2A.X, H2A.Z, mH2A and H2A.B will be discussed in the present review.
Variant-specific sequences affect nucleosome properties
The overall histone fold architecture is preserved in all variants (Figure 1), allowing them to form nucleosomes that differ surprisingly little in structure from canonical nucleosomes, despite significant sequence diversity [6,7]. Notwithstanding this overall structural conservation, variant-containing nucleosomes display differences in their stabilities and biochemical properties, and arrays composed of variant nucleosomes have different tendencies to fold into compact structures (reviewed in [8,9]). Certain regions within H2A proteins appear to be particularly important for nucleosome properties.
One key region that lies on the surface of the nucleosome core particle is an acidic patch made up of a cluster of negatively charged residues, the majority of which are contributed by H2A . This acidic patch is important for folding of nucleosome arrays and some H2A variants alter the size of the negatively charged region because of sequence differences (Figure 1A). H2A.Z-containing nucleosomes have an extended acidic patch that causes nucleosome arrays to be more compact, whereas H2A.B has a truncated acidic patch and a consequent decrease in nucleosome array folding [11,12]. Sequence changes between variants can therefore have a direct effect on chromatin conformation. The acidic patch has also emerged as an important docking site for a number of proteins, including HP1 (heterochromatin protein 1) and IL (interleukin)-33 (reviewed in ). The binding of some of these proteins is dependent on the identity of the H2A protein; for example IL-33 binds nucleosomes containing H2A or H2A.Z, but not H2A.B, because of differences in the acidic patch . Varying the H2A protein is therefore a way to regulate the binding of proteins to the nucleosome surface and presumably also to other divergent regions.
Outside of the histone fold, sequence differences between H2A and its variants are prevalent in the N- and C-terminal tails. The C-terminal tail in particular appears to be an important determinant of H2A variant identity, and both the length and the sequence are highly divergent . mH2A is the most extreme example due to the presence of a basic region and a macro domain that together are approximately twice the size of the histone portion of the protein (Figure 1B). This C-terminal extension reduces access to linker DNA  and can influence the binding of remodelling enzymes . H2A.B conversely has a shorter C-terminus than canonical H2A (Figure 1), which means that H2A.B-containing nucleosomes wrap only 118 bp of DNA instead of 146 bp and are less stable . Both H2A.X and H2A.Z have similar length C-terminal tails to canonical H2A, but these are divergent at the sequence level (Figure 1A). The C-terminal tail of H2A.X includes a number of phospho-acceptor sites that are integral to the function of H2A.X in the DNA-damage response (reviewed in ), whereas the C-terminus of H2A.Z is important for its function in budding yeast, flies and human cells [20–22]. Genetic experiments in fission yeast have also shown that the H2A.Z N-terminal tail is important as its removal affects H2A.Z function, most probably due to the removal of four acetylation sites . In budding yeast, part of the N-terminus is also required for an interesting extra-chromatin role of H2A.Z in correctly targeting the Mps3 (monopolar spindle 3) protein to the inner nuclear membrane . The roles of the N-termini of other H2A variants have not yet been examined, but their sequence divergence indicates that they may be important for variant-specific functions.
Varying the H2A variants
Sequence diversification between H2A variants means that changing the H2A protein in the nucleosome alters the chemical environment, thereby affecting nucleosome characteristics and influencing the binding of other chromosomal proteins. Other mechanisms that can produce altered H2A variant proteins therefore have the potential to further diversify nucleosome function.
In most vertebrates, H2A variants are encoded by multiple genes, which often encode different proteins, e.g. mH2A.1 and mH2A.2 (Table 1). The differences between these proteins are generally fewer than the differences between them and H2A. However, genetic evidence indicates that some of these may play unique roles, as a knockout of the H2A.Z.1 gene (h2afz) is lethal in mice even though the H2A.Z.2 gene (h2afv) is present . Protein sub-functionalization may have occurred after gene duplication, giving the closely related proteins distinct roles. Alternatively, different expression patterns may underlie the unique functions for variant gene isoforms and there is evidence for tissue-specific expression for multiple variant histone isoforms (Table 1). Either way, the presence of multiple genes is one mechanism that facilitates further diversification of H2A variants.
In addition to the presence of multiple genes, downstream processing steps can result in different mature variant proteins. Unlike the canonical histone genes, some variant genes contain introns in higher organisms and alternative splicing has been shown to produce two forms each of mH2A.1 and H2A.Z.2 (Table 1). The H2A.Z.2 gene can produce a minor splice isoform, H2A.Z.2.2, that encodes a protein with an alternative shorter C-terminus. This splice isoform is present predominantly in the brain, and the protein forms nucleosomes that are less stable than those containing the full-length H2A.Z.2.1 [21,22]. mH2A.1.1 and mH2A.1.2 proteins differ in a short stretch of amino acids between the histone and macro domains due to alternative splicing. Only mH2A.1.1 can associate with metabolites of NAD, and this activity is responsible for the recruitment of mH2A.1.1 to sites of DNA damage and can regulate the conformation of the macro domain [26,27]. mH2A.1.1 expression is more restricted than that of mH2A.1.2, being present predominantly in adult differentiated cells [28,29]. However, alterations to this expression pattern have been observed in human cancers, where changes to alternative splicing of the gene encoding mH2A.1 lead to a reduced level of mH2A.1.1 . The ADP-ribose-binding function of mH2A.1.1 is required for suppression of lung cancer cell proliferation, indicating that it is loss of this function that contributes to carcinogenesis when mH2A.1.1 levels are reduced due to alterations in splicing patterns . Alternative splicing therefore represents an important way to alter H2A variant functions.
In organisms where there is only a single gene encoding an H2A variant, a number of chemically distinct proteins can still be produced. Again, alternative splicing may be used, as has been found for the H2A.Z transcript in the tunicate Oikopleura dioica, resulting in proteins containing variable numbers of N-terminal lysine residues, thereby potentially regulating the extent of acetylation in this organism . The H2A.Z N-terminus is also subject to post-synthetic processing, which produces two forms of the protein in Schizosaccharomyces pombe, one of which lacks the first two methionine residues . Most other forms of post-translational regulation involve the addition of modifications, such as acetylation, methylation, phosphorylation and ubiquitination. In some cases, such as phosphorylation of residues within the C-terminus of H2A.X, these modifications are key to the function of the protein and act through recruitment of other proteins . Other modifications may alter the biophysical properties of the nucleosome, with H2A.Z acetylation affecting nucleosome stability . A more detailed discussion of H2A variant post-translational modifications is given in .
Additional H2A variant isoforms, produced through altered RNA or protein metabolism, are likely to exist, which may play ever more specialized roles in chromatin regulation. The general theme seems to be that evolution continues to act on H2A variants to produce further variant forms, and the examples so far implicate the non-histone portion of mH2A and the N- and C-terminal tails of H2A.Z as functionally important. The fact that the H2A family has diversified to include highly conserved variant proteins indicates that these variants play important roles in chromatin organization. Indeed, deletions of genes encoding H2A variants generally have severe consequences, with H2A.Z essential for viability in organisms from nematodes to mice [25,35–37] and single mH2A gene deletions causing severe developmental defects in zebrafish and metabolic defects in mice [38,39]. As the importance of H2A variants derives from the effects they have on nucleosomal properties, a key to understanding their roles is to map the nucleosomes that contain H2A variants across whole genomes. Genome-wide studies have been instrumental in mapping the distributions of histone modifications as a first step to understanding their functions [40,41] and recent work has begun to describe the distribution of H2A variants in a similar manner.
THE H2A LANDSCAPE OF THE GENOME
Theoretically, H2A variants could occupy nucleosomes randomly, with prevalence proportional to the amount of variant protein, or they could be enriched in certain genomic locations where they are functionally important. In fact, H2A variant nucleosomes occupy distinct locations in genomes relative to genes and other features. For most H2A variants, localization to specific genomic compartments was initially determined at low resolution by microscopy and has more recently been examined at high resolution by ChIP (chromatin immunoprecipitation). ChIP followed by identification of immunoprecipitated DNA fragments by microarray hybridization or high-throughput sequencing has revealed the genomic localizations of many H2A proteins. Although these studies are limited by the fact that they present a snap-shot of nucleosome occupancy in a cell or tissue at a particular time point, and nucleosomes can be highly dynamic, general localization patterns emerge, particularly when variants are examined in multiple species.
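One way to tell the random and the enriched scenarios apart is to compare the share of variant-containing nucleosomes that fall within a genomic feature with the share of all nucleosomes found in that feature. Under random occupancy the ratio should be close to 1. The Python sketch below assumes that simple nucleosome counts are available; the function name and all numbers are hypothetical, and it is not the analysis used in the studies discussed here.

```python
def fold_enrichment(variant_in_feature: int, variant_total: int,
                    nucleosomes_in_feature: int, nucleosomes_total: int) -> float:
    """Observed share of variant nucleosomes in a feature divided by the
    share expected if variant nucleosomes were placed at random."""
    if min(variant_total, nucleosomes_in_feature, nucleosomes_total) <= 0:
        raise ValueError("totals and feature size must be positive")
    observed = variant_in_feature / variant_total
    expected = nucleosomes_in_feature / nucleosomes_total
    return observed / expected


# Hypothetical counts: 30% of variant nucleosomes lie in a feature class
# that holds only 5% of all nucleosomes, giving roughly six-fold enrichment.
ratio = fold_enrichment(300, 1000, 5000, 100000)
print(f"fold enrichment over random expectation: {ratio:.1f}")
```

A ratio near 1 would be consistent with random placement, whereas values well above or below 1 point to enrichment or depletion. Real ChIP data would also need normalization against an input or control sample, which this sketch leaves out.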
H2A.Z localization has general and cell-type-specific features
H2A.Z localization has been examined in representative organisms from multiple eukaryotic kingdoms (fungi, animalia and plantae), revealing a conserved localization pattern at the 5′ ends of genes across vast distances of evolutionary time [42–53]. In higher resolution studies, this 5′-end-enrichment resolves to occupancy in individual nucleosomes surrounding the TSS (transcription start site) (Figure 2A) [49,50,54,55]. There are some species-specific differences in the relative abundance of H2A.Z in the +1 and −1 nucleosomes, with the −1 nucleosome enriched in Saccharomyces cerevisiae and Homo sapiens, but not in Drosophila melanogaster or S. pombe [47,56]. Different tissues within the same species may also turn out to have slightly different patterns; for example, H2A.Z is absent from the +1 nucleosome in mouse testis but not in other mouse cell types that have been examined [51,57,58] (Figures 2A and 2B). Some of these differences may reflect cell-cycle-related changes in abundance of H2A.Z at TSS-flanking nucleosomes, which have recently been observed in mouse trophoblast cells . Away from the TSS, H2A.Z levels in genes are lower in transcribed regions (Figure 2A). H2A.Z is also enriched at gene enhancers in human and mouse cells [50,52,58,60], and the high enrichment of H2A.Z at genes and gene regulatory elements is the most generally conserved feature of H2A.Z localization.
The relationship between H2A.Z occupancy and gene activity levels is complex. H2A.Z levels are generally highest at the most active genes in higher organisms, and a positive effect on gene expression has been inferred because depletion of H2A.Z reduces pol II (polymerase II) recruitment to chromatin . Exceptions to this general genome-wide trend exist for some genes and in some cell types. H2A.Z appears to repress transcription at p21 , U-PAR (urokinase-type plasminogen activator receptor)  and at ΔNp63α target genes  in human cell lines. H2A.Z is absent from stably repressed promoters in ES (embryonic stem) cells , but it is enriched at many inducible promoters while they are inactive and is required for activation of these genes during differentiation . This is similar to the effect of H2A.Z removal on inducible genes in budding yeast, e.g. GAL1, where H2A.Z is required for normal induction kinetics [64–66]. At the genome scale in yeast, H2A.Z-containing nucleosomes are not positively correlated with gene expression [32,42,43,45,48,54] and H2A.Z deletion results in both up- and down-regulation of genes, although some of these effects may be indirect . The different relationships between H2A.Z and gene activity may represent alternative modes of gene regulation in different genes, species or cell types, or may reflect subpopulations of H2A.Z that are differentially marked by post-translational modifications or associated with nucleosomes whose other components are different. For example, acetylated isoforms of H2A.Z are more enriched at active than inactive genes in yeast and vertebrate cells [46,68,69] and a doubly modified form of H2A.Z that is acetylated and ubiquitinated is found specifically at bivalent genes in mouse ES cells . These modified forms of H2A.Z may facilitate distinct functions of H2A.Z that result in different transcriptional outcomes.
Outside of pol II-transcribed genes, budding yeast H2A.Z is found flanking numerous other genomic features , although levels here tend to be lower than at genes (see Table 1 for a summary of H2A.Z-enriched genomic regions). Occupancy in some transposon classes has been observed in yeast, flies and plants, although in Arabidopsis thaliana only transposons with low levels of DNA methylation have H2A.Z enrichment . This anti-correlation between H2A.Z occupancy and DNA methylation has also been observed in Fugu  and mammals , although it is not a strong anti-correlation in mammalian cells . Overall, many different types of euchromatic features contain some level of H2A.Z enrichment, but the significance of these H2A.Z-containing nucleosomes has not yet been tested. H2A.Z has been localized to heterochromatic regions in some organisms and cell types. In both budding and fission yeasts H2A.Z is enriched near telomeres [32,42], but is absent from telomeric heterochromatin and in fact acts to prevent the spread of heterochromatic factors into neighbouring euchromatin in S. cerevisiae . Centromeric heterochromatin in S. pombe and A. thaliana also lacks H2A.Z [32,44,49], but pericentric heterochromatin in mouse cells from the early embryo and in mouse and human sperm contains H2A.Z [72–74]. Indeed, high enrichment of H2A.Z is restricted to pericentric heterochromatin in human sperm and the protein is unusually absent from euchromatin in this cell type . Centromeric and pericentromeric heterochromatin differ in their other chromatin components, e.g. CenH3 or H3K9me3 (histone H3 trimethylated on Lys9), and H2A.Z may preferentially associate with only some classes of heterochromatin. Other examples of H2A.Z occupancy in heterochromatin regions have been documented in flies  and in human osteosarcoma cells , but overall it is not completely clear which subtypes of heterochromatin contain H2A.Z.
One confounding factor is that many ChIP studies do not report a complete picture of heterochromatin localization, particularly in higher organisms where the highly repetitive nature of some heterochromatic regions poses problems because recovered DNA fragments cannot be attributed to a unique locus. Lack of H2A.Z causes defects in chromosome segregation in many organisms [76–78], which in mammalian cells is due to a defect in HP1α recruitment to pericentric heterochromatin , indicating that H2A.Z is a functionally important component of heterochromatin. Interestingly, levels of H2A.Z at centromeric heterochromatin increase specifically during mitosis, the time when the conformation of this region becomes critically important .
ChIP experiments incorporating multiple time points have revealed the dynamic nature of H2A.Z-containing nucleosomes. Thus far these have generally been carried out at the single gene level, e.g. FIG1, PHO5, p21 and U-PAR [40,54,61,62]. These studies show that H2A.Z is removed from nucleosomes during gene activation and re-incorporated during repression. H2A.Z occupancy is also regulated in response to DNA damage and the cell cycle [59,79,80]. The association of H2A.Z with nucleosomes is therefore dynamic, and at the whole-genome level H2A.Z occupancy correlates well with nucleosomes undergoing rapid turnover in yeast . Whether this is a consequence of H2A.Z acting as a replacement histone at any nucleosomes undergoing turnover, due to its cell-cycle-independent presence, or whether H2A.Z is required for nucleosome dynamics is still unresolved. However, it is clear that understanding the many roles of H2A.Z will require the examination of H2A.Z enrichment over multiple time points and in different cell types.
mH2A is abundant in heterochromatin and large euchromatic domains
mH2A has historically been linked to heterochromatin and was originally identified as an abundant component of the Xi (inactive X) chromosome in female mammalian cells by immunofluorescence and GFP (green fluorescent protein) tagging [82,83]. mH2A also associates with the XY bodies that form during male gametogenesis  and with SAHF (senescence-associated heterochromatic foci), facultative heterochromatic regions that form in senescent cells . ChIP studies using human and mouse cells and tissues have validated the presence of mH2A on the Xi, with fairly uniform chromosome-wide enrichment [86,87]. Despite this enrichment, mH2A appears to be dispensable for X inactivation, as female mouse ES cells that are depleted for both mH2A isoforms do not have apparent defects in X inactivation . Single gene deletion of mH2A.1 in mice similarly allows normal X inactivation . It may be that a small amount of mH2A persists after RNAi (RNA interference) depletion or deletion of one of the two genes, and that this is sufficient for X inactivation. Alternatively, mH2A may act redundantly with other chromatin marks to maintain the Xi in a silent state.
In addition to the well-known enrichment on the silent X chromosome, mH2A is also present in euchromatin where it is found in large (>500 kb) domains that are also enriched for other repressive chromatin marks such as H3K27me3 (histone H3 trimethylated on Lys27) . It is not known whether these large domains are a feature of higher-order chromosome organization, but their boundaries often coincide with promoter-proximal regions. Where mH2A is enriched on genes, it is present both upstream and downstream of the TSS, including within coding sequences (Figures 2A and 2B). In ES cells, mH2A is absent from the region immediately adjacent to the TSS [38,91], whereas in transformed cell lines mH2A is present at the TSS as well as upstream and downstream [90,92]. mH2A levels are much higher in differentiated cells than in pluripotent cells during mouse embryogenesis, and mH2A acts as a barrier to induced pluripotent cell reprogramming . It has been proposed that the different levels and distributions of mH2A in stem cells and differentiated cells are an important reflection of differences in their developmental plasticity .
mH2A is generally less enriched on active genes, but is present at low levels on some active genes, including genes involved in development and cell–cell signalling in human cells, and lipid metabolism genes in mouse liver [38,87,90]. Genes that have increased expression in mh2a.1−/− mice are enriched for mH2A.1 in mouse liver, indicating that mH2A.1 can act as a direct repressor . However, not all genes that have mH2A enrichment are derepressed when it is absent. Indeed, in MCF7 cells, a small number of genes are down-regulated upon knockdown of mH2A.1 [90,95]. mH2A also contributes to the fine-tuning of temporal activation of HOXA (homeobox A) cluster genes during neuronal differentiation . Although the effect of mH2A depletion on HOX gene expression is subtle, temporal regulation during development is critical and, consequently, mH2A-depleted zebrafish embryos display severe defects .
Most of the mH2A localization studies to date have been performed using cells or tissues where mH2A.1 is the predominant isoform, and some have used reagents specific for mH2A.1 so that the results may apply only to mH2A.1 [86,91]. mH2A.1 and mH2A.2 are only ~80% similar at the protein level and may therefore be targeted to distinct regions to facilitate different functions. One study has compared mH2A.1 and mH2A.2 at human gene promoters and found that they have distinct and overlapping gene targets but that most genes have both isoforms . However, by immunofluorescence, mH2A.1 and mH2A.2 patterns are distinguishable , so it will be interesting to test whether mH2A.1 and mH2A.2 are differentially targeted to other genomic sequences, such as different heterochromatic regions. It will also be of interest to discover whether the splice isoforms of mH2A.1 that differ in their ability to bind NAD metabolites occupy distinct genomic locations. It is known that differently post-translationally modified forms of mH2A are found at different locations, with mH2A.1 phosphorylated at Ser137 specifically absent from the Xi . Other modified forms of mH2A.1 have been identified, including methylated and ubiquitinated forms [98,99], which may also be restricted to particular genomic regions.
H2A.X localizes to heterochromatin and collapsed replication forks
The analysis of H2A.X localization at the genome scale has lagged behind that of the other H2A variants, possibly because most studies have focused on the localized phosphorylation of H2A.X that occurs at DNA lesions [100,101]. Another complicating factor is that H2A.X function is provided by another H2A protein in some key model organisms, including yeasts and Drosophila, making analysis less straightforward.
In yeasts, the SQE/DØ consensus motif that is considered characteristic of H2A.X proteins is carried on the canonical H2A proteins. The serine residue within this motif can be phosphorylated, and recent studies have mapped the phosphorylated form of H2A, which can be equated to H2A.X (H2A.XS129/28ph), across the budding and fission yeast genomes. These studies reveal that H2A.Xph is found in heterochromatin [102–104], although in fission yeast it is excluded from the centromeric core . The heterochromatin proteins Sir3 and Swi6 are required for H2A.Xph localization in heterochromatin in S. cerevisiae and S. pombe respectively, and cells lacking H2A.Xph do not have detectable defects in heterochromatin formation [102–104]. These data suggest that H2A.Xph is downstream of heterochromatin formation or redundant with other factors.
H2A.Xph is also present outside of heterochromatin in yeast, with enrichment at tRNA genes, replication origins, rDNA, LTRs (long terminal repeats) and some transposons and protein coding genes. Many of these are sites that are prone to replication fork stalling and approximately 30% of H2A.Xph peaks can be attributed to collapsed replication forks . At genes, H2A.Xph is negatively correlated with expression and its localization at repressed genes is dependent on their lack of expression . However, it is not known whether H2A.X phosphorylation regulates gene activity either positively or negatively.
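A statement that a given fraction of peaks can be attributed to a class of genomic sites usually rests on an interval-overlap count. The Python sketch below shows one simple way such a fraction could be computed. It is an assumption-laden illustration, not the method of the study discussed above: the function name, the half-open coordinate convention and the toy intervals are all hypothetical.

```python
Interval = tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def fraction_overlapping(peaks: list[Interval], features: list[Interval]) -> float:
    """Fraction of peaks that overlap at least one feature interval."""
    if not peaks:
        return 0.0
    hits = 0
    for chrom, start, end in peaks:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(c == chrom and s < end and start < e for c, s, e in features):
            hits += 1
    return hits / len(peaks)


# Hypothetical coordinates invented for this example.
toy_peaks = [("chrI", 100, 200), ("chrI", 500, 600), ("chrII", 100, 200)]
toy_sites = [("chrI", 150, 300), ("chrII", 400, 500)]
share = fraction_overlapping(toy_peaks, toy_sites)
print(f"{share:.0%} of peaks overlap a site")
```

The nested scan is quadratic and is meant only to make the logic explicit. Genome-scale peak sets would normally be handled with sorted intervals or a dedicated interval tool.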
In higher organisms, immunofluorescent staining has shown that H2A.XS139ph is enriched in the heterochromatic XY body in mouse sperm, where it is essential for chromosome condensation and silencing . Lack of H2A.X consequently leads to male sterility in the mouse . Only one genome-wide ChIP study examining H2A.X and H2A.XS139ph localizations in higher organisms has been carried out to date, and it detected a cell-type difference. In resting T-cells, no peaks of H2A.X were detected unless the cells were treated with ionizing radiation, whereas H2A.X enrichment was present in sub-telomeric regions and at active TSSs in untreated Jurkat cells . Sub-telomeric H2A.X is phosphorylated in Jurkat cells, similar to the pattern in yeasts. The levels of H2A.Xph increased in T-cells upon treatment with radiation and H2A.Xph became enriched in transcribed regions . Although further studies will add to our understanding of the profiles of bulk and H2A.Xph enrichment before and after DNA damage, it currently appears that H2A.X has a distinct localization pattern from the other H2A variants (Figure 2C). The general pattern of H2A.Xph is conserved between budding and fission yeasts, and the sub-telomeric localization has also been reported in one human cell line. Patterns of H2A.X are likely to be dynamic in response to DNA damage and possibly other signals, and the relationship between H2A.X, its phosphorylated isoforms and heterochromatin deserves further attention.
H2A.B is enriched on active genes
In contrast with other histones, H2A.B is rapidly evolving and appears to be restricted to mammals . Human and mouse genomes encode three and four H2A.B-like proteins respectively. Genome-wide localization studies have been carried out for a mouse H2A.B-like protein (H2A.Lap1) as well as human H2A.B. H2A.Lap1 has one extra acidic residue compared with the other H2A.B proteins, which allows nucleosome arrays containing H2A.Lap1 to partially fold in vitro, while H2A.B arrays remain uncompacted [12,57].
H2A.Lap1 is highly expressed in the testis, where it is present during meiosis and in post-meiotic round spermatids . ChIP-seq analysis shows that H2A.Lap1 is enriched at the TSSs of active genes in mouse round spermatids (Figure 2B). H2A.Lap1 occupancy in gene-coding regions is lower than at the TSS or distal promoter regions, and no relationship to other genomic features has been described. Although immunofluorescence shows that H2A.Lap1 is found in the heterochromatic X and Y chromosomes late in spermatid development, it is not present in constitutive heterochromatin and X chromosome enrichment is higher on active genes, indicating that H2A.Lap1 is an active mark. Additionally, the levels of H2A.Lap1 at select genes increase with levels of the cognate transcripts during a time course of testis development . The enrichment of H2A.Lap1 over the TSS is distinct from all other variants, and indeed other histone marks, as this region is usually depleted of nucleosomes.
ChIP-seq analysis of human H2A.B stably expressed in HeLa cells shows a different pattern from that of H2A.Lap1, with depletion from TSSs and enrichment in gene-coding regions (Figure 2A) . Highly expressed genes have more H2A.B, which is also enriched at intron–exon boundaries. H2A.B co-purifies with a number of splicing factors, and depletion of the protein leads to an increase in retention of intronic sequences . These data point to a role for H2A.B in gene splicing through localization at gene-coding sequences. Although H2A.B and H2A.Lap1 are highly related proteins, the mapping data for these two isoforms show that they have dramatically different localizations. This may be attributable to tissue-specific functions for H2A.Lap1, or alternatively the presence of an extra acidic residue in H2A.Lap1 compared with H2A.B may change the properties and distribution of the protein. These two studies indicate that the rapid evolution of H2A.B-like proteins or their cellular context can greatly influence their binding profiles.
Nucleosome occupancy by H2A variants can be homotypic or heterotypic
A further nuance of H2A variant localization is that both H2A proteins in the nucleosome can be of the same type (homotypic) or there may be two different types of H2A protein in a single nucleosome (heterotypic). Heterotypic H2A.Z–H2A nucleosomes exist in vivo [109–111], and homotypic H2A.Z nucleosomes are relatively more enriched at TSSs in S. cerevisiae and D. melanogaster [110,111]. In mouse cells, H2A.Z-containing nucleosomes at promoters switch between homo- and hetero-typic states at different cell-cycle phases. Homotypic nucleosomes have biochemical properties that are similar to active chromatin, raising the possibility that homo- and hetero-typic H2A.Z nucleosomes could have different properties in vivo. Little is known about whether the other H2A variants form heterotypic nucleosomes in vivo, although mH2A preferentially forms heterotypic nucleosomes with H2A, and H2A.B can also form heterotypic nucleosomes in vitro. Further studies of the H2A landscape of the genome, in which homo- and hetero-typic nucleosome combinations are examined, are likely to reveal additional patterns of H2A variant occupancy.
Collectively, studies of H2A variant occupancies reveal that different variants have distinct patterns of genome-wide localization (Figure 2 and Table 1). H2A.Z and mH2A are each found in both heterochromatin and euchromatin, but their localization patterns differ both around genes and in heterochromatic regions. Despite a generally mutually exclusive pattern, both H2A.Z and mH2A have been localized to HOX genes in mouse ES cells [38,51], indicating that they may co-occupy heterotypic nucleosomes at the HOX locus in vivo to establish a unique chromatin structure. However, further work is required to establish co-localization of H2A.Z and mH2A, as similar enrichment patterns could also represent enrichment in neighbouring nucleosomes or at the same positions in different cells in the population. mH2A and H2A.X generally associate with less-transcribed genomic regions, including heterochromatin and repressed genes, whereas H2A.Z and H2A.B are more enriched at active genes in mammalian cells, indicating that two of these H2A variants are predominantly associated with gene repression or silencing while the other two are active marks. Interestingly, H2A.B localization at genes is drastically different in two different cell types (Figures 2A and 2B), although the protein is associated with active genes in both cases; the differences may also be due to subtle differences in protein sequence between the two H2A.B isoforms studied to date. H2A.X is predominantly localized to heterochromatin in yeast and human cells, and both H2A.Z and mH2A also localize to heterochromatic regions in human cells. In yeast, where the mapping of heterochromatic regions is more straightforward, it can be seen that the two H2A variants present, H2A.X and H2A.Z, predominantly lack any overlap in their distribution (Figure 2C). It will be of great interest to map multiple variants in single cell types in higher organisms to reveal the full extent of their unique and overlapping distributions.
REGULATION OF H2A VARIANT PATTERNS
How are the patterns of H2A variant occupancy in different genomic loci established and maintained? For this to happen, H2A variants must be delivered to the nucleus and assembled into or removed from the appropriate nucleosomes at the correct time. To generate specific patterns, cells must be able to distinguish between different H2A variants, and indeed regions of the proteins that are important for targeting have been identified in some cases. Two classes of proteins, histone chaperones and ATP-dependent remodelling complexes, have been closely linked to H2A–H2B assembly and disassembly. Histone chaperones (reviewed in [113–115]) are involved in several aspects of histone metabolism and act to partner histones in various subcellular compartments. ATP-dependent remodelling complexes (reviewed in [116,117]) use energy from ATP hydrolysis to move, remove or assemble nucleosomes, and a subset of these enzymes is involved in regulating H2A variant nucleosomal occupancies.
Histone chaperones: essential partners but conferring little specificity
Histone chaperones are proteins that partner histones and they act, in general, by blocking the interaction sites of the histones in the nucleosome core particle (reviewed in ). Chaperones are involved in pre-deposition functions, such as nuclear import, and have post-deposition roles in eviction and recycling of histones. Both pre- and post-deposition roles could, hypothetically, be important in regulating the H2A variant composition of chromatin, as variants could be delivered to the nucleus in a regulated way by chaperones to enhance assembly, and chaperones could also determine whether variants are removed at a certain time point. However, it is difficult to imagine how chaperones could specify genomic localization unless their actions are restricted to one or a few variants.
Biochemical and structural studies show that H2A–H2B chaperones tend to make extensive contacts with both the H2A and H2B proteins in the dimer [119,120] and any specificity determinants that might be contributed by H2A variants have not been identified. The chaperone with the best-characterized specificity for a single H2A variant is the yeast protein Chz1, which preferentially binds H2A.Z over H2A and chaperones H2A.Z in the nucleus. Another chaperone with preferential activity is nucleolin, which can stimulate the remodelling of mH2A or H2A-containing nucleosomes, but not those containing H2A.B. Some chaperones may partner with a number of H2A variants but not all of them; for example, Nap1 can facilitate removal of dimers containing H2A and H2A.Z, but not mH2A, from nucleosomes. Exhaustive tests of chaperone specificities for different H2A proteins have not yet been performed and it is possible that other chaperones with preferences for single variants or specific subsets exist.
Even when variant-specific chaperones such as Chz1 are known to operate, it appears that they do not determine variant nucleosome patterns, as the enrichment of H2A.Z at several target genes is not altered in the absence of Chz1 and Nap1. This may be due to redundancy in the chaperoning system, with FACT substituting as an H2A.Z chaperone when Nap1 and Chz1 are absent. It therefore seems that although histone chaperones are essential for the normal metabolism of H2A variants, they are not the prime determinants of the localization of variant proteins. However, how chaperone proteins regulate the delivery of H2A variant dimers to chromatin remains largely to be determined.
ATP-dependent remodelling enzymes regulate H2A variant occupancy in nucleosomes
ATP-dependent chromatin remodellers can be divided into several categories on the basis of sequence homology and the structural organization of their catalytic subunit. A few different subtypes have activity on H2A–H2B dimers, and in particular members of the ‘split ATPase’ group of remodellers, Swr1 and Ino80, can remove H2A–H2B dimers from nucleosomes in vitro. Other classes of remodellers, such as SWI/SNF and RSC, also have some H2A–H2B exchange activity in vitro, although this involves moving nucleosomes off the end of a short piece of DNA, a mechanism that is unlikely to be used in vivo. The relationship between ATP-dependent remodellers and the regulation of each H2A variant's occupancy in nucleosomes is discussed below and summarized in Table 2.
The first example of an ATP-dependent remodelling enzyme showing variant dimer exchange activity was the Swr1 enzyme in budding yeast, which can swap nucleosomal H2A–H2B dimers for H2A.Z–H2B dimers in vitro . Swr1 is the catalytic subunit of a multiprotein complex (SWR-C) that is evolutionarily conserved and has been biochemically verified in S. pombe, A. thaliana, D. melanogaster and H. sapiens (reviewed in ). Importantly, the in vivo nucleosomal occupancy pattern of H2A.Z is dependent on the SWR-C, as ChIP experiments show reduced H2A.Z occupancy in the absence of SWR-C members [43,61,125,129]. In H. sapiens, there are two complexes that contain different Swr1-like subunits, SRCAP or p400 [130,131]. Intriguingly, a smaller SRCAP complex lacking the catalytic subunits p400 and SRCAP has been isolated from HeLa cells and found to have H2A.Z deposition activity in vitro , indicating that other ATPase proteins within the SWR-C may also have dimer-exchange capabilities.
The specific recognition of H2A.Z by the SWR-C in yeast requires a region near the C-terminus of H2A.Z that is well conserved in H2A.Z orthologues but different in other H2A proteins. This M6 region (Figure 1A) is required for the association of H2A.Z with the SWR-C and most probably underlies the specificity of the SWR-C for H2A.Z over other variant types [133,134]. Interestingly, the M6 region includes residues contributing to the extended acidic patch of H2A.Z that is important for protein interactions and chromatin compaction, which may mean that deposition and altered function have been co-selected within this portion of the H2A.Z sequence.
For the SWR-C to generate a specific pattern of H2A.Z-containing nucleosomes, the complex must be recruited to those nucleosomes, which in turn need to be good substrates for the deposition reaction. Genome-wide ChIP experiments show that subunits of SWR-C are enriched at the +1 and −1 nucleosomes in yeast, and in human cells SRCAP is found at promoters, meaning that the SWR-C stably binds at least a subset of sites where H2A.Z is highly enriched. Factors that affect the targeting of SWR-C have generally not been directly tested by examining SWR-C binding, but rather by measuring H2A.Z enrichment. Although this is a good readout of SWR-C recruitment, it is also a product of SWR-C activity as well as any disassembly that may occur at that nucleosome. Such experiments indicate that acetylation sites in H3 and H4, various HATs, and the DNA sequence of the adjacent NDR (nucleosome-depleted region) can all influence H2A.Z occupancy in yeast [48,54,137]. DNA sequence determinants include binding sites for specific transcription factors, such as Reb1, but also sequences that aid in the formation of NDRs. The NDR has been proposed to regulate the activity of the SWR-C, and NDR formation depends on the activity of the RSC remodelling complex; thus RSC may indirectly regulate H2A.Z occupancy. In mammalian cells, H2A.Z deposition is dependent on the activation of some transcription factors including p53, c-Myc, ER (oestrogen receptor) α and ΔNp63α, although only ΔNp63α has been shown to interact with members of the mammalian SWR-C.
While targeting the recruitment of the SWR-C is likely to be the major regulatory step in determining which nucleosomes are destined to contain H2A.Z, the components of the substrate nucleosome can also affect the reaction. The ATPase activity of Swr1 is stimulated by nucleosomes containing H2A, and acetylation of nucleosomal histones H4 and H2A stimulates the incorporation of H2A.Z by the SWR-C in vitro. This implies that the effect of acetyltransferase mutants on H2A.Z occupancy may be through the regulation of SWR-C activity rather than recruitment of the complex, and that both enzyme recruitment and regulation are key to establishing a pattern of H2A.Z in the genome.
Although assembly of H2A.Z is a critical first step in establishing nucleosome-specific occupancies, removal of H2A.Z from nucleosomes will also contribute to the steady-state pattern. This is analogous to the regulation of histone modifications (e.g. acetylation) by enzymes that add (acetyltransferases) and remove (deacetylases) the modifications. Some H2A.Z removal may be passive, due to the general disruption of nucleosomes that occurs during gene transcription and other DNA operations. However, another remodelling complex with a split-ATPase family catalytic subunit, the INO80 complex (INO80-C), has been shown to specifically remove H2A.Z–H2B dimers from nucleosomes, replacing them with H2A–H2B dimers to reconstitute canonical nucleosomes. In vivo, G1-arrested yeast cells have an altered pattern of H2A.Z when Ino80 is absent, with less enrichment at promoter nucleosomes and a corresponding increase in H2A.Z occupancy in gene-coding regions. ChIP-seq experiments confirm that Ino80 binds throughout gene-coding regions, where it could act to remove H2A.Z from nucleosomes. However, Ino80 is also localized to promoter nucleosomes, indicating that a competition with SWR-C may occur to establish steady-state H2A.Z occupancy levels. INO80 complexes are conserved in mammalian cells and it will be interesting to learn whether mammalian INO80-C also ‘prunes’ H2A.Z from gene-coding regions.
A third ATP-dependent remodelling enzyme, Fun30, which is related to Swr1 and Ino80, has been linked to the regulation of H2A.Z nucleosomal occupancies. Budding yeast cells lacking Fun30 have a disrupted H2A.Z pattern similar to that seen in the absence of Ino80, with reduced occupancy in nucleosomes near TSSs and increased localization in coding sequences. In fission yeast, deletion of the Fun30 orthologue Fft3 results in an increased occupancy of H2A.Z at centromeres and sub-telomeric regions. Fun30 may therefore also act to remove H2A.Z, although in vitro experiments are needed to demonstrate activity on H2A.Z-containing nucleosomes.
Several factors have been shown to be important for mH2A localization to heterochromatic regions. However, unlike the case for SWR-C and H2A.Z, it is not clear whether these requirements are direct, through an assembly activity, or if the factors act further upstream of mH2A incorporation. For example, the localization of mH2A to SAHF depends on the chaperones ASF1a and HIRA, but ASF1 and HIRA are primarily H3–H4 chaperones and have not been shown to bind mH2A, so this is likely to be an indirect effect. Similarly, mH2A enrichment on the Xi requires the Xist RNA, but this is likely to be an upstream requirement rather than an assembly mechanism. Recently, a number of proteins that co-purify with soluble mH2A.1, and which could therefore be good candidates as assembly factors, were identified. These included the ATPase protein ATRX, which is in the Rad54-like family of remodellers and not closely related to Swr1/Ino80/Fun30. Although no direct biochemical assembly or disassembly activity was demonstrated, depletion of ATRX from cells resulted in an increased association of mH2A with chromatin, indicating that ATRX is either required for the removal of mH2A from chromatin or acts to inhibit its deposition.
Features of the mH2A protein that are important for its normal localization in chromatin have been studied by examining enrichment on the Xi. Both the non-histone C-terminal tail and the core histone domain are required for mH2A.1 targeting to the Xi. Within the histone domain, several sequences including a region corresponding to the H2A.Z M6 region are required for mH2A.1 enrichment on the Xi. Monoubiquitination of mH2A may also be important for its association with, or retention on, the Xi, as depletion of the CULLIN3/SPOP ubiquitin E3 ligase that monoubiquitinates mH2A leads to mH2A dissociation from the Xi. It will be interesting to see whether the interaction of ATRX and mH2A can be localized to an mH2A-specific sequence and whether more activities that regulate mH2A localization will be identified.
The mechanism of H2A.X assembly into nucleosomes has not been studied in detail. In yeast, as H2A.X is also the canonical H2A, deposition is replication-dependent; it is not known what regulates H2A.X assembly in higher organisms. Although the mechanism of H2A.X incorporation is not known, the last 23 residues of the protein are important for chromatin incorporation during early embryogenesis. In fact, these residues can direct H2A.Z to be deposited during this stage when it is normally absent, which indicates that the C-terminus of H2A.X may contain a signal that directs its assembly into nucleosomes. A recent proteomics screen carried out in mammalian cells identified a number of proteins, including members of the Mi-2/NurD remodelling complex, as H2A.X-specific interactors, making them candidates for proteins that directly regulate H2A.X nucleosome occupancy.
Removal of H2A.Xph from nucleosomes after DNA damage in yeast has been linked to the same enzymes that regulate H2A.Z occupancy, Ino80 and Swr1. In Drosophila, phosphorylated H2A.X (H2Av) is removed by a SWR-related complex, dTip60. As H2Av contains features of both H2A.Z and H2A.X, the possibility that removal of H2Av is H2A.Z-specific rather than H2A.X-specific cannot be excluded. However, the fact that removal depends on a phospho-mimetic mutation at the H2A.X-specific phospho-acceptor residue Ser137 indicates a dependence on H2A.X function. Therefore the SWR and INO80 complexes can remove H2A.X from chromatin, but pathways leading to H2A.X deposition are as yet undiscovered.
Remodelling enzymes have not yet been linked to the assembly or disassembly of H2A.B, which occurs via unknown mechanisms, although NAP-1 can promote the assembly and disassembly of H2A.B in vitro. As this variant is highly expressed in the testis, there may be tissue-specific enzymes that direct its deposition and removal, and further experiments will be required to identify these.
Regulation of H2A variant deposition and removal is the key to establishing and maintaining distinct patterns of H2A variant localization in the genome. This is likely to be particularly important during development, for example during early embryogenesis, where H2A variant dynamics seem to be particularly pronounced ([150,154]; for a review see ). Alterations in H2A variant deposition and removal may also underlie the aberrant H2A variant protein levels that have been documented in human cancers [156–159]. It is therefore critical to understand how H2A proteins are interchanged at different loci and at particular time points. Although a clear link between ATP-dependent remodelling enzymes and H2A variant exchange has been established, there are many outstanding questions about how H2A variant patterns are established and maintained.
The organization and regulation of eukaryotic genomes by chromatin relies on molecular heterogeneity at the nucleosomal level and the inclusion of H2A variants is one major mechanism to functionally differentiate nucleosomes. H2A and its variants occupy nucleosomes in different parts of the genome where their sequence differences result in altered nucleosome properties, novel post-translational modifications and different protein interactors. Much progress has been made in understanding the genomic distributions of H2A variants and future work will shed light on cell-type-specific differences in variant distributions, the positions of different heterotypic nucleosomes and the regulation of these occupancies.
Work in the author's laboratory is supported by the Wellcome Trust.
I thank Yanin Naiyachit and Tom Wood for comments on the paper before submission.
Abbreviations: ChIP, chromatin immunoprecipitation; ES, embryonic stem; HOX, homeobox; HP1, heterochromatin protein 1; IL, interleukin; mH2A, macroH2A; NDR, nucleosome-depleted region; pol II, polymerase II; SAHF, senescence-associated heterochromatic foci; TSS, transcription start site; U-PAR, urokinase-type plasminogen activator receptor; Xi, inactive X
- © The Authors Journal compilation © 2013 Biochemical Society