CENP-A is a heteromorphous variant of the histone H3 family. The protein was discovered as one of the epitopes recognized by anti-centromere antibodies found in patients suffering from a form of scleroderma known as the “CREST syndrome” (Calcinosis, Raynaud’s syndrome, Esophageal dysmotility, Sclerodactyly and Telangiectasia), and was named CENtromeric Protein A because of its exclusive centromeric localization (Earnshaw & Rothfield, 1985). Later studies demonstrated that CENP-A is a bona fide histone variant, able to substitute for the canonical histone H3 within a nucleosome (Palmer et al., 1987, 1991).
Homologs of human CENP-A have been identified in all eukaryotes studied, all of them exhibiting the distinctive centromeric localization. Examples include Cse4p in S. cerevisiae, Cnp1 in S. pombe, CID (Centromere IDentifier) in D. melanogaster and HTR12 in A. thaliana, among others.
CENP-A shares the overall structure of core histones. It comprises a structured COOH-terminal domain, which contains the typical and highly conserved histone-fold domain, and an NH₂-terminal domain, the “tail”, which is unstructured and protrudes from the nucleosome core particle. These two domains differ in their degree of conservation: the COOH-terminal domain is well conserved throughout the eukaryotes, while the NH₂-terminal domain is very divergent (Fig. 1).
The COOH-terminal domain was shown to be required for the centromeric localization of the protein (Sullivan et al., 1994); ten years later, a subset of that domain, termed the CENP-A Targeting Domain (CATD) and comprising the L1 loop and the α2 helix, was shown to be necessary and sufficient to direct the protein to the centromeres (Black et al., 2004).
The composition of the CENP-A nucleosome core particle is still debated. The obvious hypothesis is an octameric core similar to that of a conventional nucleosome, with the two H3 histones replaced by two CENP-A proteins. Other, non-octameric models have also been proposed (reviewed by Black & Cleveland, 2011).
Black and Cleveland have proposed that the non-octameric forms are intermediate steps of CENP-A nucleosome assembly, leading ultimately to the mature octameric form.
The structure of an octameric nucleosome containing CENP-A has recently been solved (Tachiwana et al., 2011). It revealed that such a nucleosome is structurally very similar to an H3 nucleosome, contradicting previous reports that suggested a very distinct structure (Sekulic et al., 2010). Interestingly, only 121 bp of DNA are visible in that structure, instead of about 147 bp in a conventional nucleosome, meaning that 13 bp at each end of the nucleosomal DNA are not tightly bound to the histone core (Fig. 2).
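The DNA-length figures quoted above can be checked with simple arithmetic. The following is a minimal sketch and not part of the cited study: the function name and its parameters are illustrative, and it assumes that the unresolved DNA is split evenly between the two ends of the nucleosomal DNA.

```python
def unbound_bp_per_end(canonical_bp: int, visible_bp: int) -> float:
    """Return the number of base pairs left unresolved at each DNA end.

    Assumption of this sketch: the missing DNA is distributed
    symmetrically between the two ends.
    """
    if visible_bp > canonical_bp:
        raise ValueError("visible_bp cannot exceed canonical_bp")
    return (canonical_bp - visible_bp) / 2


# Approximate values quoted in the text: 147 bp for a conventional
# nucleosome, 121 bp visible in the CENP-A nucleosome structure.
print(unbound_bp_per_end(147, 121))  # 13.0
```

With the values from the text, 147 bp minus 121 bp leaves 26 bp unaccounted for, that is, 13 bp at each end.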
Centromeric DNA is highly variable throughout the eukaryotes (Sullivan et al., 2001), from the 125-bp centromeres of S. cerevisiae to the megabase-sized centromeres of higher eukaryotes. The only common property is the presence of some form of repetitive DNA (Fig. 3).
The low conservation of centromeric DNA sequences quickly led to the suggestion that centromere position and function are not genetically defined. Indeed, centromeric DNA appears to be neither necessary nor sufficient for centromere function: not necessary, as a functional centromere may form on non-centromeric DNA regions (neocentromeres); and not sufficient, as the mere presence of centromeric-type DNA does not systematically lead to a functional centromere.
There is little doubt nowadays that centromere identity and function are defined epigenetically and depend on a specialized chromatin structure rather than the underlying DNA sequence. Indeed, centromeric chromatin exhibits some very well conserved features, the most important being the systematic presence of the H3 variant CENP-A. This variant is the best candidate for the epigenetic marker of centromeres.
Like all core histones, in cycling cells CENP-A must be periodically reloaded into the chromatin to cope with the replication of the genetic material. Canonical histones are reloaded in S phase in a DNA replication-dependent manner (Osley, 1991), but CENP-A is not expressed during S phase (Shelby et al., 1997) and its reloading appears to be more complicated.
During the replication of centromeric DNA, pre-existing CENP-A nucleosomes are distributed equally between the two nascent DNA molecules. As there are not enough CENP-A nucleosomes to cover both molecules, the gaps are filled in by canonical (i.e., H3-containing) nucleosomes. In the following G2 phase, CENP-A is expressed and the newly synthesized CENP-A proteins are chaperoned by HJURP, the CENP-A-specific histone chaperone (Foltz et al., 2009), which recognizes CENP-A’s CATD.
The loading of CENP-A into the chromatin seems to occur during the M and G1 phases in several steps. The first step is mediated by a complex formed by Mis18α, Mis18β, RbAp46, RbAp48 and M18BP1 (Fujita et al., 2007); this complex is transiently associated with centromeric chromatin between telophase and the beginning of the next G1 phase, and is essential for CENP-A loading. It is never found directly associated with CENP-A, however, suggesting that its function is merely to prepare the centromeric chromatin for the later loading of CENP-A by other factors. The exact nature of that preparation step remains to be elucidated, but it could involve the dimethylation of H3K4 or the transcription of centromeric alphoid DNA (Allshire & Karpen, 2008; Bergmann et al., 2011).
The second step is the actual deposition of CENP-A onto centromeric chromatin. This step is mediated by HJURP and requires the H3K4me2 mark (Bergmann et al., 2011). It is thought to involve the maturation of a pre-nucleosomal complex, formed by HJURP and a CENP-A:H4 tetramer, into a full octameric nucleosome (Black & Cleveland, 2011), a process that takes place during the first hours of G1 phase. A final step occurs in late G1 to stabilize the freshly incorporated nucleosomes (Lagana et al., 2010).
CENP-A plays an essential role in kinetochore assembly and, indirectly, in spindle checkpoint function. Consequently, changes in CENP-A expression or localization may lead to aneuploidy caused by chromosome segregation defects. Since aneuploidy is well recognized as one of the hallmarks of cancer, it may be assumed that CENP-A is directly involved in the development of some tumors. The protein was found to be overexpressed in colorectal cancer, where it tends to occupy chromosome arms instead of centromeres, thus impairing centromeric function (Tomonaga et al., 2003). CENP-A overexpression and the resulting chromosome segregation defects could be a mechanism linking the loss of the retinoblastoma protein to the genomic instability that favors tumor development (Amato et al., 2009). CENP-A was also identified as one of the main markers of testicular germ cell cancer, exhibiting a 20-fold increase in expression compared to normal testis tissue (Biermann et al., 2007).
CENP-A is also involved in some autoimmune diseases, as the target of anti-centromere autoantibodies. The first centromeric proteins, including CENP-A, were in fact discovered using such autoantibodies isolated from patients suffering from the CREST syndrome (Palmer et al., 1987). These autoantibodies could alter the assembly or the function of the centromeres, which would explain the aneuploidy frequently associated with autoimmune scleroderma (Jabs et al., 1993).
The biological significance of the unstructured NH₂-terminal extension of CENP-A has remained unclear. Although the tail itself exists in nearly all CENP-A homologs, it shows high variability in length and amino acid sequence. This suggested that the essential functions depend on the structured, conserved COOH-terminal domain, a hypothesis further reinforced by the identification, inside the COOH-terminal domain, of the CENP-A Targeting Domain (CATD), which is responsible for the exclusive centromeric localization of the protein (Black et al., 2004). Consequently, the NH₂-terminal domain attracted little attention.
In budding yeast, a segment of the NH₂-terminal domain of Cse4p, termed the essential N-terminal domain (END), was shown to be required for the proper segregation of chromosomes (Chen et al., 2000). In flies, an arginine-rich motif in the NH₂-terminal domain of CID is involved in recruiting BubR1 to the centromeres (Torras-Llort et al., 2010); in chicken, BubR1 recruitment is also dependent on CENP-A, although this function was not assigned to any specific part of the protein (Régnier et al., 2005). In humans, overexpressing CENP-A leads to the mislocalization of some centromeric proteins to the chromosome arms, and this effect is dependent on the NH₂-terminal domain (Van Hooser et al., 2001). Taken together, these clues suggest that the NH₂-terminal domain of CENP-A, from yeasts to higher eukaryotes, fulfills a critical mitotic function.
Budding yeast Cse4p can functionally replace CENP-A in human cells after siRNA-mediated depletion of the endogenous protein (Wieland et al., 2004), even though the NH₂-terminal domains of Cse4p and CENP-A are divergent. Conventional human histone H3 carrying the CENP-A targeting domain can also functionally replace CENP-A (Black et al., 2007). Thus, there seem to be few constraints on the amino acid sequence of the NH₂-terminal domain.
It was recently shown that the mitotic phosphorylation of budding yeast Cse4p promotes accurate chromosome segregation (Boeckmann et al., 2013). Another study, in human cells, reported that the phosphorylation of CENP-A at serine 7, which occurs at the beginning of mitosis, is critically required for proper chromosome segregation through the recruitment of 14-3-3 proteins and CENP-C (Goutte-Gattat et al., 2013).
The phosphorylation of the CENP-A tail may be an important and evolutionarily conserved mechanism in the regulation of mitosis. Importantly, all CENP-A homologs, however divergent, contain potentially phosphorylatable serines in their NH₂-terminal domains.
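As an illustration of how candidate serines might be located in an NH₂-terminal tail, here is a minimal sketch. The function name is hypothetical, and the sequence used is an invented toy string rather than a real CENP-A or Cse4p sequence. The presence of a serine only makes a residue a candidate; it is not evidence that the residue is phosphorylated in vivo.

```python
def serine_positions(tail: str) -> list[int]:
    """Return the 1-based positions of serine residues (one-letter code 'S')
    in an amino acid sequence given in one-letter code."""
    return [i for i, residue in enumerate(tail.upper(), start=1) if residue == "S"]


# Invented toy sequence for illustration only, NOT a real protein sequence.
toy_tail = "MAPRRRSGKS"
print(serine_positions(toy_tail))  # [7, 10]
```

Applied to real tail sequences retrieved from a sequence database, the same scan would give a first list of candidate residues, which would then need experimental confirmation.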