Homologous gene sequences mediate transcription-domain formation.

The organisation of transcription in the mammalian nucleus is a topic of particular interest because of its relevance to gene regulation. RNA polymerase II transcription occurs at hundreds of sites throughout the nucleoplasm. Recent data indicate that coordinately regulated genes can localise to shared transcription sites. Other transcribed sequences have also been shown to cluster in the nucleus. The ribosomal RNA genes cluster in the nucleoli. Similarly, transiently transfected plasmids and dsDNA viruses form transcription domains (TDs) containing multiple templates. Intriguingly, plasmids expressing β-globin gene sequences recruit the endogenous β-globin loci to their TDs. In light of this observation, we have investigated plasmid TDs as a model for gene recruitment. We find that TD formation is dependent on the presence of homologous gene sequences. Plasmids containing non-homologous gene sequences form separate TDs, independent of homology in the backbone or promoter sequences. TD formation is also favoured by low plasmid concentrations. This effect is sequence-specific and high concentrations of one plasmid do not disrupt domain formation by non-homologous plasmids in the same cell. We conclude that recruitment into TDs is an active process that is driven by homologies between transcribed sequences and becomes saturated at high copy numbers.


Introduction
RNA polymerase II (Pol II) transcription sites are scattered throughout the nucleoplasm in a punctate pattern. From electron microscopy studies it is estimated that a HeLa cell nucleus contains 10,000 of these distinct sites (Cook, 1999; Pombo et al., 1999), whereas primary cells contain only 100-300 (Osborne et al., 2004). Since the number of transcribing polymerases exceeds the number of transcription sites, it has been hypothesised that each site represents a 'transcription factory' containing multiple genes (Iborra et al., 1996; Jackson et al., 1998). Recent data from two laboratories indicate that coordinately regulated genes are recruited into shared transcription sites, even when located on different chromosomes. In mouse erythroid progenitor cells, actively transcribing alleles of the β-like globin gene Hbb-b1 colocalise with other transcribing genes in a 40 Mb region of chromosome 7 at a frequency of 40-60% (Osborne et al., 2004). Hbb-b1 also colocalises with the α-globin gene Hba, which is located on chromosome 11, at a frequency of 7%, which is more frequent than expected by chance. Similarly, in mouse T cells the TH2 locus control region on chromosome 11 colocalises with the adjacent TH2 cytokine gene locus (Spilianakis and Flavell, 2004) and also, at a frequency of 40%, with the IFN-γ gene on chromosome 10 (Spilianakis et al., 2005).
These are not the only instances in which coordinately regulated mammalian genes colocalise during transcription. The best-studied example is the ribosomal RNA genes, which cluster together into nucleoli where they are transcribed by RNA polymerase I (Lewis and Tollervey, 2000). Amongst Pol-II-transcribed genes, both histone genes and small nuclear RNA genes U1, U2, U4, U11 and U12 preferentially (but not exclusively) associate with the Cajal bodies during transcription (Gall, 2000). Similarly, dsDNA viruses, including herpes simplex virus-1, adenovirus-5, Epstein-Barr virus and human cytomegalovirus, as well as transiently transfected plasmids, form specialised transcription domains (TDs) in mammalian nuclei, which are the exclusive sites of early viral transcription (Huang and Spector, 1996; Maul, 1998). Previous work from our laboratory demonstrated that the endogenous β-globin loci can be recruited into plasmid TDs when the plasmids express β-globin sequences (Ashe et al., 1997). In light of this observation, we have investigated the formation of plasmid TDs as a model for gene recruitment to transcription sites. Using co-transfected plasmid pairs, we find that TD formation is mediated by homologous gene sequences in the co-transfected plasmids. Plasmids with homologous gene sequences form shared TDs, whereas those with non-homologous gene sequences form separate TDs, which are located adjacent to each other in the interchromatin compartment, irrespective of homology in the plasmid backbone and promoter sequences. The length of gene homology determines the degree of overlap between the TDs. Shared TD formation can be mediated by both β-globin and α2-globin gene sequences, indicating that homology, rather than a particular gene sequence, is the crucial factor.
The plasmid copy number is also important in TD formation: low copy numbers favour TD formation, whereas high copy numbers favour a 'punctate pattern' of plasmid transcription consisting of many small transcription sites spread throughout the nucleoplasm. This effect is sequence-specific and, thus, high copy numbers of one plasmid do not inhibit TD formation by non-homologous plasmids within the same nucleus. To investigate the efficiency of gene expression in TDs, we measured expression of a GFP-tagged reporter protein. We find that cells with plasmid TDs contain lower levels of reporter protein, relative to nascent plasmid RNA, than cells with punctate plasmid transcription, suggesting that TDs are less efficient at protein expression than punctate transcription sites.
In summary, we find that gene homology is crucial for TD formation. Our results also demonstrate that plasmid TDs bear a striking resemblance to early viral TDs, thus raising the possibility that TD formation is an inherent feature of the transcription of extrachromosomal elements. Our results also provide new insight into how genes might be recruited into shared transcription sites.

Localisation of plasmid TDs
The observation that the endogenous β-globin loci are recruited into plasmid TDs that express β-globin gene sequences (Ashe et al., 1997) indicates that β-globin gene sequences can mediate gene recruitment. To test this hypothesis, we designed three plasmids containing varying lengths of β-globin gene sequence. Plasmid pHIV-β contains the entire β-globin gene driven by the HIV-LTR promoter (Fig. 1A, first construct). Plasmid pHIV-α is identical to pHIV-β but contains the α2-globin gene in place of β-globin (Fig. 1A, second construct). Finally, plasmid pHIV-αβ is identical to pHIV-β but contains a fusion of the α2-globin and β-globin genes (Fig. 1A, third construct) joined at exon 2, with the reading frames of the α2-globin and β-globin gene sequences matched to prevent nonsense-mediated decay. Thus pHIV-β shares gene homology of ~1200 bp with pHIV-αβ but none with pHIV-α.
Plasmid transcription was analysed by S1 nuclease assay to verify that the mRNAs were being efficiently transcribed and processed (Fig. 1B). Plasmids were transiently transfected into HeLa cells and cytoplasmic RNA was harvested. A probe for the 3′ end of the plasmid gene was used to detect the mature mRNA. All of the plasmids generated correctly processed mRNA at similar efficiencies. Transcription of plasmids containing the HIV-LTR promoter was dependent on co-transfection of a Tat-expression plasmid (Fig. 1B, compare lanes 2 and 3), whereas transcription of pCMV-β, containing the CMV promoter, was independent of Tat expression (Fig. 1B, lane 1), indicating that the promoters were being correctly regulated.
To visualise plasmid transcription, pHIV-β was transiently transfected into HeLa cells and the nascent β-globin pre-mRNAs were labelled by RNA in situ hybridisation with probes to the β-globin introns. Splicing is a co-transcriptional process and introns are degraded immediately after excision, so intron probes are specific for the sites of transcription (Bentley, 1999; McCracken et al., 1997; Proudfoot, 2000). Importantly, recent work from our lab has demonstrated that splicing of the β-globin second intron occurs co-transcriptionally in transiently transfected constructs and the adjacent exons remain tethered to the elongating Pol II prior to splicing (Dye et al., 2006). Thus, probes to the β-globin second intron specifically label pre-mRNAs that are bound to the elongating polymerase.
When transiently transfected with pHIV-β, 90% of cells displayed 5-20 discrete plasmid transcription signals in the nucleus, ranging from 200 nm to more than 1 μm in diameter (Fig. 1C, left panel); we refer to these as TDs. A minority of transfected cells (approximately 10%) displayed an alternative pattern of plasmid transcription, in which hundreds of transcription sites were spread throughout the nucleoplasm (Fig. 1C, right panel). This pattern is reminiscent of endogenous Pol II transcription (Jackson, 2003) and we refer to it as the 'punctate pattern' of plasmid transcription.
To confirm that the TDs represent nascent transcription, we performed a dual-labelling in situ hybridisation experiment using probes to the first intron and the 3′ flanking region (3′ flank) of the plasmid gene. Polyadenylation occurs cotranscriptionally and the uncapped sequence downstream of the polyadenylation site is degraded after poly(A) cleavage (Proudfoot et al., 2002; West et al., 2004). Indeed, exonuclease degradation of the 3′ flank transcripts occurs cotranscriptionally and is required for Pol II transcriptional termination in both yeast and mammalian genes (Kim et al., 2004; West et al., 2006; West et al., 2004). Consequently, 3′ flank transcripts are exclusively associated with transcription sites. If splicing of the plasmid transcripts occurs cotranscriptionally then intron and 3′ flank sequences will colocalise. Alternatively, if splicing is inefficient or defective then some of the intron signals may not colocalise with the 3′ flank signals, indicating movement of unspliced transcripts away from the transcription site. The results of this experiment are shown in Fig. 2A. The intron and 3′ flank probes showed complete colocalisation, indicating that intron sequences were only found at transcription sites (Fig. 2A, top panels). This experiment was repeated using probes to the second intron of the plasmid gene, which also showed complete colocalisation with the 3′ flank probes (Fig. 2A, middle panels).
Fig. 1. (A) ... The plasmids are based on the pUC18 backbone (not shown) and contain the HIV-LTR promoter upstream of the plasmid gene. (B) S1 nuclease analysis of cytoplasmic RNA from HeLa cells transiently transfected with pHIV-β, pHIV-αβ or pCMV-β. The labelled probe was complementary to the third exon and 3′ flank sequence of β-globin, and mismatched with correctly processed mRNAs at the polyadenylation site. All plasmids generated correctly processed mRNAs corresponding to the band marked 'βpA' in lanes 1, 3 and 4. Transcription of pHIV-β was Tat-dependent (compare lanes 2 and 3), whereas transcription of pCMV-β was Tat-independent (lane 1). pHIV-β was also co-transfected with plasmid pHIV-α, which had no discernible effect on transcription levels (compare lanes 3 and 5). (C) RNA in situ hybridisation of HeLa cells transiently transfected with pHIV-β. The majority of transfected nuclei displayed five to 20 discrete nascent RNA signals corresponding to plasmid TDs (left panel). A minority of cells displayed a punctate pattern of plasmid transcription in which many transcription sites were scattered throughout the nucleoplasm (right panel). Bars, 5 μm.
We used the same dual-labelling technique to investigate whether punctate plasmid transcription sites also represent nascent transcription. HeLa cells were transiently transfected with pHIV-β and the transcripts were labelled by RNA in situ hybridisation using probes to the β-globin first intron and 3′ flank. In cells displaying punctate plasmid transcription the probes showed complete colocalisation (Fig. 2A, bottom panels). The experiment was repeated with probes to the β-globin second intron, which also showed complete colocalisation with the 3′ flank probes (data not shown). These results confirm that both TDs and punctate transcription sites represent nascent transcription.

Plasmid TDs localise adjacent to the interchromatin granule clusters and the PML bodies
To determine the position of the plasmid TDs relative to the underlying nuclear architecture, we performed RNA in situ hybridisation together with immunocytochemistry for the interchromatin granule clusters (IGCs) and PML bodies. The IGCs are located in the interchromatin compartment where they are thought to be storage sites for splicing factors and may play a role in mRNA processing or the assembly of mRNA processing machinery (Lamond and Spector, 2003; Molenaar et al., 2004; Sacco-Bubulya and Spector, 2002; Zhang et al., 1994). Several highly active genes have been shown to localise to the periphery of the IGCs during transcription, indicating that this might be a preferred transcription site (Shopland et al., 2003). There is also evidence that processed transcripts migrate through the IGCs (Johnson et al., 2000; Melcak et al., 2000). Interestingly, transcripts from a splice-defective mutant of the collagen type I, α1 gene (COL1A1) accumulate in the IGCs and are not exported to the cytoplasm (Johnson et al., 2000). The function of PML bodies is not fully understood but they are located in the interchromatin compartment and localise, together with IGCs, adjacent to the early viral TDs of dsDNA viruses (Everett, 2001; Everett and Murray, 2005; Maul, 1998). It has been suggested that PML bodies form part of a cellular antiviral defence mechanism (Bell et al., 2001; Ishov and Maul, 1996; Maul, 1998).
To localise the plasmid TDs relative to IGCs and PML bodies, we performed combined RNA in situ hybridisation and immunocytochemistry on HeLa cells transiently transfected with pHIV-β. The IGCs and PML bodies were stained with anti-SC35 splicing factor and anti-PML protein antibodies, respectively. The pHIV-β TDs consistently localised to the periphery of IGCs (Fig. 2B, top panels), as do early viral TDs, an observation that is consistent with TDs being nascent transcription sites (Shopland et al., 2003). Of note, the SC35 staining pattern differed between transfected and untransfected cells, with transfected nuclei showing little SC35 staining outside the IGCs (compare the untransfected and transfected nuclei in Fig. 2B, top panels). This might be due to the high levels of RNA transcription and processing in transfected cells, resulting in a rearrangement of splicing factors (Zeng et al., 1997). A similar pattern of SC35 staining was observed in cells with punctate plasmid transcription, although no specific association was noticed between the nascent plasmid mRNAs and the IGCs (data not shown).
Journal of Cell Science 119 (18)
Fig. 2. ... (B) HeLa cells were transiently transfected with pHIV-β and plasmid transcription sites were labelled by RNA in situ hybridisation using probes to the β-globin first intron (green signals). IGCs were labelled with an anti-SC35 antibody (top panels) and PML bodies were labelled with an anti-PML antibody (bottom panels). TDs were consistently observed adjacent to both IGCs and PML bodies. Bars, 5 μm.
The PML bodies also showed a specific association with plasmid transcription, whereby all the PML bodies were observed at the periphery of TDs (Fig. 2B, bottom panels). Recent data indicate that PML bodies associate with transcriptionally active regions of the genome as well as with dsDNA virus genomes (Ching et al., 2005; Wang et al., 2004). Since pHIV-β is heavily transcribed, this might account for the observed association. Of note, an average of two PML bodies was observed per transfected cell compared with 5-10 PML bodies per untransfected cell (Weis et al., 1994), indicating that plasmid transcription might correlate with dispersion of the PML bodies (data not shown). This effect was observed in cells with TDs and those with punctate transcription. Disruption of the PML bodies is observed when cells are infected by dsDNA viruses, such as herpes simplex virus type 1, adenovirus-5, cytomegalovirus and Epstein-Barr virus, and is usually associated with expression of specific viral proteins (Everett, 2001).

TDs localise to the interchromatin compartment
Plasmid TDs localise adjacent to the IGCs and the PML bodies, both of which are located in the interchromatin compartment between the chromosome territories (Cremer and Cremer, 2001). By contrast, punctate plasmid transcription sites are found in a diffuse pattern that extends throughout the nucleus. To examine the relationship between punctate transcription sites and the interchromatin compartment, we transiently transfected plasmid pHIV-β into HeLa cells stably expressing a GFP-labelled histone H2B protein (Kanda et al., 1998; Kimura and Cook, 2001). H2B is one of four core histones found in stable chromatin and can be used to outline the interchromatin compartment (Cremer and Cremer, 2001; Cremer et al., 2004; Verschure et al., 2002). In live cells H2B-GFP staining concentrates in heterochromatin, whereas in fixed and permeabilised cells it stains both heterochromatin and euchromatin; studies have shown, however, that the underlying nuclear architecture is preserved and the interchromatin compartment can still be visualised after fixation (Solovei et al., 2002).
Two H2B-GFP-expressing HeLa cells transiently transfected with pHIV-β are shown in Fig. 3A (top panels). Both cells display TDs. From the overlay image (right panel) it is clear that plasmid TDs colocalise with the interchromatin compartment, as represented by the absence of H2B-GFP signal. This result is consistent with the finding that plasmid TDs localise adjacent to IGCs and PML bodies. Fig. 3B shows one untransfected nucleus and one with punctate pHIV-β transcription. The punctate transcription pattern overlies the H2B staining pattern (Fig. 3B, right panel), indicating that punctate transcription sites are located outside the interchromatin compartment. Of note, the interchromatin compartment was enlarged in nuclei with TDs relative to those with punctate plasmid transcription and untransfected nuclei, indicating that the presence of TDs may enlarge the interchromatin compartment (Fig. 3, compare the H2B-GFP staining pattern in A and B).

β-globin gene sequences can mediate recruitment into TDs
We wished to determine whether the β-globin gene can recruit homologous sequences into TDs. To answer this question we co-transfected plasmid pHIV-β, containing the entire β-globin gene, into HeLa cells with pHIV-α or pHIV-αβ. pHIV-α is identical to pHIV-β except that it contains the α2-globin gene in place of β-globin (Fig. 1A, first and second constructs). By contrast, pHIV-αβ contains 1243 bp of β-globin gene sequence from the second exon to the polyadenylation site (Fig. 1A, first and third constructs). Transcription of the co-transfected plasmids was visualised by RNA in situ hybridisation using probes to the first introns of β-globin and α2-globin. Fig. 4A displays HeLa cells co-transfected with pHIV-β and pHIV-α. Under standard transfection conditions (Materials and Methods) between 40% and 60% of cells contained plasmid transcripts. All of these displayed transcripts from both plasmids, the majority (~90%) in TDs and the remainder in a punctate pattern (data not shown). In cells with TDs, the pHIV-β and pHIV-α signals were found adjacent to each other but did not colocalise (Fig. 4A, third panel). Line scans (2D intensity plots) from Fig. 4A reveal that although the pHIV-β and pHIV-α TDs are closely apposed there is no overlap of the signal peaks (Fig. 4B), indicating that pHIV-β and pHIV-α have formed adjacent TDs. In order to verify this result, plasmid pHIV-β was co-transfected with plasmid pHIV-ε, which contains the ε-globin gene in place of β-globin (Ashe et al., 1997). Once again, the co-transfected plasmids formed adjacent but separate TDs (supplementary material Fig. S1A). pHIV-β and pHIV-α were also transfected sequentially (24 hours apart) into HeLa cells. Under these conditions, the TDs of the first plasmid were observed independently from TDs of the second plasmid but not vice versa, implying that the second plasmid formed TDs at sites already occupied by the first plasmid (supplementary material Fig. S1B).
Next, pHIV-β was co-transfected with pHIV-αβ, which contains 1243 bp of β-globin gene sequence from the second exon to the poly(A) site (Fig. 1A, first and third constructs). Transcription was labelled by RNA in situ hybridisation using probes to the first introns of β-globin and α2-globin. pHIV-β and pHIV-αβ TDs showed a high degree of colocalisation (Fig. 4C, third panel; notice the overlapping red and green signals in the overlay image). Line scans of Fig. 4C are displayed in Fig. 4D. These demonstrate a high degree of correlation between the transcription signals of pHIV-αβ and pHIV-β together with overlapping signal peaks (Fig. 4D). Thus, the 1243 bp of β-globin sequence that is present in pHIV-αβ, but not pHIV-α, can mediate recruitment of pHIV-αβ into pHIV-β TDs.
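The line-scan comparison described above can be illustrated with a minimal sketch. The text does not state how the line scans were quantified, so the following is an assumption for illustration only: it compares two 1D intensity profiles by the position of their peaks and by their Pearson correlation. The profile values and the names `pearson` and `peak_index` are hypothetical and are not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length intensity profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def peak_index(profile):
    """Index of the brightest point along the scan line."""
    return max(range(len(profile)), key=profile.__getitem__)

# Hypothetical intensity profiles sampled along one scan line (arbitrary units).
beta = [1, 2, 8, 20, 8, 2, 1, 1, 1, 1]        # e.g. a pHIV-beta signal
alpha_beta = [1, 3, 9, 18, 7, 2, 1, 1, 1, 1]  # shared TD: peak at the same position
alpha = [1, 1, 1, 1, 2, 8, 20, 8, 2, 1]       # adjacent TD: peak shifted along the line

shared_r = pearson(beta, alpha_beta)   # close to 1 for overlapping peaks
adjacent_r = pearson(beta, alpha)      # low or negative for separate peaks
shared_offset = abs(peak_index(beta) - peak_index(alpha_beta))
adjacent_offset = abs(peak_index(beta) - peak_index(alpha))
```

In this toy case the shared profiles peak at the same position and correlate strongly, whereas the adjacent profiles peak three samples apart and correlate poorly, which mirrors the qualitative distinction drawn between Fig. 4D and Fig. 4B.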
The degree of TD colocalisation correlates with the length of homologous β-globin sequence
Co-transfections with pHIV-β, pHIV-α and pHIV-αβ indicated that a 1243-bp sequence from the 3′ end of the β-globin gene could mediate recruitment of pHIV-αβ into pHIV-β TDs. To identify the specific element within this sequence that was mediating recruitment, we designed four additional plasmids to use in co-transfection experiments with pHIV-β (Fig. 5A). The new plasmids were based on plasmid pHIV-αβ but contained varying lengths of β-globin sequence located at either the 5′ or 3′ end of the plasmid gene, offset by α2-globin sequence.
The αβ plasmids were co-transfected into HeLa cells with pHIV-β. The plasmid transcription sites were labelled by RNA in situ hybridisation using intronic or 3′ flank probes, depending on the location of non-homologous sequences in the co-transfected plasmids. All of the probes showed complete colocalisation when hybridised to a single transcript, so the choice of probe did not affect the experimental results (data not shown). The degree of colocalisation between the plasmid transcription signals was quantified using the overlap coefficient, a measure of signal correlation that is sensitive to intensity variations within each channel but not to absolute intensities. Thus, it can be used to analyse signals obtained with two different probes (Manders et al., 1993). The value of the overlap coefficient ranges from 0 to 1, where 1 indicates perfect correlation. For each co-transfected plasmid pair, overlap coefficients were calculated using confocal images of transfected nuclei. Cytoplasmic signals were excluded from the analysis to minimise the effect of endogenous biotin signals and other non-specific background. Thirty or more nuclei from three separate experiments were analysed to obtain a final value for each plasmid pair.
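As an illustration of the overlap coefficient, the sketch below uses the standard definition from Manders et al. (1993), r = Σ(R·G) / √(ΣR² · ΣG²), computed over the pixels of one nucleus. The exact software and settings used in the study are not given here, so the function name `overlap_coefficient`, the optional nuclear mask and the pixel values are assumptions made for illustration.

```python
import math

def overlap_coefficient(ch1, ch2, mask=None):
    """Manders overlap coefficient for two flattened image channels.

    ch1, ch2: equal-length sequences of non-negative pixel intensities.
    mask: optional sequence of booleans; only pixels flagged True are used
          (for example, to restrict the analysis to the nucleus).
    """
    if len(ch1) != len(ch2):
        raise ValueError("channels must contain the same number of pixels")
    if mask is None:
        mask = [True] * len(ch1)
    pairs = [(a, b) for a, b, keep in zip(ch1, ch2, mask) if keep]
    numerator = sum(a * b for a, b in pairs)
    denominator = math.sqrt(sum(a * a for a, _ in pairs) *
                            sum(b * b for _, b in pairs))
    if denominator == 0:
        raise ValueError("at least one channel has no signal")
    return numerator / denominator

# Hypothetical pixel intensities along a small region of one nucleus.
red = [0, 10, 50, 10, 0, 0]
green_shared = [0, 20, 100, 20, 0, 0]    # same pattern, twice as bright
green_adjacent = [0, 0, 0, 10, 50, 10]   # same pattern, shifted sideways

r_shared = overlap_coefficient(red, green_shared)
r_adjacent = overlap_coefficient(red, green_adjacent)
```

Doubling the brightness of one channel leaves the coefficient at 1, whereas shifting the signal sideways drops it close to 0. This is the property that allows signals from two different probes to be compared.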
The overlap coefficients for pHIV-β co-transfections are shown in Fig. 5B. As a positive control, pHIV-β was transfected alone and the plasmid transcripts were double-labelled with two probes to the β-globin first intron. The probes showed complete colocalisation and gave an overlap coefficient of 0.89±0.04 (Fig. 5B, bar 1). As a negative control, pHIV-β was co-transfected with pHIV-α, which does not contain β-globin gene sequence. The plasmids formed separate TDs, giving an overlap coefficient of 0.33±0.06 (Fig. 5B, bar 2). From these data we established 0.9 and 0.3 as measures of colocalisation and non-colocalisation, respectively.
pHIV-β was then co-transfected with each of the αβ plasmids. In these co-transfections, the plasmid TDs displayed intermediate degrees of colocalisation. Fig. 5B displays the results of these experiments. The light blue bars (bars 3 and 4) represent plasmids with 5′ β-globin sequences, whereas the dark blue bars (bars 5, 6 and 7) represent plasmids with 3′ β-globin sequences. The curve indicates the length of β-globin sequence in each αβ plasmid. A clear correlation is observed between the length of β-globin sequence in the co-transfected plasmid and the degree of colocalisation between the TDs. ANOVA testing revealed that the length of β-globin homology sequence had a significant effect on the overlap coefficient (P<0.001). Student-Newman-Keuls multiple-comparison testing revealed that all of the plasmid pairs differed significantly in their overlap coefficients (P<0.01) with the exception of the pHIV-αβ2 and pHIV-βα2 co-transfections. These plasmids contain 655 bp of 3′ β-globin sequence and 543 bp of 5′ β-globin sequence, respectively. Least-squares regression of the overlap coefficients relative to the length of β-globin gene homology yielded an equation of y = 0.35 + 0.00023x, indicating that the overlap coefficient increases by 0.023 per 100 bp of homology (95% confidence interval equals 0.013-0.033). The R² value for this regression was 0.91 (P<0.01). The results displayed in Fig. 5B demonstrate that TD colocalisation correlates with the length of β-globin homology sequence, irrespective of its position at the 5′ or 3′ end of the β-globin gene. Colocalisation does not appear to be mediated by a specific element within the β-globin sequence.
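The fitted line can be read as a simple prediction rule. The sketch below evaluates the reported β-globin regression (intercept 0.35, slope 0.00023 per bp) for a given length of homology. The function name `predicted_overlap` is hypothetical, and the outputs are values on the fitted line, not measured overlap coefficients.

```python
def predicted_overlap(homology_bp, intercept, slope_per_bp):
    """Value of the fitted regression line y = intercept + slope * x."""
    return intercept + slope_per_bp * homology_bp

# Regression coefficients as reported in the text for beta-globin homology.
BETA_FIT = (0.35, 0.00023)

# Increase in overlap coefficient per 100 bp of homology (0.00023 * 100).
per_100_bp = predicted_overlap(100, *BETA_FIT) - predicted_overlap(0, *BETA_FIT)

# Points on the fitted line for homology lengths mentioned in the text.
fitted = {bp: predicted_overlap(bp, *BETA_FIT) for bp in (0, 543, 655, 1243)}
```

With no homology the line returns the intercept of 0.35, close to the negative-control value of 0.33, and it rises to about 0.64 for 1243 bp of homology.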
Homologous α2-globin sequences can mediate TD colocalisation
We wished to determine whether gene sequences other than the human β-globin gene can mediate TD colocalisation. We therefore repeated the co-transfection experiments substituting plasmid pHIV-α, containing the α2-globin gene, for pHIV-β, containing the β-globin gene. pHIV-α was co-transfected into HeLa cells with each of the αβ plasmids and the nascent plasmid transcripts were labelled by RNA in situ hybridisation using probes to the plasmid introns or 3′ flanks. As a positive control, plasmid pHIV-α was transfected alone and its TDs were double-labelled with two probes to the α2-globin first intron. These signals colocalised, giving an overlap coefficient of 0.81±0.11 (Fig. 5C, bar 1). As a negative control, pHIV-α was co-transfected with pHIV-β. The plasmids formed separate TDs, giving an overlap coefficient of 0.33±0.06 (Fig. 5C, bar 2). When pHIV-α was co-transfected with the αβ plasmids, the degree of colocalisation between the TDs correlated with the length of α2-globin sequence in the co-transfected plasmid. In Fig. 5C the light blue bars represent plasmids with 5′ α2-globin sequences, whereas the dark blue bars represent plasmids with 3′ α2-globin sequences. The blue curve indicates the length of α2-globin gene sequence in each αβ plasmid. A clear correlation is observed between the length of α2-globin sequence and the degree of overlap between the TDs. ANOVA testing revealed that the length of the α2-globin gene homology sequence had a significant effect on the overlap coefficient (P<0.001). Student-Newman-Keuls multiple-comparison testing revealed that all of the plasmid pairs differed significantly in their overlap coefficients (P<0.01), with the exception of the pHIV-αβ2, pHIV-αβ3 and pHIV-βα co-transfections. All of those plasmids contain between 495 bp and 555 bp of α2-globin sequence. Least-squares regression of the overlap coefficients relative to the length of α2-globin gene homology sequence yielded an equation of y = 0.32 + 0.00056x, indicating that the overlap coefficient increases by 0.056 per 100 bp of homology (95% confidence interval equals 0.028-0.084). The R² value for this regression was 0.89 (P<0.01). Thus, both β-globin and α2-globin sequences can mediate the formation of shared TDs, with the degree of colocalisation correlating with the length of gene homology, irrespective of its position in the plasmid gene.
Fig. 5. ... the αβ plasmids, and the TDs were labelled by RNA in situ hybridisation using β-globin and α2-globin intronic probes. The degree of colocalisation between the TDs was calculated using the overlap coefficient, which ranges from 0 to 1, with 1 indicating perfect colocalisation. The blue curve indicates the length of β-globin gene sequence in the co-transfected plasmids. The highest overlap coefficients were observed when pHIV-β was transfected alone and the nascent transcripts were double-labelled with two different probes to the β-globin first intron (bar 1). Co-transfections of pHIV-β and plasmid pHIV-α (which contains no β-globin gene sequence) gave the lowest overlap coefficients (bar 2). When pHIV-β was co-transfected with plasmids containing varying lengths of β-globin gene sequence (bars 3-7) the value of the overlap coefficient correlated with the length of homologous β-globin gene sequence. (C) Homologous α2-globin gene sequences can also mediate TD formation. Plasmid pHIV-α was co-transfected with each of the αβ plasmids shown in Fig. 5A. The TDs were labelled by RNA in situ hybridisation using β-globin and α2-globin intronic probes. The degree of colocalisation between the transcription signals was measured using the overlap coefficient. The blue curve indicates the length of α2-globin sequence in the co-transfected plasmids. The highest overlap coefficients were observed when pHIV-α was transfected alone and the nascent transcripts were double-labelled using two different α2-globin first intron probes (bar 1). Co-transfections of pHIV-α and plasmid pHIV-β gave the lowest overlap coefficients (bar 2). When pHIV-α was co-transfected with plasmids containing varying lengths of α2-globin gene sequence (bars 3-7) the value of the overlap coefficient correlated with the length of homologous α2-globin gene sequence.
Homologous promoter sequences are not required for TD formation
Plasmids pHIV-β and pHIV-α contain identical promoter sequences upstream of non-homologous genes. When these plasmids are co-transfected into HeLa cells they form separate TDs, located adjacent to each other in the nucleus. Conversely, pHIV-β and pHIV-αβ, which also share 1243 bp of homologous β-globin sequence, form shared TDs. From these results it is apparent that homologous promoters alone cannot mediate TD colocalisation; however, it is unclear whether or not they are required for TD formation.
To determine whether homologous promoter sequences are required for the formation of shared TDs, we co-transfected plasmid pHIV-α with plasmid pCMV-βα. pCMV-βα is identical to pHIV-βα but contains the CMV immediate-early promoter. When co-transfected into HeLa cells with pHIV-α, the plasmid transcription signals gave an overlap coefficient of 0.55±0.09 (data not shown). By comparison, pHIV-α and pHIV-βα co-transfections gave an overlap coefficient of 0.62±0.07 (Fig. 5C), which is not significantly different (P>0.05). Thus, homologous gene sequences can mediate TD formation in the absence of homologous promoters.
High plasmid-copy numbers correlate with punctate plasmid transcription
We noticed that cells located in close proximity to each other on coverslips typically displayed the same pattern of plasmid transcription, be it TDs or punctate transcription (data not shown). We hypothesised that they might be mitotic sister cells, which should contain a similar plasmid copy number, and therefore that the pattern of plasmid transcription correlates with the number of plasmids in the cell. To test this hypothesis, we transfected HeLa cells with pHIV-β at increasing concentrations ranging from 0.1 to 1 μg of plasmid DNA per 35-mm plate. As the DNA concentration increased, there was a corresponding increase in the transfection efficiency and in the percentage of transfected cells displaying punctate plasmid transcription (Fig. 6A). Punctate transcription rose from 0% of transfected cells at 0.1 μg of DNA to 12% of transfected cells at 0.4 μg. This result confirms that higher plasmid-copy numbers do correlate with punctate plasmid transcription.
The effect of plasmid concentration on the plasmid transcription pattern is plasmid-specific
To determine whether the pattern of plasmid transcription is a function of the total plasmid copy number or of the copy number of each individual plasmid, we co-transfected plasmids pHIV-α and pHIV-β into HeLa cells at a ratio of 9:1. In this way we ensured that plasmid pHIV-α was at a higher concentration in all transfected cells than plasmid pHIV-β. If the plasmid transcription pattern were a function of total plasmid copy number, the two plasmids should always be transcribed in the same pattern. Conversely, if the copy number of each individual plasmid were to determine the transcription pattern only for that plasmid, some cells should display pHIV-α in a punctate pattern with pHIV-β in TDs. Fig. 6B displays a single cell transfected with pHIV-α and pHIV-β at a ratio of 9:1. The pHIV-α transcription sites (Fig. 6B, left) are observed in a punctate pattern, whereas the pHIV-β transcription sites (Fig. 6B, middle) are in TDs. In 9:1 cotransfections of pHIV-α and pHIV-β, 20% of cells displayed this 'mismatched' pattern of plasmid transcription. The remaining 80% of cells displayed both plasmids in the same transcription pattern, be it TDs or punctate transcription. The reverse pattern, with pHIV-α in TDs and pHIV-β in a punctate pattern, was not observed. A similar relationship between plasmid copy number and transcription pattern was observed when pHIV-α and pHIV-β were co-transfected at a ratio of 1:9 (data not shown). These results indicate that the plasmid transcription pattern is determined by the copy number of each individual plasmid, and is not a function of the total plasmid concentration. Thus, high concentrations of one plasmid do not prevent TD formation by non-homologous plasmids within the same cell.
Our results also demonstrate that punctate plasmid transcription and TDs can coexist within a single nucleus, thereby confirming that the pattern of transcription is not determined by the characteristics of the host cell.
The punctate pattern of plasmid transcription correlates with higher levels of protein expression
Given that the punctate pattern of plasmid transcription correlates with high plasmid-copy number, we hypothesised that it also correlates with high levels of plasmid RNA and protein expression. To address this question, we designed plasmid pCMV-GFP-β, which expresses a GFP-linked β-globin protein driven by the CMV immediate-early promoter. The GFP-labelled protein provides a direct measure of protein expression, with the relationship between protein level and signal intensity being linear, except at high concentrations when the GFP signal under-represents the protein level (Soboleski et al., 2005). Thus, GFP intensity will tend to underestimate differences between cell populations in protein expression rather than overestimate them.
pCMV-GFP-β was transiently transfected into HeLa cells and the nascent plasmid transcripts were labelled by RNA in situ hybridisation, using probes to the β-globin first intron. Nascent RNA and GFP levels were quantified in individual cells with the ImageJ software package and plotted according to the plasmid transcription pattern (Fig. 6C). For ease of interpretation, the GFP signal intensities have been normalised to the values measured in cells with TDs. The results indicate that cells with punctate plasmid transcription have consistently higher nascent RNA and GFP-β-globin signals than cells with TDs. The difference in RNA levels was not significant (P>0.05); however, the GFP-β-globin concentration in cells with punctate plasmid transcription was significantly higher than in cells with TDs (P<0.01). Cells with punctate plasmid transcription and cells with TDs both arise at the same time post-transfection (data not shown) and thus the higher GFP-β-globin levels in cells with punctate transcription cannot be a function of longer accumulation times. These observations suggest that the path from nascent RNA to protein is more efficient in cells with punctate plasmid transcription than in cells with TDs.
Fig. 6. High plasmid-copy numbers favour punctate plasmid transcription. (A) HeLa cells were transiently transfected with increasing concentrations of plasmid pHIV-β and the plasmid transcription sites were labelled by RNA in situ hybridisation using probes to the β-globin first intron. Higher DNA concentrations correlated with higher transfection efficiencies (dashed blue curve) and also with a greater percentage of transfected cells displaying punctate plasmid transcription signals (pink curve). (B) HeLa cells were transiently transfected with pHIV-α and pHIV-β at a concentration ratio of 9:1. The cell in this image displays pHIV-β in TDs and pHIV-α in punctate transcription sites, indicating that high plasmid copy numbers do not inhibit TD formation by a non-homologous plasmid in the same cell. Bar, 5 μm. (C) Plasmid pCMV-GFP-β was transiently transfected into HeLa cells and the plasmid transcription sites were labelled by RNA in situ hybridisation using probes to the β-globin first intron. RNA and GFP-β-globin signal intensities were recorded for individual cells and plotted according to the plasmid transcription pattern. Results were normalised to the signal intensities observed in cells with TDs. Cells with punctate plasmid transcription showed significantly elevated levels of GFP staining relative to cells with TDs. A cell transfected with pCMV-GFP-β displaying plasmid TDs (red) and GFP signals (green) is shown on the right. The overlapping GFP signals and TDs appear yellow. Bar, 5 μm.
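The normalisation to cells with TDs can be sketched as follows. This is a minimal illustration, assuming that the reference value is the mean intensity of the TD group; the text does not state which summary statistic was used. The function name and the example intensities are hypothetical and are not data from this study.

```python
from statistics import mean


def normalise_to_td(td_values, punctate_values):
    """Scale both groups so that the TD group has a mean of 1.0.

    Assumption: the reference is the arithmetic mean of the TD group.
    """
    reference = mean(td_values)
    if reference == 0:
        raise ValueError("TD reference mean is zero; cannot normalise")
    td_norm = [value / reference for value in td_values]
    punctate_norm = [value / reference for value in punctate_values]
    return td_norm, punctate_norm


if __name__ == "__main__":
    # Hypothetical per-cell intensities in arbitrary units.
    td = [80.0, 100.0, 120.0]
    punctate = [150.0, 250.0]
    td_n, punctate_n = normalise_to_td(td, punctate)
    print("TD mean:", round(mean(td_n), 3))
    print("punctate mean:", round(mean(punctate_n), 3))
```

After this scaling, the mean of the punctate group reads directly as a fold difference relative to cells with TDs.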

Discussion
In this study, we have used transiently transfected plasmids to investigate the mechanism of TD formation. Previous work from our laboratory demonstrated that the endogenous β-globin genes can be recruited into TDs formed by plasmids expressing the human β-globin gene (Ashe et al., 1997). This observation suggested that TD formation occurs through the recruitment of specific templates, rather than by passive sequestration of extrachromosomal DNA. To investigate the mechanism of TD formation, we designed a series of RNA Pol II expression plasmids containing differing lengths of homologous sequence and examined their behaviour in cotransfection experiments.
We find that, like early viral TDs, plasmid TDs are located in the interchromatin compartment adjacent to the IGCs and PML bodies (Maul, 1998). Plasmids containing homologous gene sequences form shared TDs, whereas those containing non-homologous gene sequences form separate but adjacent TDs. Both β-globin and α-globin gene sequences can mediate TD formation and the degree of TD colocalisation correlates with the length of gene homology, not with the presence of any specific gene sequence.
Our results also demonstrate the existence of a punctate pattern of plasmid transcription in which plasmid transcription sites are spread throughout the nucleoplasm rather than being restricted to the interchromatin compartment. High plasmid-copy numbers favour punctate plasmid transcription, although non-homologous plasmids in the same cell may still form TDs. Interestingly, reporter protein expression is higher, relative to nascent RNA levels, in cells with punctate plasmid transcription than in cells with TDs, suggesting that nascent RNAs arising in TDs are less efficiently processed and exported than those arising in punctate plasmid transcription sites.
Our results support a model in which recruitment of DNA into TDs is mediated by homology between transcribed sequences. The importance of coding sequences indicates that transcription may play a central role in this process. The observation that high concentrations of one plasmid do not disrupt TD formation by a second plasmid in the same cell suggests that the mechanism can be saturated. To account for these observations we propose a model in which transcription sites containing a single DNA template evolve into TDs through a process of template recruitment that is mediated by transcription and processing factors present at the site. The specific combination of factors at a transcription site will have a high affinity for homologous templates and a lower affinity for non-homologous templates. Once a transcription site forms around a single template, the proteins present at the site will have a natural tendency to recruit other homologous templates. These, in turn, can recruit more processing and transcription factors until a domain is formed. This model might also account for the fact that non-homologous plasmids form adjacent TDs, because the non-homologous plasmids might share some of the same processing and transcription factors.

Journal of Cell Science 119 (18)
Our model accounts for the importance of coding sequences in mediating TD formation and also for the linear relationship between the length of gene homology sequence and the degree of TD colocalisation, because longer sequences contain more protein-binding sites. The relative unimportance of promoter sequences in TD formation might stem from the fact that a number of transcriptional initiation factors form a stable scaffold at the transcription start site and are thus unavailable for binding to other templates (Yudkovsky et al., 2000). Our model may also account for the punctate pattern of plasmid transcription because at high plasmid concentrations the supply of processing and transcription factors will be exhausted and, thus, TDs will not form.
Our model of TD formation implies that homologous pairing is a normal feature of transcription. Although the size and complexity of the human genome are an obvious impediment to pairing, examples exist of chromosome pairing in lower eukaryotes. Drosophila and other dipterans display homologous chromosome pairing throughout most of the mitotic cell cycle; this pairing contributes to homology-based effects such as transvection and homology-dependent silencing (Wu and Morris, 1999). Pairing is first observed in Drosophila embryos at the mid-blastula transition, coincident with the onset of transcription, and it has been hypothesised that pairing is mediated by a transcription-dependent mechanism (Gemkow et al., 1998). Transcriptional regulators, such as promoters and enhancers, may play a central role in this process (McKee, 2004). The fission yeast Schizosaccharomyces pombe also demonstrates high levels of pairing, with homologous chromosomes occupying adjacent territories in 95% of mitotic diploid nuclei and individual sequences pairing at rates of 20-60% (Scherthan et al., 1994). The mechanism of pairing in S. pombe is unknown. In higher eukaryotes, homologous chromosomes are unpaired but a subset of genes exhibit pairing behaviour. Notably, the ribosomal DNA genes cluster by a transcription-dependent mechanism to form the nucleoli (Dousset et al., 2000;Sullivan et al., 2001). Interestingly, nucleolar-like structures form even when the rDNA genes are located on extrachromosomal plasmids and irrespective of whether the plasmids are transcribed by RNA Pol I or Pol II (Oakes et al., 1998;Trumtel et al., 2000). Thus, in eukaryotes, pairing is frequently associated with transcription and the behaviour of transiently transfected plasmids might reveal an underlying tendency of transcription to result in the pairing of homologous sequences.
Our results also demonstrate several parallels between plasmid TDs and the early TDs of dsDNA viruses. First, both plasmid TDs and early viral TDs localise adjacent to the IGCs and the PML bodies (Maul, 1998). Second, higher plasmid-copy numbers favour punctate plasmid transcription, whereas higher viral loads, secondary to replication, result in the transcription of dsDNA viruses throughout the nucleus (Ishov et al., 1997;Maul, 1998;Pombo et al., 1994). Thus, TDs are a feature of low-copy-number plasmids and low-viral-load infections. Coupled with our data showing that protein expression is depressed in TDs relative to punctate plasmid transcription sites, these observations indicate that TD formation favours the host cell by limiting expression from extrachromosomal DNA sequences. A transcriptional mechanism that clusters homologous extrachromosomal sequences may thus be of benefit to cells in their ongoing battle against infection.

Tissue culture
HeLa cells were grown in Dulbecco's modified Eagle's medium (DMEM) with 10% foetal calf serum, 2 mM glutamine, 100 units/ml penicillin and 100 μg/ml streptomycin. For in situ hybridisations, cells were grown on 22-mm² glass coverslips and transfected using the liposomal reagent Effectene (Qiagen) at 25% confluence. Optimised transfection conditions were 1 μg of plasmid DNA, 50 μl of buffer, 2.7 μl of enhancer and 4.0 μl of Effectene. Cultures of GFP-histone H2B HeLa cells (Kimura and Cook, 2001) were supplemented with 2 μg/ml of blasticidin S (Invitrogen).
RNA analysis
S1 nuclease analysis was performed as previously described (Ashe et al., 1995). In brief, plasmids were transiently transfected into HeLa cells, with or without cotransfection of a Tat-expressing plasmid. After 24 hours, cytoplasmic RNA was harvested and steady-state plasmid RNA expression was assayed by S1 nuclease mapping. Probe for the 3′ end of the β-globin mRNA was EcoRI-digested pHIV-β cut in exon 3 and end-labelled with [α-³²P]dATP. Correctly processed β-globin mRNAs generated a 212 bp fragment from the EcoRI site to the poly(A) site.

RNA in situ hybridisation probes
DNA oligonucleotide probes were 3′ end-labelled with terminal deoxynucleotidyl transferase (Tdt) using a protocol adapted from published methods (Dernburg et al., 1996). Briefly, 2 μg of oligo probe was labelled in a 20 μl volume containing 4 μl of 5× Tdt buffer, 1 μl of dTTP (2.5 mM), 1.4 μl of labelled dUTP (1 mM) and 0.6 μl of Tdt enzyme. The probe was incubated at 37°C for 6 hours, purified on a Qiaquick nucleotide removal column (Qiagen), eluted in 90 μl of elution buffer and ethanol-precipitated with 10 mg of glycogen and 0.3 M sodium acetate. Purified probe was resuspended in 20 μl H₂O. Oligonucleotide probes are listed in Table 1.
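The composition of the labelling reaction can be checked with the usual dilution relation, C_final = C_stock × V_added / V_total. The sketch below applies it to the volumes given above, read as microlitres. The helper names are hypothetical. It also assumes that whatever is left of the 20 μl reaction is taken up by the probe and water, which the protocol does not state explicitly.

```python
TOTAL_UL = 20.0   # final reaction volume in microlitres
ENZYME_UL = 0.6   # Tdt enzyme volume in microlitres

# name: (stock concentration, volume added in microlitres), from the protocol
COMPONENTS = {
    "Tdt buffer (x)": (5.0, 4.0),
    "dTTP (mM)": (2.5, 1.0),
    "labelled dUTP (mM)": (1.0, 1.4),
}


def final_concentration(stock, volume_ul, total_ul):
    """Dilution relation: C_final = C_stock * V_added / V_total."""
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock * volume_ul / total_ul


def reaction_summary():
    """Return final concentrations and the volume left for probe plus water."""
    finals = {
        name: final_concentration(stock, volume, TOTAL_UL)
        for name, (stock, volume) in COMPONENTS.items()
    }
    used = sum(volume for _, volume in COMPONENTS.values()) + ENZYME_UL
    remaining = TOTAL_UL - used  # assumed to be probe plus water
    return finals, remaining


if __name__ == "__main__":
    finals, remaining = reaction_summary()
    for name, value in finals.items():
        print(name, round(value, 3))
    print("volume left for probe and water (ul):", round(remaining, 1))
```

Under these assumptions the buffer ends up at 1×, dTTP at 0.125 mM and labelled dUTP at 0.07 mM, with 13 μl left for the probe and water.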

RNA in situ hybridisation
In situ protocols were adapted from previously published methods (Jimenez-Garcia and Spector, 1993). Briefly, HeLa cells were seeded on 22-mm² coverslips at ~15% confluence. Cells were grown overnight, transfected and grown overnight again. After a brief PBS wash they were fixed in 4% formaldehyde in PBS for 15 minutes at room temperature, washed three times in PBS and permeabilised in 0.5% Triton X-100 in PBS on ice for 4 minutes, followed by further PBS washes. Endogenous biotin signals were blocked with a streptavidin-biotin blocking kit (Vector) according to the manufacturer's protocol. The cells were equilibrated in 2×SSC before hybridisation.
For hybridisation, coverslips were mounted in 15 μl of hybridisation mix containing 50% formaldehyde, 2×SSC, 10% dextran sulphate, 50 ng/μl tRNA, 50 ng/μl oligo(dT)20 and 50 ng of labelled probe. Coverslips were sealed with rubber cement and placed in a humid chamber overnight at 37°C. Post-hybridisation washes were 50% formamide in 2×SSC at 37°C, 2×SSC at room temperature, and 1×SSC at room temperature for 30 minutes, each followed by blocking in 4% BSA in 4×SSC for 15 minutes. Antibodies and avidins were applied in 20 μl of 1% BSA in 4×SSC for 45 minutes at room temperature. Biotinylated probes were detected with FITC-avidin (Vector Labs) at 10 ng/μl and digoxigenin-labelled probes with Rhodamine anti-digoxigenin (Roche) at 200 ng/μl. Post-antibody washes were 4×SSC, 0.2% Tween 20 in 4×SSC, 4×SSC, and 2×SSC followed by mounting in 15 μl of Pro-Long anti-fade (Molecular Probes).

Immunocytochemistry
Cells were fixed and permeabilised as described for RNA in situ hybridisation. Coverslips were blocked in 0.2% RNase-free BSA (Boehringer) and then anti-PML (Tsukamoto et al., 2000) or anti-SC35 (Fu and Maniatis, 1990) primary antibodies were applied in BSA-PBS for 45 minutes. Coverslips were washed in PBS and Cy3-conjugated anti-mouse (Jackson) antibody was applied at 10 ng/μl in 0.2% BSA in PBS followed by washing in PBS and 2×SSC. The remainder of the hybridisation protocol was performed as described above.

Microscopy
Images were obtained on a BioRad MRC1000 confocal laser scanning microscope controlled with Lasersharp imaging software. The set-up consisted of an Argon/Krypton laser fitted to a Nikon Diaphot 200 inverted microscope. Images were acquired through a 60× PlanApo oil-immersion objective (NA 1.4) and an iris aperture of 1.0 Airy disk with the microscope in low-signal-low-scan mode. Laser power was set to the lowest level at which the brightest signals completely used the 8-bit dynamic range of the detectors. Images were Kalman filtered (n=5). Raw digital images were imported into Adobe Photoshop as tif files and contrast-

Image analysis
Line scans were acquired using the ImageJ software package (NIH). The overlap coefficient, a measure of pattern colocalisation that is based upon Pearson's correlation coefficient and was developed for light microscopy (Manders et al., 1993), was used to analyse signal colocalisation. This method incorporates information about intensity variations within each channel but does not compare absolute intensities. Overlap coefficients were calculated using Metamorph offline software (Universal Imaging) from images of individual nuclei acquired under standardised conditions. Regression analysis used the least squares method to determine the degree of fit (R²) of changes in the overlap coefficient relative to the length of gene sequence homology in transfected plasmid pairs. For pCMV-GFP-β transfections, images were acquired under standardised conditions and exported to the Metamorph Offline software package. RNA signal intensities were measured for individual nuclei, to eliminate the effects of nonspecific cytoplasmic background, whereas GFP intensities were measured for entire cells. A minimum of 30 cells with TDs and 30 cells with punctate transcription were selected at random for measurement in each experiment.
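For readers unfamiliar with the overlap coefficient, the following minimal sketch computes it in its standard form, r = Σ(AᵢBᵢ) / √(ΣAᵢ² · ΣBᵢ²), where Aᵢ and Bᵢ are the intensities of pixel i in the two channels. This is an illustration only and is not the Metamorph implementation used here, whose thresholding and background handling are not described in the text. The function name and the representation of each channel as a flat list of pixel intensities are assumptions.

```python
from math import sqrt


def overlap_coefficient(channel_a, channel_b):
    """Overlap coefficient for two equally sized lists of pixel intensities."""
    if len(channel_a) != len(channel_b):
        raise ValueError("channels must contain the same number of pixels")
    numerator = sum(a * b for a, b in zip(channel_a, channel_b))
    sum_sq_a = sum(a * a for a in channel_a)
    sum_sq_b = sum(b * b for b in channel_b)
    denominator = sqrt(sum_sq_a * sum_sq_b)
    if denominator == 0:
        raise ValueError("at least one channel contains no signal")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical pixel intensities, not data from the study.
    red = [1.0, 2.0, 3.0]
    green_scaled = [10.0, 20.0, 30.0]  # same pattern, ten times brighter
    print(overlap_coefficient(red, green_scaled))
    print(overlap_coefficient([1.0, 0.0], [0.0, 1.0]))  # no shared pixels
```

Because the denominator rescales each channel by its own total intensity, two channels with the same spatial pattern but different brightness still give a coefficient of 1, whereas channels with no shared pixels give 0.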