Compositionally distinct nuclear pore complexes of functionally distinct dimorphic nuclei in the ciliate Tetrahymena

ABSTRACT The nuclear pore complex (NPC), a gateway for nucleocytoplasmic trafficking, is composed of ∼30 different proteins called nucleoporins. It remains unknown whether the NPCs within a species are homogeneous or vary depending on the cell type or physiological condition. Here, we present evidence for compositionally distinct NPCs that form within a single cell in a binucleated ciliate. In Tetrahymena thermophila, each cell contains both a transcriptionally active macronucleus (MAC) and a germline micronucleus (MIC). By combining in silico analysis, mass spectrometry analysis for immuno-isolated proteins and subcellular localization analysis of GFP-fused proteins, we identified numerous novel components of MAC and MIC NPCs. Core members of the Nup107–Nup160 scaffold complex were enriched in MIC NPCs. Strikingly, two paralogs of Nup214 and of Nup153 localized exclusively to either the MAC or MIC NPCs. Furthermore, the transmembrane components Pom121 and Pom82 localize exclusively to MAC and MIC NPCs, respectively. Our results argue that functional nuclear dimorphism in ciliates is likely to depend on the compositional and structural specificity of NPCs. Summary: There are compositional and structural differences in the nuclear pore complexes present in the functionally differentiated macronucleus and micronucleus within the single cytoplasm of ciliated protozoa.


INTRODUCTION
Ciliated protozoa maintain two distinct nuclei within the same cytoplasm: a somatic macronucleus (MAC) and a germline micronucleus (MIC) (Fig. 1A) (Eisen et al., 2006;Orias et al., 2011;Karrer, 2012). The polyploid MAC is transcriptionally active, and its acentromeric chromosomes segregate during cell division by a spindle-independent amitotic process. In contrast, the diploid MIC has transcriptionally inert, centromeric chromosomes that segregate by canonical mitosis. In Tetrahymena thermophila, DNA replication in the MIC and MAC occurs during non-overlapping periods in the cell cycle. Thus, nuclear dimorphism in ciliates involves nonequivalent regulation of multiple activities in two distinct nuclei (Orias, 2000;Goldfarb and Gorovsky, 2009). This is likely to require targeted transport of components to the MIC versus MAC, for which differences in the NPCs may be important determinants.
Previously, we analyzed 13 Tetrahymena nucleoporins (Nups), and discovered that four paralogs of Nup98 were differentially localized to the MAC and MIC (Iwamoto et al., 2009). The MAC- and MIC-specific Nup98s are characterized by Gly-Leu-Phe-Gly (GLFG) and Asn-Ile-Phe-Asn (NIFN) repeats, respectively, and this difference is important for the nucleus-specific import of linker histones (Iwamoto et al., 2009). The full extent of the compositional differentiation of MAC and MIC NPCs could not, however, be assessed, since only a small subset of the expected NPC components were detected.
Based on prior analysis, T. thermophila appeared to lack homologs of many widely conserved NPC components. These included scaffold Nups (mammalian Nup205, Nup188, Nup160, Nup133, Nup107, Nup85 and Nup53, among others) from the Nup93 and Y-complexes. Similarly, homologs of FG-Nups Nup214, Nup153, Nup62 and Nup58 were also not detected, and neither were TM Nups except for gp210. These NPC components may have evaded homology-based searches due to extensive sequence divergence, given the large evolutionary distance between ciliates and animals, fungi and plants.
To address these ambiguities and to better understand NPC differentiation in T. thermophila, we attempted a comprehensive identification of Nups. First, we analyzed proteins that were affinity captured with known Nups. Furthermore, we mined updated genome and protein databases for characteristic Nup sequences or conserved domains through in silico structure prediction techniques. The resulting expanded catalog of Tetrahymena Nups, combined with localization data, sheds new light on the extent to which NPC architecture can vary within a single species, and even within a single cytoplasm.

RESULTS
The Nup93 complex includes a unique Nup205 ortholog and a novel central channel FG-Nup
In mammalian cells, the Nup93 complex (Fig. 1B) is composed of Nup93, Nup205, Nup188, Nup155 and Nup53 (Fig. S1) (Grandi et al., 1997; Hawryluk-Gara et al., 2005). In T. thermophila, we previously identified homologs for Nup93 (TtNup93; Gene Model identifier TTHERM_00622800) and Nup155 (TtNup155; TTHERM_00760460), and found both of them distributed to both MAC and MIC NPCs (Iwamoto et al., 2009). To identify other Nup93 complex components, we used mass spectrometry to analyze anti-GFP immunoprecipitates from T. thermophila expressing GFP-TtNup93 (Fig. 1C). All of the proteins listed in Table S2 as 'hypothetical protein' were examined by performing a Blast search for similarities to known Nups of other species. In addition, all of the 'hypothetical proteins' were examined through expression profile analysis in the Tetrahymena Functional Genomics Database (TetraFGD) website (http://tfgd.ihb.ac.cn/) [for details see the 'Microarray' page of the TetraFGD; http://tfgd.ihb.ac.cn/tool/exp (Miao et al., 2009) and also see Materials and Methods]. When either the Blast search or the expression profile analysis (details described below) found similarities to any known Nups, we examined its subcellular localization in T. thermophila by ectopically expressing GFP-fused proteins. By means of these [...]
Fig. 1 legend (partial): [...] Table S2). (E) Physical interaction map of Nup93 based on the mass spectrometry results. MW, molecular mass.
Nup308, a protein of 2675 amino acid residues, was previously identified as a Tetrahymena-specific Nup, but it was not assigned to a subcomplex (Iwamoto et al., 2009). Based on PSIPRED analysis, Nup308 is composed of GLFG repeats forming an N-terminal disordered structure (residues 1-570), followed by a large C-terminal α-helix-rich region (residues 571-2675) (Fig. 2). To identify potential Nup308 counterparts, we looked for Nups in other species with similar distributions of secondary structures. Interestingly, a large α-solenoid domain is a predicted feature of both Nup205 and Nup188, conserved core members of the Nup93 complex (Kosova et al., 1999;Andersen et al., 2013), although these proteins do not have FG repeats.
To investigate whether this structural similarity between Tetrahymena Nup308 and Nup205 and Nup188 homologs in other species reflected shared evolutionary origins, we performed a phylogenetic analysis. Nup308 formed a clade with Nup205 orthologs, supported by a bootstrap probability of 72%, but not with Nup188 orthologs (Fig. S2). Nup188 appears to be absent in Tetrahymena, since we failed to find any candidates in either the database or in our mass spectrometry data. Taken together, our results strongly suggest that Nup308 belongs to the Nup93 complex and is orthologous to human Nup205, but has acquired an unusual GLFG repeat domain. Consistent with this assignment, GFP-Nup308 localized similarly to GFP-TtNup93 in the cell, being equally distributed between MAC and MIC NPCs (Iwamoto et al., 2009).
The second Nup candidate identified in TtNup93 pulldowns was TTHERM_00194800. This small protein (deduced molecular mass of 45 kDa) is composed of an N-terminal FG-repeat region and a C-terminal coiled-coil region (Fig. 2), which are characteristics of central channel FG-Nups that are tethered by Nup93 (Chug et al., 2015). The secondary structure characteristics of the novel Tetrahymena Nup are highly similar to those of Nup62 and Nup58, central channel proteins in yeast and vertebrates that interact with Nup93 (Grandi et al., 1993, 1997). Because another protein was identified as an Nup62 ortholog (described below), this protein is likely the Tetrahymena ortholog of Nup58; therefore, we named it TtNup58 (Nup58 in Fig. 1D,E).
Fig. 2 legend (partial): Each Nup is shown as the protein name on the left. Blue, red and black letters denote MIC-specific, MAC-specific and shared components, respectively. Asterisks indicate Nups that are newly identified in this study. The colored components in the illustration are as follows: orange boxes/bars, α-helix; green boxes/bars, β-strand; red slanting lines, FG repeats; blue slanting lines, FX repeats (X means any residue, but the majority are N and Q); purple ellipses, predicted TM domain. Conserved domains are indicated by differently colored bars with standard domain names.

Newly identified members of the Y-complex are likely homologs of conserved Nups
The Y-complex in vertebrates (Fig. 3A) contains ten distinct proteins (Orjalo et al., 2006;Mishra et al., 2010), of which three had identified homologs in T. thermophila (TtSeh1, TtSec13 and TtNup96) (Iwamoto et al., 2009). To investigate whether the remaining seven are present in Tetrahymena but had been overlooked due to sequence divergence, we carried out mass spectrometric analysis of anti-GFP immunoprecipitates from cells expressing the known Y-complex GFP-tagged Nups described below.
First, in precipitates of GFP-TtSeh1, we identified an 86 kDa protein orthologous to Nup85 (Table S3) with a short stretch of four predicted β-strand blades at the N-terminus followed by an α-solenoid domain (Fig. 2). That architecture is typical of Nup85 orthologs that are Y-complex components in other organisms (Brohawn et al., 2008). We therefore tentatively named the T. thermophila protein TtNup85. GFP-TtNup85 localized to NPCs in both the MAC and MIC (Fig. 3B; Fig. S3A).
We then immunoprecipitated GFP-TtNup85, and identified two novel candidate Y-complex core members. Both proteins comprise a β-strand-rich N-terminal half and an α-helix-rich C-terminal half. This domain architecture is characteristic of the Y-complex components Nup160 and Nup133 (Berke et al., 2004; Devos et al., 2004), and we tentatively named the Tetrahymena proteins TtNup160 and TtNup133 ([...] Table S4). GFP-TtNup160 and GFP-TtNup133 localized to NPCs in both nuclei, like other Y-complex components (Fig. 3B; Fig. S3A).
Another conserved Y-complex component is Nup107, which interacts with Nup96. To search for the Tetrahymena homolog we used GFP-TtNup96 as bait and identified a 109 kDa protein (Table S5) that is rich in predicted α-helices, like human Nup107 (Fig. 2). The protein, tentatively named TtNup107, localized as a GFP-tagged construct to NPCs of both nuclei (Fig. 3B; Fig. S3A).
The genes encoding all members of the Y-complex except for Nup96 are co-expressed and exhibit sharp expression peaks at 2 h (denoted C-2) after two cell strains with different mating-types were mixed for conjugation [for details see the 'Microarray' page of the TetraFGD at http://tfgd.ihb.ac.cn/tool/exp (Miao et al., 2009)] (Fig. 3C). In contrast, TtNup96 exhibits an expression peak at 4 h (C-4). This difference in the timing of expression between TtNup96 and the other Y-complex components may be related to a unique aspect of TtNup96 gene structure: TtNup96 is expressed as part of a single transcription unit together with MicNup98B, under the promoter of the MicNup98B gene (Iwamoto et al., 2009).
Three other components of the human Y-complex were not detected in our studies: Nup43, Nup37 and ELYS (also known as AHCTF1). These components may be species specific (Bilokapic and Schwartz, 2012;Rothballer and Kutay, 2012) and genuinely absent from Tetrahymena. They are also absent from S. cerevisiae (Alber et al., 2007) (see Table S1), supporting this idea.

Y-complex components show biased localization to the MIC
As previously reported, GFP-tagged Nup93 complex members and some of the central channel Nups (TtNup93, TtNup308 and TtNup54) were distributed equally between MAC and MIC NPCs, judging by fluorescence intensities (Iwamoto et al., 2009). In striking contrast, all Y-complex components identified so far exhibit distinctively biased localization to the MIC nuclear envelope (NE) compared to the MAC NE (Fig. 3B). Fluorescence intensities in the MIC were 2.69-3.96 times higher than those in the MAC (Fig. 3B). This biased localization of Y-complex components might have been caused by overexpression, because the GFP-tagged proteins were expressed ectopically in addition to the endogenous untagged ones. To address this issue, we examined the localization of Nup160-GFP, Nup133-GFP and Seh1-mCherry expressed from endogenous loci under the control of their native promoters, and therefore expressed at physiological levels. All three proteins showed biased localization, as found for the overexpressed GFP-tagged proteins (compare the images in Fig. 3B and Fig. S3B), suggesting that the biased localization is not caused by overexpression of the tagged proteins. Because the NPC density is similar in the MAC and MIC (Fig. S1 in Iwamoto et al., 2009), the relative concentration of Y-complex components in the MIC NE suggests that the Y-complex is present at a higher copy number per NPC in the MIC than in the MAC (Fig. 3D).
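The copy-number inference above can be written out as a short calculation. The following Python sketch is illustrative only: the function names and the intensity values are hypothetical, and it assumes that NE fluorescence scales with (copies per NPC) × (NPC density), which is the reasoning implied by the text rather than a stated method.

```python
def mic_to_mac_ratio(mic_intensity: float, mac_intensity: float) -> float:
    """Ratio of NE fluorescence intensity, MIC over MAC.

    Inputs are assumed to be background-subtracted intensities
    measured in the same cell (hypothetical values in this sketch).
    """
    if mac_intensity <= 0:
        raise ValueError("MAC intensity must be positive")
    return mic_intensity / mac_intensity


def relative_copies_per_npc(intensity_ratio: float,
                            mic_npc_density: float = 1.0,
                            mac_npc_density: float = 1.0) -> float:
    """Relative copy number per NPC (MIC over MAC).

    Assumes intensity = copies_per_NPC * NPC_density, so the intensity
    ratio is corrected by the ratio of NPC densities. With equal
    densities (the default) the result equals the intensity ratio.
    """
    if mic_npc_density <= 0 or mac_npc_density <= 0:
        raise ValueError("NPC densities must be positive")
    return intensity_ratio * mac_npc_density / mic_npc_density


# Hypothetical intensities chosen to reproduce the lower end of the
# reported 2.69-3.96 range.
ratio = mic_to_mac_ratio(269.0, 100.0)
print(round(relative_copies_per_npc(ratio), 2))
```

If the NPC densities of the two nuclei were not similar, the same function would apply with the measured densities supplied in place of the defaults.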

Newly detected FG-Nups include nucleus-specific and common components
FG-Nups were originally characterized as nucleoporins with domains containing extensive repeats of phenylalanine-glycine (FG) that function in nucleocytoplasmic transport. More recently, we reported a remarkable difference in MAC and MIC NPCs regarding the repeat signature present in four Nup98 paralogs. The repeat signature of MacNup98A and MacNup98B is mostly GLFG, while that of MicNup98A and MicNup98B is mostly NIFN (Fig. 2) (Iwamoto et al., 2009, 2015). We have now taken advantage of the recently improved annotation of the Tetrahymena Genome Database Wiki (http://ciliate.org/index.php/home/welcome), to search for sequences bearing repeats that are similar to those of FG-Nups in other species. We found five candidate FG-Nups. Based on the molecular mass and the positions of predicted α-helices, β-strands and FG-repeat regions, we designated four of these proteins as MicNup214 (TTHERM_00992810), MacNup214 (TTHERM_00755929), MicNup153 (TTHERM_00647510) and MacNup153 (TTHERM_00379010); GFP fusions of MicNup214 and MacNup214 were exclusively localized to the MIC and MAC, respectively (Fig. 4A,B). Fluorescent protein (GFP or mNeon) fusions of MicNup153 were primarily localized to the MIC, with less localizing to the MAC, in most growing cells (Fig. 4A), although these fusion proteins were exclusively localized to the MIC in some cells (Fig. S3C). GFP fusions of MacNup153 were exclusively localized to the MAC (Fig. 4B). The fifth candidate FG-Nup (TtNup62; Nup62 in Fig. 4C), like the novel nucleoporin TtNup58 (Nup58 in Fig. 4C) identified as a central channel protein (discussed above), showed a less-specific distribution, localizing to both the MAC and MIC.
A striking feature of the Nup214 paralogs is that they contain the same nucleus-specific repeat motifs described earlier for TtNup98 paralogs. Like the MIC-specific Nup98 paralogs, MicNup214 contains NIFN repeats (the last N is usually Q in this protein), while MacNup214 contains FG repeats (Fig. 2). This difference may be an important determinant for selective protein transport to the MAC and MIC, as previously shown for TtNup98s (Iwamoto et al., 2009). We note that MacNup214 lacks the β-strand-rich N-terminal region that is found in other Nup214 orthologs (Weirich et al., 2004; Napetschnig et al., 2007) (Fig. 2).
In contrast, MicNup153 and MacNup153 do not differ markedly from one another in their molecular features (Fig. 2). Because the N-terminal domain of human Nup153 is involved in its NPC localization (Enarson et al., 1998), we speculate that the N-terminal domains of MicNup153 and MacNup153 may also be involved in their nucleus-specific localization in Tetrahymena. Further study is required to elucidate their nucleus-specific localization.
While the expression of this set of FG-Nups is upregulated during conjugation (Fig. 4D), the MIC-specific components tend to be expressed 2 h earlier than MAC-specific ones. For example, MicNup214 expression peaks at 2 h in conjugation (C-2) versus MacNup214 at C-4; similarly, MicNup153 peaks at C-6 versus MacNup153 at C-8 (Fig. 4D). The earlier expression of MIC-specific components compared with MAC-specific ones may reflect a selective requirement for MIC-specific NPCs during early stages of conjugation, such as the crescent stage (Sugai and Hiwatashi, 1974). In contrast, the later expression of MAC-specific components probably reflects formation of the new MACs that occurs in the later stages of conjugation.
The fifth candidate FG-Nup identified by this screen was a 39 kDa protein (TTHERM_01122680). This protein is composed of an N-terminal FG-repeat region and a C-terminal coiled-coil region with the characteristics of central channel FG-Nups and is assigned as a nucleoporin NSP1/NUP62 family protein (IPR026010) (Fig. 2). Consequently, this protein is the likely Tetrahymena ortholog of Nup62; therefore, we named it TtNup62. The GFP-tagged protein was distributed to both nuclei (Nup62 in Fig. 4C), similarly to the central channel Nups TtNup58 (Figs 1E and 4C) and TtNup54 (Iwamoto et al., 2009), although TtNup62 was slightly enriched in the MAC NE, whereas TtNup58 was slightly enriched in the MIC NE. The expression profile of TtNup62 was similar to that of TtNup58, with an expression peak after 4 h of conjugation (C-4) (Fig. 4D).
TtNup62 has relatively few repeats in its FG motif compared with homologs such as human Nup62 and S. cerevisiae Nsp1 (Fig. 2), although it has several FX repeats (X=N, Q, A or T in the case of this protein). A feature unique to Tetrahymena is the presence of GLFG repeats in Nup308, an ortholog of Nup205. The Nup93 complex containing Nup205 anchors Nup62 (Vollmer and Antonin, 2014), and it is likely that the Tetrahymena Nup93 complex containing Nup308 anchors TtNup62. Thus, we hypothesize that the GLFG repeats present in Nup308 compensate for the low number of FG repeats of TtNup62 present in the central channel.

Nup88, Nup185 and Tpr
We used a variety of strategies to identify additional Nups. Homology searches against InterPro (http://www.ebi.ac.uk/interpro/) revealed a gene (TTHERM_00455610) with a conserved Nup88 domain 'TtNup88 (PTHR13257:SF0)' (Fig. 2) and an expression profile similar to those of some other Tetrahymena Nups (Fig. 5A). Localization of a GFP fusion to NPCs was highly biased, albeit not exclusive, to the MAC (Fig. 5C). We therefore named this protein TtNup88; Nup88 is known to localize to the cytoplasmic side of the NPC in other species (Fig. 5B). As Nup88 in other species is known to interact with Nup214 and Nup98 (Fornerod et al., 1997), TtNup88 may contribute to the nucleus-specific localization of Nup214 and Nup98 paralogs.
TTHERM_00755920 (encoding a 185 kDa protein), which lies adjacent to the open reading frame (ORF) of MacNup214, attracted our interest because its predicted molecular structure resembled those of large scaffold Nups such as Nup160, Nup155 and Nup133, and because its expression profile is similar to those of some other Tetrahymena Nups (Fig. 5A). A GFP fusion localized to NPCs, with a bias to the MAC (Fig. 5D). Based on its predicted molecular mass, we named this protein Nup185. Nup185 contains a conserved domain annotated as 'Nucleoporin' in the SUPERFAMILY database (SSF117289) (Fig. 2), which is generally found near the N-terminal regions of Nup155 and Nup133 homologs. The expression peak of Nup185 appeared at C-6 (Fig. 5A).
To assess the location of Nup185 within the NPC architecture, we identified interacting proteins by immunoprecipitating GFP-Nup185. One interacting protein was TTHERM_00268040, which bears predicted coiled-coil motifs throughout its entire sequence (Fig. 2) and is thus similar to the nuclear basket component Tpr (Fig. 5B). TTHERM_00268040 fused with GFP localized equivalently to MAC and MIC NPCs (Fig. 5E). This protein is a likely ortholog of human Tpr; therefore, we named it TtTpr. Nup185 did not interact with any members of the Y- or Nup93 complexes (Table S6).

The TM Nups Pom121 and Pom82 show nucleus-specific localization
Some but not all of the TM Nups are conserved between vertebrates and yeasts: the former have POM121, gp210 and NDC1 (Cronshaw et al., 2002; Stavru et al., 2006), while the latter have Pom34, Pom152 and Ndc1 (Rout et al., 2000; Asakawa et al., 2014). The only reported TM Nup in T. thermophila is gp210 (Iwamoto et al., 2009). Because all Tetrahymena Nups identified so far have a similar expression pattern in which a large expression peak appears during early conjugation stage (Figs 3C, 4C and 5A), we used expression profiling and TM domain search to identify possible TM Nups in the updated TetraFGD and the TMHMM Server (http://www.cbs.dtu.dk/services/TMHMM-2.0/), respectively. By using this approach, we found two candidate TM Nups. Each has one TM domain and an FG-repeat region ('TtPom121' and 'TtPom82' in Fig. 6A). Their expression profiles are shown in Fig. 6B.
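The two-criterion screen described above (a conjugation-stage expression peak combined with at least one predicted TM domain) can be sketched as a simple filter. This Python example is a hypothetical illustration, not the procedure actually used: the `Gene` record, the toy gene names and expression values, and the set of stages counted as a Nup-like peak are all assumptions introduced here. In practice the expression values would come from TetraFGD and the TM helix counts from TMHMM.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

# Assumption: stages at which Nup expression peaks are reported in the text
# (C-2 to C-8); the exact window used for the screen is not stated.
NUP_LIKE_PEAK_STAGES: FrozenSet[str] = frozenset({"C-2", "C-4", "C-6", "C-8"})


@dataclass
class Gene:
    gene_id: str                  # hypothetical identifier
    expression: Dict[str, float]  # stage label -> expression level
    tm_helices: int               # number of predicted TM helices


def peak_stage(gene: Gene) -> str:
    """Return the stage label with the highest expression value."""
    return max(gene.expression, key=gene.expression.get)


def candidate_tm_nups(genes: List[Gene],
                      stages: FrozenSet[str] = NUP_LIKE_PEAK_STAGES) -> List[str]:
    """Keep genes with >= 1 predicted TM helix and a Nup-like expression peak."""
    return [g.gene_id for g in genes
            if g.tm_helices >= 1 and peak_stage(g) in stages]


# Toy data (hypothetical gene names and values).
toy_genes = [
    Gene("geneA", {"growth": 1.0, "C-2": 9.0, "C-10": 2.0}, tm_helices=1),
    Gene("geneB", {"growth": 8.0, "C-2": 1.0, "C-10": 2.0}, tm_helices=1),
    Gene("geneC", {"growth": 1.0, "C-4": 7.0, "C-10": 2.0}, tm_helices=0),
]
print(candidate_tm_nups(toy_genes))
```

Candidates that pass such a filter would still require localization analysis, as was done for TtPom121 and TtPom82.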
One of the TM Nup candidates (TTHERM_00312730; TtPom121) has an N-terminal TM domain and C-terminal FG repeats (Fig. 6A, middle) with a deduced molecular mass of 129 kDa. These attributes are very similar to those of vertebrate POM121 (compare top and middle parts of Fig. 6A) (Rothballer and Kutay, 2012). TtPom121 fused with GFP at its C-terminus (TtPom121-GFP) localized specifically to MAC NPCs (Fig. 6C, upper). Consequently, this protein is likely the Tetrahymena ortholog of human POM121; therefore, we named it TtPom121.
Notably, when GFP was fused with the N-terminus of TtPom121 at a region close to the TM domain (GFP-TtPom121), the tagged protein localized in the MAC nucleoplasm, but not in MAC NPCs or the MIC nucleoplasm (Fig. 6C, lower panels). This result suggests that TtPom121 bears a MAC-specific nuclear localization signal (NLS) in its N-terminal region. Similarly, POM121 homologs in vertebrates have NLS sequences in the N-terminal region (Yavuz et al., 2010;Funakoshi et al., 2011).
In contrast, the other TM Nup candidate (TTHERM_00375160; TtPom82) localized exclusively to MIC NPCs (Fig. 6D, upper). This protein has predicted molecular features that have not been reported in Nups from any other organism: a TM domain near the C-terminus, a central coiled-coil and N-terminal FG repeats (Fig. 6A, bottom). We named this protein TtPom82 according to its predicted molecular mass (82 kDa). A construct lacking the TM domain showed diffuse cytoplasmic localization (Fig. 6D, lower panels), suggesting that MIC NPC-specific localization of TtPom82 does not depend on the MIC-specific nuclear transport of TtPom82. This result suggests that TtPom121 and TtPom82 use different mechanisms to target to the MAC and MIC NPCs.
Next, we performed immuno-electron microscopy (iEM) for the Pom proteins using anti-GFP antibody in order to determine their sub-NPC localization. Intriguingly, their sub-NPC localizations were opposite; Pom121 was exclusively localized to the nuclear side of the MAC NPC (Fig. 6E), whereas Pom82 was exclusively localized to the cytoplasmic side of the MIC NPC (Fig. 6F).
Given the difference in molecular features, their behaviors when the TM domain function was disrupted, and their sub-NPC localizations, Pom121 and Pom82 are unlikely to be functional homologs of each other. Taken together, these findings lead to the conclusion that MAC and MIC NPCs contain distinct TM components (Fig. 6G,H). The protein components of MAC and MIC NPCs are summarized in Fig. 7.
One TM Nup, found in both fungi and animals but missing from our Tetrahymena catalog, is Ndc1. We identified a potential Ndc1 homolog in TTHERM_00572170, a protein with six predicted TM domains that is co-transcribed with other Nups (see http://tfgd.ihb.ac.cn/search/detail/gene/TTHERM_00572170). However, neither N- nor C-terminal GFP fusions of this protein localized to NPCs (Fig. S3D). Therefore, Tetrahymena NPCs may lack Ndc1. Similarly, Ndc1 has not been detected in Trypanosoma NPCs (Obado et al., 2016).

The permeability of the nuclear pore differs between MAC and MIC
To better understand the functional consequences of structural differences between MAC and MIC NPCs, we examined the relative pore exclusion sizes by asking whether probes of different sizes could gain access to each nucleoplasm. GFP (∼28 kDa) was excluded only from MICs, whereas GFP-GST (more than 100 kDa owing to its oligomerization) was excluded from both MACs and MICs (Fig. S4A). In addition, FITC-dextran of 40 kDa could enter MACs, whereas 70-kDa FITC-dextran was completely excluded (Fig. S4B). These results indicate that MAC pores exclude molecules greater than ∼50 kDa, which is similar to the permeability size limit of nuclear pores in other species (Paine et al., 1975; Gorlich and Mattaj, 1996; Keminer and Peters, 1999). On the other hand, MIC pores impose a much smaller exclusion size, and exclude molecules of even 10-20 kDa (Fig. S4B). This difference in exclusion size may be due to differences between the protein composition and structural arrangement of NPCs of these dimorphic nuclei.
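The way probe data bracket an exclusion limit can be made explicit with a small calculation. The Python sketch below is illustrative: the function name is hypothetical, and the 100 kDa entry for GFP-GST is a nominal value standing in for 'more than 100 kDa'. Only the MAC observations stated in the text are used. The largest probe that entered gives a lower bound and the smallest excluded probe gives an upper bound.

```python
from typing import List, Optional, Tuple


def exclusion_bracket(
    observations: List[Tuple[float, bool]],
) -> Tuple[Optional[float], Optional[float]]:
    """Bracket a pore exclusion limit from (size_kDa, entered) observations.

    Returns (lower, upper): the largest probe size that entered the
    nucleus and the smallest probe size that was excluded. Either bound
    is None when no probe of that kind was observed.
    """
    entered = [size for size, ok in observations if ok]
    excluded = [size for size, ok in observations if not ok]
    lower = max(entered) if entered else None
    upper = min(excluded) if excluded else None
    return lower, upper


# MAC observations as stated in the text.
mac_probes = [
    (28.0, True),    # GFP entered the MAC
    (40.0, True),    # 40 kDa FITC-dextran entered the MAC
    (70.0, False),   # 70 kDa FITC-dextran was excluded
    (100.0, False),  # GFP-GST, nominal value for "more than 100 kDa"
]
print(exclusion_bracket(mac_probes))  # (40.0, 70.0)
```

The resulting bracket of 40-70 kDa is consistent with the ∼50 kDa limit quoted for MAC pores; the same function applied to MIC observations would return a much lower upper bound.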

DISCUSSION
We have now identified 28 nucleoporins in the ciliate T. thermophila: 15 Nups reported here, and 13 in our previous study (Iwamoto et al., 2009). This total comprises 24 different Nups for the MAC and MIC: this number includes 18 Nups that are localized in both nuclei, four Nups with nucleus-specific homologs (Nup214, Nup153, Nup98A, and Nup98B), and TtPom82 and TtPom121. This total is somewhat smaller than the roughly 30 Nups known in other eukaryotes, e.g. 34 in human and in Drosophila melanogaster, 27 in Caenorhabditis elegans, 33 in S. pombe and 35 in S. cerevisiae (Rothballer and Kutay, 2012; Asakawa et al., 2014). The deficit in T. thermophila Nups is due to the absence of homologs for Nup358, GLE1, human CG1 (also known as NUPL2; ScNup42), Nup43, Nup37, centrin-2, Nup53, TMEM33, ELYS and Aladin. Similarly, the protist Trypanosoma brucei is missing homologs of Nup358, GLE1, human CG1, Nup37, centrin-2, TMEM33 and ELYS, and 25 Nups in total have been identified by interactome analysis (DeGrasse et al., 2009; Obado et al., 2016). One conserved Nup identified in Trypanosoma but not Tetrahymena is Nup53 (TbNup65; Genbank XP_822630.1) (Obado et al., 2016). This raises the question of whether a T. thermophila Nup53 homolog eluded our search due to sequence or structural divergence. Alternatively, T. thermophila may have lost a Nup that is not essential for viability.
Fig. 7 legend (partial): [...] Pom121; P82, Pom82; 98, Nup98 paralogs; 214, Nup214; 153, Nup153]. Green boxes represent shared components including the nuclear basket structure Tpr and its associated Nup50 (50). TtNup50 is distributed mostly in the nucleoplasm in MACs, whereas it localizes to the NPC in MICs (Malone et al., 2008; Iwamoto et al., 2009). Yellow boxes are MIC-biased Y-complexes, and purple boxes are MAC-biased TtNup88 (88). The number of duplications of yellow and purple boxes does not reflect the actual quantity of those components in vivo. Homologs of Nup358 (358), hCG1 (CG), Aladin (AL), and ELYS constituting the cytoplasmic structure, were not found in T. thermophila.
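The Nup tally given at the start of the Discussion can be checked with simple arithmetic. The Python sketch below is an interpretation rather than something stated outright: it assumes that each of the four nucleus-specific families (Nup214, Nup153, Nup98A, Nup98B) contributes one MAC and one MIC protein, which reconciles the 24 'different Nups' with the 28 identified proteins. The function name is hypothetical.

```python
from typing import Tuple


def nup_tally(shared: int, paralog_families: int,
              nucleus_specific_tm: int) -> Tuple[int, int]:
    """Return (distinct Nup kinds, total proteins).

    Assumption: every paralog family has exactly two members,
    one MAC-specific and one MIC-specific.
    """
    kinds = shared + paralog_families + nucleus_specific_tm
    proteins = shared + 2 * paralog_families + nucleus_specific_tm
    return kinds, proteins


# 18 shared Nups, 4 paralog families, 2 nucleus-specific TM Nups
# (TtPom121 and TtPom82).
kinds, proteins = nup_tally(shared=18, paralog_families=4, nucleus_specific_tm=2)
print(kinds, proteins)  # 24 28

# Cross-check against the split by study: 15 reported here + 13 previously.
assert proteins == 15 + 13
```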

A role for nucleus-specific Nups
We previously reported that the GLFG-repeat and NIFN-repeat domains in MacNup98A and B, and MicNup98A and B, respectively, are involved in the nucleus-specific transport of linker histones (histone H1 and MLH, respectively), arguing that these nucleus-specific Nups are determinants of nucleus-specific transport (Iwamoto et al., 2009). Importantly, we can now expand this argument, since our expanded catalog shows that all NPC subunits that are nucleus-specific are FG-Nups (Nup214, Nup153, Nup98 and Pom proteins). Since the FG repeats interact with nuclear transport receptors such as importin-β family proteins (Allen et al., 2001; Isgro and Schulten, 2005; Liu and Stewart, 2005; Tetenbaum-Novatt et al., 2012), specificity for the MAC or MIC is likely to be determined in cooperation with importin-βs. This idea is also supported by the presence of nucleus-specific importin family proteins (Malone et al., 2008).
It is interesting to note that both MAC- and MIC-specific Nups contain atypical repeat motifs including the NIFN motif and also more subtle variations on the FG repeat (FN, FQ, FA, FS and so on) (Fig. 2). Because the NIFN-repeat domain of MicNup98A is known to function in blocking misdirected nuclear transport of MAC-specific linker histones (Iwamoto et al., 2009), the atypical FG repeats may similarly be involved in controlling nucleus-specific transport of particular proteins. However, importin-βs that preferentially interact with the NIFN repeat and their cargos have not been found, and thus the complete role of the NIFN-repeat motif in nucleus-specific transport remains to be elucidated.

A role of biased Nups to build different NPC structures
The nucleus-specific Nups generate obvious structural differences between MAC and MIC NPCs. However, these different components have to be integrated into two NPC scaffold structures that are constructed of the same components. One way to make different structures from the same components is to incorporate different amounts of these components, leading to different structures that allow biased localization/assembly of nucleus-specific components. The localization of the Y-complex (Fig. 3B) and Nup88 (Fig. 5C) was highly biased to MICs and MACs, respectively. Thus, these biased components may be critical for directing assembly of MAC- or MIC-type NPCs. Consistent with this idea, Nup98 homologs in vertebrates interact with the Y-complex components Nup96 (Hodel et al., 2002) and Nup88 (Griffis et al., 2003). This model raises the question of how structurally similar paralogs in Tetrahymena can differentially recruit nucleus-specific FG-Nups.
The copy number of the Y-complex within individual NPCs differs between the MAC and MIC (Fig. 3B,D), indicating that at least two NPC structures with different Y-complex stoichiometries can form in ciliates. This quantitative difference in Y-complex incorporation may be directed by membrane Nups. The nucleus-specific TM Nups Pom121 and Pom82 are currently strong candidates for initiating NPC assembly on the nuclear membrane. In vertebrates, Pom121 binds the Y-complex through a Nup160 homolog (Mitchell et al., 2010). In Tetrahymena, TtPom121 and TtPom82 may differentially affect Y-complex integration into MAC or MIC NPCs. This model can be extended to biased integration of Nup98 paralogs, since Pom121 has been shown to directly bind Nup98 proteins (Mitchell et al., 2010), supporting our idea that biased Nups and nucleus-specific Nup98 paralogs cooperate to build two distinct NPCs. In this model, the acquisition of specialized Pom proteins might have been one of the most crucial evolutionary events for generating nuclear dimorphism in ciliates. Taken overall, our study contributes to understanding the diversity of NPC architectures in eukaryotes, including potential functional and evolutionary aspects.

DNA construction
cDNAs were amplified by PrimeSTAR reagent (Takara, Kyoto, Japan) from the reverse transcripts prepared from the total RNA fraction of vegetative or conjugating cells as described previously (Iwamoto et al., 2009). The cDNAs were digested with XhoI and ApaI, and cloned into the pIGF1 vector to ectopically express them as N-terminal GFP-tagged proteins (Malone et al., 2005). The pIGF1C vector with the multi-cloning site at the 5′ site of the GFP-coding sequence was generated by modifying the pIGF1 vector, and used to ectopically express GFP-tagged Nup58 and Pom121 as C-terminal GFP-tagged proteins; the cDNAs of these Nups were cloned into the pIGF1C vector using the XhoI and KpnI sites. To endogenously express Nups tagged with a fluorescent protein at the C-termini of the macronuclear ORFs, MicNup214, Nup160, and Nup133 were tagged with GFP using a pEGFP-neo4 vector (Mochizuki, 2008) (a kind gift from Kazufumi Mochizuki, Institute of Molecular Biotechnology of the Austrian Academy of Sciences, Vienna, Austria), MicNup153 was tagged with mNeon using a p2xmNeon_6xmyc_Neo4 vector (a kind gift from Aaron Turkewitz, University of Chicago, Chicago, IL), and Seh1 was tagged with mCherry using a pmCherry-pur4 vector. Primers used in this study are listed in Table S7.

Expression of GFP-tagged Nups in Tetrahymena cells
Conjugating cells were subjected to transfection by electroporation using a Gene Pulser II (Bio-Rad, Hercules, CA) as described previously (Iwamoto et al., 2015). The resulting cell suspension was cultivated for 18 h and then treated with paromomycin sulfate (Sigma-Aldrich, St Louis, MO) at 120 µg/ml when using pIGF1, pIGF1C, pEGFP-neo4 and p2xmNeon_6xmyc_Neo4 vectors, or puromycin dihydrochloride (Fermentek, Jerusalem, Israel) at 200 µg/ml when using a pmCherry-pur4 vector. Cadmium chloride was also added at 0.5 µg/ml to induce the expression of drug-resistance genes for pEGFP-neo4, p2xmNeon_6xmyc_Neo4, and pmCherry-pur4 vectors. Resistant cells usually appeared within a few days after the drug was added. We checked that at least five independent clones (i.e. grown in five different wells) exhibited the same intracellular localization of each GFP-Nup.

Immunoprecipitation
For immunoprecipitation, GFP-Nup-expressing cells in logarithmic growth were pretreated with 0.5 mM phenylmethylsulfonyl fluoride (PMSF) for 30 min at 30°C and then collected by centrifugation (700 g for 1 min). The cells were resuspended at 2.5×10⁶ cells/ml in homogenization buffer composed of 150 mM NaCl, 1% Triton X-100, 2 mM PMSF, and Complete protease inhibitor cocktail (Roche Diagnostics, Mannheim, Germany), and then homogenized by sonication on ice. The supernatant obtained after centrifugation at 10,000 g for 15 min was pretreated with Protein-A-Sepharose to absorb non-specifically bound proteins. After removal of the beads by low-speed centrifugation (720 g for 5 min), the supernatant was incubated with 50 µg anti-GFP rabbit polyclonal antibody (#600-401-215, Rockland Immunochemicals, Limerick, PA) for 2 h at 4°C. To collect immunoprecipitated target proteins of interest, fresh Protein-A-Sepharose was added, incubated for another 2 h at 4°C, and then collected by centrifugation (720 g for 5 min). After a brief washing with homogenization buffer, the Sepharose beads were incubated with NuPAGE sample buffer (Thermo Fisher Scientific, Waltham, MA) to elute bound proteins. The proteins were separated by SDS-PAGE.

Mass spectrometry analysis
The gel sample lane was cut into several pieces, and each piece was treated with trypsin. The trypsinized peptide sample was subjected to liquid chromatography-tandem mass spectrometry (LC-MS/MS) using the LXQ linear ion trap (Thermo Finnigan, San Jose, CA) equipped with a Magic2002 and nanospray electrospray ionization device (Michrom BioResources, Auburn, CA and AMR, Tokyo, Japan), as described previously (Obuse et al., 2004). The LC-MS/MS data were searched by Mascot (Matrix Science, London, UK) with a non-redundant T. thermophila-specific database (25,131 sequences) constructed from the nr NCBI database. The resulting files were loaded into Scaffold software (Proteome Software, Portland, OR) for comparing identified proteins between samples.

Microscopic observation
Intracellular localizations of GFP-tagged Nups were observed by performing fluorescence microscopy (IX-70; Olympus, Tokyo, Japan). Images were taken using the DeltaVision microscope system (GE Healthcare, Issaquah, WA) with oil-immersion objective lens UApo40 (NA=1.35) (Olympus). Line profiles of fluorescence intensity were obtained with a measurement tool included in the DeltaVision system. Background fluorescence was measured from the cytoplasm as an averaged value of 5×5 pixels and was subtracted from the peak values of fluorescence on the NE.
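The background correction described above can be illustrated with a minimal Python sketch. It is not the DeltaVision measurement tool itself; the function names, the toy image and the line profile are hypothetical and chosen only to show the arithmetic (mean of a 5×5 cytoplasmic patch subtracted from the peak of a line profile across the NE).

```python
def background_mean(image, top, left, size=5):
    """Mean intensity of a size x size patch (5x5 by default).

    `image` is a list of rows of pixel intensities; (top, left) is the
    top-left pixel of the cytoplasmic patch. Hypothetical helper.
    """
    patch = [image[r][c]
             for r in range(top, top + size)
             for c in range(left, left + size)]
    return sum(patch) / len(patch)


def corrected_peak(profile, background):
    """Peak value of a line profile across the NE minus the background."""
    return max(profile) - background


# Toy data (assumed values, not measurements from this study).
image = [[10] * 8 for _ in range(8)]
image[2][3] = 35                      # one brighter pixel inside the patch
bg = background_mean(image, top=1, left=1)

profile = [12, 40, 180, 45, 11]       # intensities along a line crossing the NE
peak = corrected_peak(profile, bg)
print(bg, peak)
```

Averaging over a patch rather than reading a single pixel makes the background estimate less sensitive to isolated bright pixels in the cytoplasm.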

Indirect immunofluorescence staining
Tetrahymena cells expressing GFP-tagged Nups were first fixed with cold methanol for 20 min, and then additionally fixed with 4% formaldehyde in PBS for 20 min. After treatment with 1% bovine serum albumin (BSA), cells were treated with 5 µg/ml anti-GLFG monoclonal antibody 21A10 for 2-3 h (Iwamoto et al., 2013). After washing with PBS, cells were treated with Alexa Fluor 594-conjugated goat anti-mouse IgG (Thermo Fisher Scientific) at 1:1000 dilution for 1 h. Images of 40 z-sections at 0.2-μm intervals were taken using the DeltaVision microscope system with an oil-immersion objective lens PlanApoN60OSC (NA=1.4) (Olympus), and were processed by deconvolution using the SoftWoRx software supplied with the microscope.

Immuno-electron microscopy
Tetrahymena cells expressing GFP-tagged Nups were fixed with 4% formaldehyde for 30 min. After washing three times with PBS, they were permeabilized with 0.1% saponin for 15 min at room temperature. After treatment with 1% BSA, cells were incubated with anti-GFP polyclonal antibody (cat. no. 600-401-215; Rockland Immunochemicals) at 1:200 dilution for 2 h, washed three times with PBS, then incubated with FluoroNano gold-conjugated anti-rabbit Fab′ also conjugated to Alexa Fluor 594 (Nanoprobes, Yaphank, NY) at 1:400 dilution for 1 h. The immunolabeled cells were fixed with 2.5% (w/v) glutaraldehyde (Nacalai tesque, Kyoto, Japan) for 1 h. After washing with 50 mM HEPES (pH 5.8), they were incubated with silver enhancement reagent (Tange et al., 2016) for 7 min. The reaction was stopped by washing three times with distilled water. Then the cells were post-fixed with 1% OsO₄ for 15 min, electron stained with 2% uranyl acetate for 1 h, dehydrated with sequentially increased concentrations of ethanol and embedded in epoxy resin (Epon812). The ultrathin sections sliced from the resin block were stained with 4% uranyl acetate for 15 min and lead citrate (Sigma-Aldrich) for 1 min, and observed with a transmission electron microscope JEM-1400 (JEOL, Tokyo, Japan) with an acceleration voltage of 80 kV.