ABSTRACT
A novel intermediate filament cDNA, pG-IF, has been isolated from a Drosophila melanogaster embryonic expression library screened with a polyclonal antiserum produced against a 46 kDa cytoskeletal protein isolated from Kc cells. This 46 kDa protein is known to be immunologically related to vertebrate intermediate filament proteins. The screen resulted in the isolation of four different cDNA groups. Of these, one has been identified as the previously characterized Drosophila nuclear lamin cDNA, Dm0, and a second, pG-IF, demonstrates homology to Dm0 by cross hybridization on Southern blots. DNA sequence analysis reveals that pG-IF encodes a newly identified intermediate filament protein in Drosophila. Its nucleotide sequence is highly homologous to nuclear lamins with lower homology to cytoplasmic intermediate filament proteins. pG-IF predicts a protein of 621 amino acids with a predicted molecular mass of 69,855 daltons. In vitro transcription and translation of pG-IF yielded a protein with an SDS-PAGE-estimated molecular mass of ∼70 kDa. It contains sequence principles characteristic of class V intermediate filament proteins. Its near neutral pI (6.83) and the lack of a terminal CaaX motif suggest that it may represent a lamin C subtype in Drosophila. In situ hybridization to polytene chromosomes detects one band of hybridization on the right arm of chromosome 2 at or near 51A. This, in conjunction with Southern blot analysis of various genomic digests, suggests one or more closely placed genes, while Northern blot analysis detects two messages in Kc cells.
INTRODUCTION
Intermediate filaments (IFs) are a major protein component of the detergent-insoluble cytoskeleton. In vertebrates, a growing multigene family encodes these proteins, which have thus far been divided into six subclasses based on protein and DNA structural homologies (Lendahl et al., 1990; reviewed by Steinert and Roop, 1988). Cytoplasmic IFs (classes I-IV, VI) form a 10 nm filamentous network which extends from the plasma membrane to the nuclear envelope. Their expression patterns are cell- and tissue-specific while the nuclear IFs or lamins (class V) are expressed by all vertebrate cells (reviewed by Krohne and Benavente, 1986). These proteins do not form an analogous 10 nm network but a meshwork of filaments comprising the karyoskeleton. All IF proteins share a conserved secondary structure consisting of a central alpha-helical rod domain flanked by variable amino- and carboxyl-terminal end domains.
The detergent-insoluble cytoskeleton is less well characterized in invertebrates than in vertebrates and most IF classification and study has been done in vertebrate systems. A few invertebrate IF proteins have been identified by cDNA or gene cloning (Dodemont et al., 1990; Szaro et al., 1991; Gruenbaum et al., 1988) or by protein sequencing and filament assembly (Weber et al., 1988, 1989; Bartnik et al., 1985, 1986). All have been classified as Class V IFs regardless of their tissue origin, implying less IF complexity in invertebrates. Cytoplasmic IF proteins were identified in epithelial cells of the gastropod mollusk Helix pomatia and muscle of the nematode Ascaris lumbricoides, based on in vitro filament assembly, sequence principles and sequence identities (Bartnik et al., 1985, 1986; Weber et al., 1988, 1989). Invertebrate IF genes characterized include the nuclear lamin gene, Dm0, from Drosophila melanogaster (Osman et al., 1990) and a lamin-related IF gene in Helix aspersa (Dodemont et al., 1990), which encodes a cytoplasmic IF. cDNA sequence recently reported for two neurofilament proteins of the squid Loligo pealei also revealed class V IF characteristics (Szaro et al., 1991). Typical IF morphology was absent in an electron microscope survey of arthropods (Bartnik and Weber, 1989), leading to the suggestion that cytoplasmic IFs may not be present in this group. However, a group of IF-related proteins has been implicated in Drosophila by morphological and immunological criteria (Falkner et al., 1981; Walter and Biessmann, 1984a,b) but rigorous characterization has not been reported.
Although IFs are nearly ubiquitous, the function of both the cytoplasmic and nuclear IFs is unknown. Their identification and characterization in a genetically manipulable eukaryote such as Drosophila could clearly contribute to the study of the evolution and function of IFs. We describe here the characterization of a cDNA isolated from an expression library using an antibody against a cytoplasmic Drosophila protein. This protein has a molecular mass of 46 kDa, is immunologically related to vertebrate IFs (Falkner et al., 1981; Walter and Biessmann, 1984a,b; Sherwood et al., 1989) and shares cellular distribution, phosphorylation and solubility characteristics expected of IFs (Bossie et al., unpublished data). The cDNA codes for a novel Drosophila protein with sequence and predicted protein characteristics most like the class V nuclear lamins.
MATERIALS AND METHODS
Materials
ELISA grade bovine serum albumin (BSA) and 3,3′-diaminobenzidine (DAB) were purchased from Sigma; bromochloroindolyl phosphate/nitro blue tetrazolium (BCIP-NBT) substrate from Kirkegaard and Perry; biotin-14-dATP from GIBCO BRL; [35S]methionine from Amersham; nitrocellulose membranes from Schleicher and Schuell; PVDF membranes from Millipore; pGEM-4Z vector, RNasin and rabbit reticulocyte lysate from Promega; avidin-linked horseradish peroxidase (HRP) from ENZO Diagnostics, Inc.; and Protein A-linked HRP from Cappell. Kits utilized for various procedures included the Gene Clean Kit from BIO 101, the Multiprime Labeling System and the ECL Western blotting kit from Amersham, the Riboprobe System II from Promega and the Sequenase Kit from U.S. Biochemicals. The λgt11 expression library was a gift from T.-S. Hsieh, Duke University, and the Dm0 cDNA, purified Dm1 and Dm2 proteins and anti-Dm antibody were gifts from Paul Fisher and Nico Stuurman, SUNY at Stony Brook.
Cell lines and culture conditions
Kc cell lines were maintained at 25°C in D22 medium containing 10% fetal calf serum as described previously (Sanders, 1981). The Kc clone, EC-1, and the 4,4′-diisothiocyanostilbene-2,2′-disulfonate (DIDS)-resistant Kc cell line, P-20, were maintained as described previously (Sherwood et al., 1989).
Protein preparation and SDS-PAGE
Nuclease-treated cytoskeleton proteins (NCP) were isolated as described by Bossie et al. (unpublished data). For the heavily loaded gels shown here, 5 × 10⁶ cell equivalents of the NCP fraction were applied to each lane. A standard discontinuous one-dimensional SDS-PAGE protocol was followed using 10 or 12.5% gels (Sanders, 1981; Sanders et al., 1986) with electrophoresis at 4 mA (overnight) or 15-20 mA constant current.
Polyclonal antibodies
The 46 kDa protein was gel purified and used to inject rabbits in complete Freund’s adjuvant. Polyclonal sera designated NJ-2 (raised by Hazleton Research Products Inc., Denver, PA) and VB-2 (raised in house) were utilized as described.
Immunoblots
The protocol for transferring proteins from SDS-gels to PVDF membranes was adapted from Matsudaira (1987) and utilized 10 mM CAPS, pH 11, 10% methanol and electrophoretic transfer at 200 mA for 1 hour at room temperature. Reaction of the blots with antibodies and washing protocols have been described (Bossie et al., unpublished data). The Amersham ECL Western chemiluminescent substrate system for HRP-Protein A was used for detection according to the manufacturer’s instructions.
Isolation of cDNA clones
The Drosophila cDNA expression library was screened immunologically according to established procedures (Sambrook et al., 1989) using duplicate plaque lifts from sixteen 150 mm Petri plates at densities of 10⁴ to 5 × 10⁴ plaques per plate. The filters were blocked in 3% BSA in 10 mM Tris-HCl, pH 8, 150 mM NaCl, 0.05% Tween-20 (TNT) for 1 h and incubated with a 1:1000 dilution of antiserum in 1% BSA-TNT for 4 h. Immunopositives were detected with the BCIP-NBT substrate after a 1 h incubation with a 1:1500 dilution of Protein A-linked alkaline phosphatase in 1% BSA-TNT. Immunopositive plaques were purified by two more dilution rescreens and further tested by screening with preimmune serum.
To obtain the full-length cDNA, a ∼100 bp fragment from the 5′ end of pG-10 was isolated from an EcoRI-HindIII digest of pG-10 and end-labeled (Sambrook et al., 1989). Duplicate lifts of ∼3 × 10⁵ plaques, from the same cDNA library, were screened with this probe using high-stringency hybridization and wash conditions (Sambrook et al., 1989). Eight positive plaques were obtained and purified.
λDNA was isolated by a liquid lysate method (Chisholm, 1989). For further manipulations, the cDNAs were subcloned into the unique EcoRI site of the pGEM-4Z vector.
Probes
DNA fragments to be used as probes were isolated from agarose gels using the Gene Clean Kit. [32P]dATP-labeled DNA probes for the Southern and Northern blots were synthesized using the Multiprime Labeling System and heat denatured immediately before addition to hybridization solutions as described below. Biotinylated probe for the in situ hybridization was synthesized using 30 μM biotin-14-dATP and 30 ng pG-IF insert in a 2 h reaction in the same conditions.
Southern blots
DNA was transferred from 0.9% agarose gels by capillary transfer to nitrocellulose membranes in 10 × SSC (1 × SSC is 0.15 M NaCl, 0.015 M sodium citrate). The filters were rinsed, air dried and baked at 80°C for 2 h. Prehybridization was carried out in 6 × SSC, 5 × Denhardt’s solution (0.1% Ficoll, 0.1% polyvinylpyrrolidone and 0.1% BSA), 0.5% SDS and 100 μg/ml fragmented denatured salmon sperm DNA for 2 h at 68°C and hybridization in fresh solution was carried out overnight at 68°C. Filters were washed in 2 × SSC, 0.5% SDS at room temperature, 0.1 × SSC, 0.5% SDS at 37°C for 60 minutes and 0.1 × SSC, 0.5% SDS at 68°C for 60 minutes. Autoradiography was done with Kodak XAR film.
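The SSC strengths used in the transfer, hybridization and wash steps all follow from the 1 × definition given above (0.15 M NaCl, 0.015 M sodium citrate). As a minimal sketch, the helper below converts a fold strength into molar concentrations; the function and variable names are illustrative assumptions and are not part of the original protocol.

```python
# 1x SSC as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at `fold` x strength.

    `ssc_molarity` is a hypothetical helper name used only for illustration.
    """
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    # Round to suppress floating-point noise in the printed values.
    return {name: round(conc * fold, 6) for name, conc in SSC_1X_MOLAR.items()}


if __name__ == "__main__":
    # Strengths that appear in the Southern and Northern protocols.
    for fold in (0.1, 2, 6, 10, 20):
        print(f"{fold} x SSC:", ssc_molarity(fold))
```

The stringency of the final washes comes from the low salt of 0.1 × SSC combined with the 68°C temperature; the helper only makes the salt arithmetic explicit.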
Northern blots
Kc RNA was isolated from a pellet of 5 × 10⁷ cells by resuspending in 250 μl 20 mM Tris-HCl, pH 7.4, 140 mM KCl, 5 mM Mg(OAc)2, 0.1 mM EDTA, 0.5 mM DTT, 0.3 M sucrose, 0.35% Triton X-100, 0.35% deoxycholate plus 80 units RNasin and vortexing vigorously for two 30-second periods. Nuclei were pelleted by centrifugation at 13000 g for 30 seconds, and an equal volume of 20 mM Tris-HCl, pH 7.8, 10 mM EDTA, 1% SDS plus 7 μl 20 mg/ml Proteinase K were added to the supernatant and incubated at 37°C for 30 minutes. RNA was extracted twice each with phenol/chloroform/isoamyl alcohol (25:24:1, by vol.) and chloroform/isoamyl alcohol (24:1, v/v), the aqueous phase was adjusted to 0.3 M sodium acetate and RNA was precipitated with ethanol. The yield was typically 25-30 μg RNA per 10⁷ cells.
For electrophoresis, RNA was denatured in 50% formamide/13% formaldehyde and 10⁷ cell equivalents of RNA were loaded per lane on 1% agarose/0.8% formaldehyde gels. Following electrophoresis in 10 mM sodium phosphate buffer (pH 7.0), the gels were soaked for 30 minutes in two changes of sterile water followed by 1-2 hours in 20 × SSC. Transfer to nitrocellulose was done by capillary action in 20 × SSC. Blots were dried and baked at 80°C for 2 hours. Prehybridization was at 42°C for 2 hours in 50% formamide, 0.36 M NaCl, 2 mM EDTA, 20 mM NaH2PO4, pH 7.4, 0.1% SDS and 100 μg/ml fragmented, denatured salmon sperm DNA. Hybridization with probe added to fresh prehybridization solution was at 42°C overnight.
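The typical yield quoted above (25-30 μg RNA per 10⁷ cells) fixes both the expected total from one preparation and the approximate mass loaded per lane. The short sketch below works through that arithmetic; it assumes the yield scales linearly with cell number, which the text does not state, and the names are illustrative only.

```python
# Typical yield from the text: 25-30 micrograms of RNA per 1e7 Kc cells.
YIELD_UG_PER_1E7_CELLS = (25.0, 30.0)


def expected_rna_ug(cells):
    """Return the (low, high) expected RNA mass in micrograms for `cells` cells.

    Assumes linear scaling with cell number (an assumption, not a claim
    made in the protocol).
    """
    low, high = YIELD_UG_PER_1E7_CELLS
    scale = cells / 1e7
    return (low * scale, high * scale)


if __name__ == "__main__":
    print("Whole preparation (5e7 cells):", expected_rna_ug(5e7))
    print("One gel lane (1e7 cell equivalents):", expected_rna_ug(1e7))
```

Under that assumption a pellet of 5 × 10⁷ cells gives roughly 125-150 μg, and each lane carries roughly 25-30 μg of RNA.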
DNA sequencing
Double-stranded plasmid templates were prepared from mini plasmid preparations of pGEM-4Z recombinants (Kraft et al., 1988). Priming and dideoxy sequencing with 35S-dATP utilized the U.S. Biochemical Sequenase kit. Gels were exposed to Kodak XAR film at −70°C for 1-7 days. The GCG Sequence Analysis Software Package (Devereux et al., 1984) was used to analyze nucleotide and predicted amino acid sequences.
In vitro analysis of pG-IF
cDNA inserts in pGEM-4Z with the open reading frame under the control of the SP6 promoter were selected by restriction enzyme analysis. Templates were linearized by cleaving in the unique polylinker site with SmaI. Capped mRNA was produced in vitro using the Riboprobe System II kit. After incubation at 38°C for 1 hour, an additional 40 units of polymerase was added for another hour. Following phenol/chloroform and chloroform extractions and ethanol precipitation of the nucleic acid, transcription products were translated in vitro. The reaction mixture consisted of 35 μl nuclease-treated rabbit reticulocyte lysate, 40 units RNasin, 1 μl 1 mM amino acid mix (−)methionine, 2 μl capped RNA template (heated to 67°C for 10 minutes), 4 μl [35S]methionine (1000 Ci/mmole, 15 mCi/ml) in a 50 μl final volume. Reactions were incubated at 30°C for 70 minutes. The reaction products were dissolved in 2 × sample buffer (Laemmli, 1970), sonicated, heated in a boiling water bath, separated by standard SDS-PAGE and autoradiographed to visualize template-driven translation products.
In situ hybridization to polytene chromosomes
Salivary glands were dissected from third instar larvae and fixed in 45% acetic acid for 5-10 minutes. They were squashed between a glass slide and a siliconized coverslip and warmed to 45°C for 3-5 minutes. The slides were then placed on solid CO2 for 15-30 minutes and the coverslip was flipped off. Slides containing squashes were immersed in ethanol/acetic acid (3:1, v/v) for 10 minutes, 100% ethanol for 10 minutes, air dried and stored with desiccant at room temperature. Before hybridization, the slides were washed in 2 × SSC for 30 minutes at 65°C and rinsed in 2 × SSC at room temperature. The chromosomes were acetylated in 500 ml of 0.1 M triethylamine, pH 8.0, plus 0.625 ml acetic anhydride for 10 minutes. They were washed in 2 × SSC, denatured in 0.07 M NaOH for 3 minutes, washed in 2 × SSC and dehydrated in 70% ethanol followed by 94% ethanol and finally air dried for 30-60 minutes.
Hybridization was carried out at 58°C, overnight, in a moist chamber with 20 μl of heat-denatured biotinylated DNA. Hybridization buffer consisted of 0.6 M NaCl, 1 × Denhardt’s solution, 3 mM MgCl2, 50 mM sodium phosphate, pH 7.2. Slides were washed in 2 × SSC at 51°C; 0.15 M NaCl, 10 mM NaPO4, pH 7.2 (PBS), at room temperature; PBS plus 0.1% Triton X-100; and PBS. Avidin-HRP was added at 1:250 dilution and incubated at room temperature for 2 hours. Slides were washed in PBS, PBS plus 0.1% Triton X-100, and PBS. Freshly made substrate was added (0.5 mg/ml DAB, 0.5 mM NiCl2, 0.01% H2O2 in PBS). After 30 minutes slides were rinsed with water and stained in Giemsa.
RESULTS
A 46 kDa cytoskeletal protein
A 46 kDa protein is a major component of the detergent-insoluble cytoskeleton of Drosophila Kc cells (Sherwood et al., 1989). It is immunologically related to vertebrate intermediate filament proteins (Falkner et al., 1981; Walter and Biessmann, 1984a,b; Sherwood et al., 1989). To characterize and classify this protein, we have generated two polyclonal antisera, NJ-2 and VB-2, against gel-purified 46 kDa cytoskeletal protein (Bossie et al., unpublished data). Both sera reacted strongly with the 46 kDa protein on immunoblots as shown in the left- and right-most panels of Fig. 1, and initial characterization of the reaction of NJ-2 with both whole cell and NCP preparations showed no significant cross-reactions (data not shown). However, later bleeds tested with more sensitive detection systems showed reaction of NJ-2 with a 75 kDa protein doublet in heavily loaded NCP preparations from Kc cells (left panel of Fig. 1, lane NCP). The VB-2 antiserum maintained its specificity for the 46 kDa protein throughout the immunization protocol.
The abundance, fractionation properties and molecular mass of the 75 kDa doublet are characteristic of the Drosophila nuclear lamin proteins, Dm1 and Dm2 (Lin and Fisher, 1990). Affinity-purified Dm1 and Dm2, loaded in lane D of the left panel, comigrate with the Kc protein doublet and react with NJ-2. The center panel of Fig. 1 shows that this Kc protein doublet in the NCP sample is recognized by anti-Dm1/2 antibody. This observation provides a second example of reaction of an antibody to the 46 kDa protein with an authentic member of the intermediate filament protein family.
The affinity of NJ-2 was sufficiently high to recognize nanogram quantities of the 46 kDa protein spotted onto nitrocellulose and this serum was chosen to screen a λgt11 cDNA expression library made from mRNA isolated from 0-20 hour embryos of Drosophila melanogaster (Nolan et al., 1986). Previous experiments had shown that the 46 kDa cytoskeletal protein is expressed at this time in development (Walter and Alberts, 1984). Ten λgt11 recombinants retained reproducible immunoreactivity with NJ-2 and contained inserts released with EcoRI. Screening these ten isolates with the second polyclonal antiserum, VB-2, revealed one weakly positive λgt11 recombinant (data not shown, summarized in Table 1). While consistent with the possibility that the insert in this phage coded for the 46 kDa protein, the weakness of the reaction contributed to the decision to further analyze all ten recombinants.
To characterize the recombinants and identify related isolates, the inserts were released with EcoRI, separated on agarose gels and blotted to nitrocellulose. Separate blots were hybridized with probes from individual inserts. Five different cDNA groups were distinguished by these cross-hybridization studies. Four of these, designated groups 4, 5, 6 and 10, hybridized with a Kc genomic DNA digest (summarized in Table 1). Two groups, 4 and 10, cross-hybridized with each other to some extent at high stringency. Fig. 2 demonstrates that prolonged exposures of Southern blots probed with either group detect the other, indicating that some sequence homology exists between these two groups.
Northern blot analyses of total Kc cell RNA showed that group 5 had no detectable transcript in these cells. Transcripts were detected for groups 4, 6 and 10 (Fig. 3). Comparison of the size of each insert with that of the mRNA detected indicated that 6 was an incomplete cDNA while some members of group 4 were close to full length. Group 10 was unique in that it identified two low abundance transcripts of ∼1.3 and 2.3 kb. The cDNAs of this group were 1.7-2.0 kb and probably represented incomplete copies of the 2.3 kb transcript.
Group 4 cDNAs code for Drosophila lamin Dm0
The size of the group 4 cDNAs (∼2.7-3.0 kb) and corresponding message (∼3.0 kb), and the hybridization of the 4 insert to a genomic EcoRI fragment of ∼10 kb (lower left panel of Fig. 2), are characteristic of the only previously identified Drosophila IF cDNA, the nuclear lamin known as Dm0 (Gruenbaum et al., 1988). The possibility that this group coded for Dm0 was further suggested by the reaction of NJ-2 with the Dm0-derived proteins shown on the immunoblot in Fig. 1. To determine if group 4 represented Dm0 cDNA, a Southern blot of an isolate from each cDNA group was probed with authentic Dm0 cDNA. Fig. 4 shows that group 4 hybridized with the Dm0 probe under high-stringency conditions. As expected, group 10 weakly cross-hybridized to the Dm0 probe as it did to the probe from group 4. Together, these data indicated that group 4 represents Dm0 cDNA.
Analysis of group 10: the novel IF
The inserts from groups 5, 6 and 10 were subcloned into the EcoRI site of the pGEM-4Z vector and are subsequently designated pG-“isolate number”. These vectors were transcribed and translated in vitro and the translation products were precipitated with the NJ-2 antiserum. Translation products from pG-6 and pG-10 immunoprecipitated, verifying that these cDNAs encoded immunoreactive polypeptides (data not shown, summarized in Table 1). Since the largest cDNA of group 10 generated a fusion protein in the library screen which was recognized by the second polyclonal antiserum, VB-2 (data not shown), and since all cDNAs of this group demonstrated low level hybridization to the authentic IF cDNA coding for Dm0, group 10 was further analyzed. The above-described properties of the various isolates from the screen are summarized in Table 1.
A comparison of the DNA and predicted amino acid sequences of the three group 10 cDNAs with the GenEMBL data base showed that this group encodes a protein homologous to known IF proteins at the level of both the nucleotide and predicted amino acid sequences. pG-10, the largest member in the group, contained the most 5′ sequence, but was not full length. Therefore, a 100 bp HindIII-EcoRI 5′ fragment of pG-10 was isolated, end labeled and used to rescreen the library (Fig. 5). The 2.3 kb clone, pG-IF, was isolated and complete coding sequence was obtained (Figs 5 and 6). pG-IF has 122 nucleotides of untranslated 5′ sequence, a coding region of 1863 nucleotides and 335 nucleotides of 3′ noncoding sequence. The starting ATG (residues 123-125) sequence context (AAAAAUG) is consistent with that described for Drosophila translation start sites (C/AAAC/AAUG) (Cavener, 1987). This cDNA sequence predicts a protein of 621 amino acids with a molecular mass of 69,855 daltons and a neutral isoelectric point of 6.83. Restriction map polymorphism among the original 3 isolates revealed that two alleles of pG-IF may have been isolated. Sequencing of pG-9 confirmed that this isolate had 15 base changes (underlined in Fig. 6) from the pG-IF sequence. Only two of these resulted in an amino acid change, namely serine-165 to alanine and isoleucine-527 to valine. In vitro transcription and translation of pG-IF yielded a protein whose SDS-PAGE-determined molecular mass is ∼70 kDa, in agreement with the predicted molecular mass (Fig. 7).
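The figures quoted for pG-IF can be checked against one another with simple arithmetic. The sketch below uses only numbers stated in the text; the regular expression is one possible encoding of the Cavener (1987) consensus (C/A)AA(C/A)AUG, and all names are illustrative assumptions.

```python
import re

# Segment lengths in nucleotides, as stated in the text.
UTR5_NT, CODING_NT, UTR3_NT = 122, 1863, 335

total_nt = UTR5_NT + CODING_NT + UTR3_NT        # full cDNA length
codons = CODING_NT // 3                         # whole codons in the coding region
start_codon = (UTR5_NT + 1, UTR5_NT + 3)        # 1-based positions of the ATG
mean_residue_mass = 69855 / 621                 # daltons per residue

# One way to write the consensus (C/A)AA(C/A)AUG as a pattern.
START_CONTEXT = re.compile(r"[CA]AA[CA]AUG")


def matches_start_context(context):
    """True if a 7-nucleotide RNA context fits the consensus pattern above."""
    return START_CONTEXT.fullmatch(context) is not None


if __name__ == "__main__":
    print("cDNA length (nt):", total_nt)
    print("codons:", codons, "remainder:", CODING_NT % 3)
    print("start codon positions:", start_codon)
    print("mean residue mass (Da):", round(mean_residue_mass, 1))
    print("AAAAAUG fits consensus:", matches_start_context("AAAAAUG"))
```

The three segments sum to 2320 nucleotides, in line with the 2.3 kb clone size, and the ATG position follows directly from the 122 nucleotide leader. The 1863 nucleotides correspond to exactly 621 codons, which suggests that the quoted coding length excludes the stop codon; this is an inference from the arithmetic and is not stated in the text.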
Analysis of the predicted amino acid sequence revealed a hydrophilic protein with a high predicted alpha helix content by the Kyte-Doolittle, Chou-Fasman and Garnier-Osguthorpe-Robson secondary structure predictions (data not shown). The pG-IF nucleotide sequence is ∼60% identical to the Drosophila lamin cDNA, Dm0. The predicted amino acid sequence is ∼50% identical to Dm0 and 30-40% identical to vertebrate lamins with slightly lower identities with vertebrate cytoplasmic IF proteins. However, regions of great homology exist in the beginning of coil 1A and the end of coil 2, typical of all IF proteins (reviewed by Steinert and Roop, 1988). Fig. 8 summarizes a comparison of pG-IF’s predicted amino acid sequence with the Drosophila lamin, Dm0, and human lamins B and C. Features characteristic of the C-terminal tail of lamin sequences include a putative nuclear localization signal (NLS) and an isoprenylation site (CaaX motif) typical of lamins A and B but not C (Vorburger et al., 1989; Holtz et al., 1989). A CaaX motif (at the very C terminus) is present in the Drosophila Dm0 lamin sequence (Gruenbaum et al., 1988) but not in pG-IF. There is a putative NLS, KRRRTV, in pG-IF at residues 453-458 (marked in line 7, Fig. 8). This is similar to the position of the putative NLS present in Dm0 (KRKRAV at residues 445-450) and other lamins. In addition, nuclear lamins contain cdc2 kinase phosphorylation sites in the head and tail domains (Peter et al., 1990). One of these is a highly conserved SPTR consensus sequence in the N-terminal head of all lamin sequences characterized to date, and it is present in pG-IF at residues 37-40 (marked in line 1, Fig. 8).
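The three sequence features discussed above (a C-terminal CaaX box, an NLS such as KRRRTV and the SPTR cdc2 kinase consensus) can each be located with a simple pattern search. The sketch below is illustrative only: the two test sequences are invented placeholders and are not the pG-IF or Dm0 sequences, and the CaaX pattern assumes that the "a" positions are limited to A, V, L or I, which is stricter than many real CaaX boxes.

```python
import re

# Simplified CaaX box: Cys, two aliphatic residues (assumed A/V/L/I), any residue,
# anchored at the very C terminus.
CAAX = re.compile(r"C[AVLI]{2}[A-Z]$")


def has_caax(protein):
    """True if the sequence ends in a CaaX box under the simplified definition."""
    return CAAX.search(protein) is not None


def find_motif(protein, motif):
    """Return 1-based inclusive (start, end) positions of every exact match."""
    pattern = re.compile(f"(?={re.escape(motif)})")  # lookahead finds overlaps too
    return [(m.start() + 1, m.start() + len(motif)) for m in pattern.finditer(protein)]


# Invented toy sequences (placeholders, NOT real lamin sequences).
TOY_B_TYPE = "MSSK" + "SPTR" + "G" * 10 + "KRKRAV" + "G" * 5 + "CAIM"
TOY_C_TYPE = "MSSK" + "SPTR" + "G" * 10 + "KRRRTV" + "G" * 5 + "SGSE"

if __name__ == "__main__":
    for name, seq in (("toy B-type", TOY_B_TYPE), ("toy C-type", TOY_C_TYPE)):
        print(name, "CaaX:", has_caax(seq),
              "SPTR:", find_motif(seq, "SPTR"),
              "KRRRTV:", find_motif(seq, "KRRRTV"))
```

Applied to a real predicted sequence, `find_motif` would be expected to report the residue ranges given in the text (for example 453-458 for KRRRTV in pG-IF), but that has not been run here because the sequence itself is not reproduced in this section.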
Southern blot analysis of restriction digests of Kc genomic DNA with various enzymes probed with a fragment from the pG-10 coding sequence was carried out (data not shown). For each of the six enzymes tested, there was one more band of hybridization than the number predicted by the cDNA map. This suggests either one or a few closely placed copies of the gene, in agreement with the single band detected by in situ hybridization to salivary gland polytene chromosomes shown in Fig. 9. The in situ hybridization localized the gene homologous to pG-IF to the right arm of chromosome 2 at or near 51A.
DISCUSSION
Drosophila intermediate filaments
Gruenbaum et al. (1988) determined the sequence for Dm0, which codes for the only IF protein identified previously in Drosophila and the only invertebrate nuclear lamin protein identified to date. Smith et al. (1987) have shown that the lamin protein, Dm0, is synthesized, proteolytically cleaved and differentially phosphorylated giving rise to three distinguishable developmentally regulated forms known as Dm0, Dm1 and Dm2. It was proposed that Dm0 is more like the B type mammalian lamins (Gruenbaum et al., 1988) and before the isolation of pG-IF, no evidence existed for the A/C lamin subtype in Drosophila. Based on these observations, the invertebrate lamin system was thought to differ from the vertebrate systems where multiple lamin proteins exist (reviewed by Krohne and Benavente, 1986). Even less is known about cytoplasmic IFs in Drosophila, which have only been identified from immunological and biochemical data (Falkner et al., 1981; Walter and Biessmann, 1984a,b; Bossie et al., unpublished data).
The pG-IF-encoded protein
Our cDNA library screen was initiated with an interest in a 46 kDa putative cytoplasmic IF. The cDNA characterized here, pG-IF, clearly encodes a larger protein of ∼70 kDa as determined by in vitro translation analysis of the isolated cDNA (Fig. 7) and by the molecular mass predicted from the nucleotide sequence (Fig. 6). However, pG-IF does encode an IF protein in Drosophila. Steinert and Roop (1988) have suggested certain features to identify IFs. These include an α-helical rod domain with conserved linker positions and variable head and tail domains. The rod domains of all IFs consist of heptad repeats of amino acids (a,b,c,d,e,f,g) where positions a and d are nonpolar residues favoring a coiled-coil secondary structure. Eighty-six per cent of the residues in pG-IF in positions a and d are nonpolar or uncharged residues (marked in Fig. 6 by •). The amino acid composition of these positions agrees with that expected according to Lupas et al. (1991), who summarized the frequency of each residue in each position of the heptad in IF protein sequences in the database. Finally, although there is not significant sequence homology among the IF protein classes, there are highly conserved regions in the beginning of coil 1A and the end of coil 2. The pG-IF/lamin comparison in Fig. 8 depicts these regions of high sequence identity among pG-IF and class V IF proteins.
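The heptad criterion described above can be made concrete with a small calculation: assign each residue of a rod segment a heptad position (a to g) and count how many of the a and d positions carry a nonpolar or uncharged residue. The following is a minimal sketch under stated assumptions. It treats D, E, K and R as the charged residues (histidine is counted as uncharged here), it applies one fixed register to the whole segment, and the example sequences are invented. Real lamin rods contain linkers and phase shifts, so each uninterrupted segment would need its own register.

```python
HEPTAD = "abcdefg"
CHARGED = set("DEKR")  # assumption: His treated as uncharged


def ad_uncharged_fraction(segment, first_position="a"):
    """Fraction of heptad a/d positions occupied by non-charged residues.

    `segment` is a stretch of rod sequence with no phase shift;
    `first_position` is the heptad letter of its first residue.
    """
    offset = HEPTAD.index(first_position)
    core = [res for i, res in enumerate(segment)
            if HEPTAD[(i + offset) % 7] in "ad"]
    if not core:
        raise ValueError("segment contains no a or d positions")
    return sum(res not in CHARGED for res in core) / len(core)


if __name__ == "__main__":
    # Invented two-heptad examples (not taken from pG-IF).
    print(ad_uncharged_fraction("LEELKKKLEELKKK"))  # all four a/d positions are Leu
    print(ad_uncharged_fraction("KEELKKKLEELKKK"))  # first a position is Lys
```

A value of 0.86 from such a calculation over the pG-IF rod would correspond to the eighty-six per cent reported in the text; whether the original analysis used exactly this residue classification is not stated.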
Sequence comparisons suggest that the pG-IF cDNA encodes a lamin subtype. Lamins have a longer rod (coil) domain than other IF classes due to the insertion of six heptads (or 42 amino acids) in coil 1B (Steinert and Roop, 1988). pG-IF encodes this longer coil 1B and shares the same coil 2 phase shifts in the heptad repeats as Dm0, which are similar to those described for the human lamins (depicted as a break in the pattern of the a and d repeats in Fig. 6; Gruenbaum et al., 1988). Other lamin features described by McKeon et al. (1986) include a strongly basic N-terminal domain. The predicted 43 amino acid N-terminal head of pG-IF has a pI of 12.9. Lamins also contain a large C-terminal tail with clustered charged groups in the region following coil 2 and again in the extreme C terminus of the C lamin subtype. The sequence in Fig. 6 also shows that there is a high percentage of serine/threonine residues in the C-terminal tail. In addition, pG-IF contains a putative NLS (KRRRTV at residues 453-458) containing an atypical threonine at position 5 (Kalderon et al., 1984b). A threonine in the analogous position (amino acid 131) in the SV40 large T antigen NLS results in a mixed nuclear and cytoplasmic localization of the protein (Kalderon et al., 1984a). The two basic amino acids (RR) present at residues 439-440 may enhance nuclear protein localization (Dingwall and Laskey, 1991). Since many factors, trans and cis, appear to be involved in nuclear localization of proteins (reviewed by Silver, 1991), it is not possible to determine protein localization definitively based on the presence or content of this motif. The pG-IF sequence does, however, predict a protein with a number of features strongly suggesting lamin identity. It does not encode the carboxy-terminal CaaX motif of lamins A and B and its predicted neutral isoelectric point (pI 6.83) is characteristic of lamins A/C. Therefore, it probably represents a lamin C subtype in Drosophila.
It is known from DNA and protein sequence data that invertebrate cytoplasmic IFs contain sequence principles defining class V IFs and are most homologous to the lamins (Weber et al., 1988, 1989; Dodemont et al., 1990; Szaro et al., 1991). This probably explains the results of the library screen where an antibody to a Drosophila putative cytoplasmic IF protein resulted in the isolation of two class V IF sequences (pG-IF and Dm0). Both probably represent nuclear lamins, since the pG-IF-encoded protein has even greater sequence homology to the lamins than invertebrate cytoplasmic IFs. In addition, pG-IF encodes the amino-terminal cdc2 kinase consensus, SPTR, present in all characterized lamins (Peter et al., 1990) but absent in the known invertebrate cytoplasmic IF sequences (also class V IFs) (Weber et al., 1988, 1989; Dodemont et al., 1990; Szaro et al., 1991), further supporting the conclusion that pG-IF encodes a lamin. The cDNA for the Drosophila B-type lamin, Dm0, was shown to hybridize to two alternately spliced messages of 2.8 and 3.0 kb. They differ only in noncoding sequence and encode identical proteins. The expression of the two mRNAs is developmentally regulated (Gruenbaum et al., 1988). pG-IF also hybridizes to two messages in Kc cells. However, the two pG-IF messages must encode somewhat different proteins. The larger message probably directs the synthesis of the 70 kDa lamin C-like protein. The smaller ∼1.3 kb message must encode a smaller protein, and may arise by alternative splicing of the pG-IF gene. It is also possible that there are two genes, each giving rise to a different mRNA. However, the in situ hybridization to Drosophila polytene chromosomes with pG-IF shows only a single band of hybridization (Fig. 9). Therefore, if there is more than one copy of the pG-IF gene, the copies must be closely placed on the chromosome. The possibility of more than one structural gene does not contradict the genomic Southern blot analysis discussed in Results. The extra band of hybridization for each enzyme could be due to: (1) a single copy of the pG-IF gene with a site for each enzyme in an intron, resulting in the extra genomic band of hybridization; or (2) multiple copies of the pG-IF gene closely placed on the chromosome.
The 70 kDa lamin C-like protein predicted by pG-IF has not yet been identified by our immunoblot analyses of Kc cell proteins. This may be due to low expression of the protein, as suggested by low abundance of the mRNA in Northern blot analysis, or to low affinity of the NJ-2 antibody for the natural cellular (as opposed to the fusion) protein. It is also possible that its migration in SDS-PAGE precisely coincides with one of the Dm0-derived proteins. The frequency of isolation of the cDNAs from the embryo-derived λgt11 expression library utilized here suggests that the protein is expressed in embryos. Studies to identify the predicted 70 kDa protein in cells from the intact organism are underway.
The 46 kDa cytoskeletal protein
Clearly the 46 kDa cytoplasmic protein and the protein products of the pG-IF gene and the Dm0 gene share structure recognized by the polyclonal antiserum, NJ-2. However, the previous characterization of the 46 kDa protein and the fact that two class V intermediate filament cDNAs were isolated repeatedly in this screen do not clarify the relationship between the 46 kDa cytoplasmic protein and the nuclear lamins in Drosophila. Several possibilities exist. One is that the 46 kDa protein is derived from post-translational processing of the pG-IF protein. A post-translational mechanism for generation of the 46 kDa protein from a lamin is suggested by reports characterizing a 46 kDa rat liver protein which has amino acid sequence identical to the first 376 amino acids of lamins A and C, and which functions as an NTPase involved in nuclear mRNA transport (Clawson et al., 1988, 1990). However, cyanogen bromide cleavage patterns, partial proteolysis experiments and preliminary microsequencing analysis of the 46 kDa protein compared with the predicted structure of the 70 kDa C-type lamin and Dm0 cast doubt on this possibility (data not shown). A second possibility is that the pG-IF 1.3 kb message is derived from alternate splicing of the pG-IF gene and codes for the 46 kDa protein. However, P-20, a cell line which overexpresses the 46 kDa protein (Sherwood et al., 1989), has little to none of this message, as detected by Northern blot analysis (data not shown). A third possibility is that the 46 kDa protein is encoded by the gene giving rise to group 6 from the initial library screen (Table 1), or by an uncharacterized gene. Experiments are underway to explore these possibilities further.
This research constitutes the beginning of an investigation to identify IF proteins in Drosophila, for which powerful genetic approaches are available to provide insight into the cellular role of these proteins.
ACKNOWLEDGMENTS
We thank T.-S. Hsieh of Duke University for the kind gift of the λgt11 expression library, and Nico Stuurman and Paul Fisher of SUNY-Stony Brook for valuable discussion and for generously providing the anti-Dm0 antibody, purified Dm1 plus Dm2 protein and the Dm0 cDNA. We are also grateful to Daniel Bopp and Paul Schedl at Princeton University for advice on polytene chromosome mapping.