
Nucleotide Sequence of the Gene and Features of the Major Outer Membrane Protein of a Virulent Rickettsia prowazekii Strain

V. V. Emelyanov1* and N. G. Demyanova2

1Gamaleya Institute of Epidemiology and Microbiology, Russian Academy of Medical Sciences, ul. Gamalei 18, Moscow, 123098 Russia; fax: (95) 193-6183; E-mail: sileks@glasnet.ru

2State Research Institute of Genetics and Selection of Industrial Microorganisms, 1-yi Dorozhnyi Proezd 1, Moscow, 113545 Russia

* To whom correspondence should be addressed.

Received September 22, 1998; Revision received January 10, 1999
We have determined the nucleotide sequence of the gene for a major outer membrane protein (MOMP) of apparent molecular weight 29.5 kD of the virulent Breinl strain of Rickettsia prowazekii. The gene contains an open reading frame (ORF) that encodes a 282-amino-acid polypeptide with a calculated molecular mass of 31549 daltons. A signal-like peptide sequence is found at the deduced N terminus. A heterologous 29.5-kD antigen expressed in Escherichia coli was shown to be secreted into the periplasm. A database search for similar protein sequences revealed considerable homology of the polypeptide with the E. coli peptidyl-prolyl cis/trans isomerase and related proteins of the parvulin family. The genes for MOMP of the virulent Breinl and EVir strains and the vaccine Madrid E strain were amplified using specific primers and cloned into expression vector pQE-30. We found that the polypeptides encoded by the recombinant DNAs do not differ in SDS-PAGE mobility, while the native MOMP of the Breinl strain is known to be different from the corresponding proteins of the Madrid E and EVir strains. Furthermore, no differences within the ORF for the 29.5-kD proteins of the three strains were found by restriction endonuclease analysis of polymerase chain reaction (PCR) products. A possible role of parvulin-like protein (Plp) in the virulence of epidemic typhus agent and the nature of interstrain differences are discussed. Near the plp gene on the opposite strand, an origin of the gene that codes for the SecA subunit of a preprotein translocase was found.
KEY WORDS: Rickettsia prowazekii, PPIase, virulence, secA


Abbreviations: LPS) lipopolysaccharide; Cbf) cell binding factor; IPTG) isopropyl beta-D-thiogalactopyranoside; (M)OMP) (major) outer membrane protein; ORF) open reading frame; PCR) polymerase chain reaction; Plp) parvulin-like protein; plp) Plp gene; PPIase) peptidyl-prolyl cis/trans isomerase; RFLP) restriction fragment length polymorphism.


Rickettsia prowazekii, the etiological agent of epidemic typhus, is an obligate intracellular gram-negative bacterium. Surface bacterial proteins are known to play an important role in rickettsial parasitism [1-3]. Several envelope proteins of R. prowazekii have been described [1, 4-6]; among them, major polypeptides with apparent molecular weights of 134, 31, and 29.5 kD were assumed to be directly involved in interactions with eukaryotic cells [1, 3, 7]. These proteins were shown by SDS-PAGE following [3H]galactose labeling in vivo to be associated with lipopolysaccharides (LPS) [5]. The 134-, 31-, and 29.5-kD outer membrane proteins (OMPs) are heat-modifiable, i.e., their electrophoretic mobility depends on treatment temperature before SDS-PAGE [1]. Dasch and coworkers showed that a 134-kD soluble surface protein antigen (SPA) is a protective antigen [1], and it is reported to be the main target of the immune response in infected humans and animals [3]. A gene encoding SPA was cloned and sequenced [8]. Some indirect data concerning its role in rickettsia--host cell interactions have been obtained [3]. The 31- and 29.5-kD polypeptides are integral OMPs [4, 6]. The 31-kD protein appears to be a cleavable C-terminal moiety of a high-molecular-weight precursor of SPA, perhaps connecting the mature protein to the outer membrane [2, 9]. The structure and function of the 29.5-kD protein were until now unknown. It was earlier shown using SDS-PAGE that the only apparent difference in polypeptide patterns between Breinl and other pathogenic strains and the avirulent Madrid E strain is a higher electrophoretic mobility of the 29.5-kD protein from the latter, irrespective of pre-treatment temperature [1, 7]. Moreover, this difference was ascribed to a surface-exposed domain [3]. The 29.5-kD MOMP was also shown to be the most accessible to extrinsic radioiodination of intact cells [5, 10]. These observations suggest that this antigen may be an important virulence factor.
The previously demonstrated selective solubilization of the integral MOMP of R. prowazekii with the nonionic detergent octyl glucoside made it possible to explore its physical and chemical properties and functions [4]. However, because rickettsiae are difficult to cultivate and isolate in large amounts, molecular cloning appears to be the best approach.

Three R. prowazekii envelope protein genes were cloned earlier using the lambda gt11 expression vector system. Several clones expressing the 29.5-kD protein were selected with monospecific antibody but not with typhus patient serum [11]. Here we describe the complete nucleotide sequence of the gene and the DNA-inferred amino acid sequence as well as homologies and some features of the 29.5-kD MOMP (designated Plp, parvulin PPIase-like protein). Database search for protein similarities also shows that the SecA translocase gene of R. prowazekii is located in the vicinity of the gene encoding Plp.


MATERIALS AND METHODS

Bacterial strains, vectors, and media. E. coli host strains RY1089 and RY1090 and bacteriophage vector lambda gt11 used in this work were described previously [12]. lambda gt11 recombinant phages were grown and purified as described in a previous work [11]. E. coli JM109 [13] was used as the bacterial host for recombinant plasmids. pBlueScript II K/S+ (Stratagene, USA) was used as a vector for recloning and subcloning. Plasmid-containing cells were grown at 37°C in LB broth or agar supplemented with ampicillin (100 µg/ml).

Passage history of the R. prowazekii strains Breinl, Madrid E, and EVir, cultivation in chicken embryos, and purification of rickettsial cells were described previously [14].

Reagents. Lysozyme, sucrose, EDTA, most reagents for SDS-PAGE, and Tween-20 were obtained from Serva (Germany). Tris, acrylamide, agarose, and SDS were from Sigma (USA); salts were from Merck (Germany). Phage T4 DNA ligase, EcoRI, ClaI, and some other restriction endonucleases were purchased from Pharmacia (Sweden), and HindIII, EcoRV, PstI, HincII, SspI, and MboII were from Fermentas (Lithuania).

DNA manipulations. Plasmid DNA isolation by alkaline lysis, phage and chromosomal DNA extractions, preparation of competent cells, and transformation by plasmid DNA were carried out according to Silhavy et al. [15]. Plasmids from colorless bacterial colonies grown on LB agar containing isopropyl beta-D-thiogalactopyranoside (IPTG) and 5-bromo-4-chloro-3-indolyl-beta-D-galactoside were screened by size as described by Maniatis et al. [16]. Restriction endonuclease mapping and PAGE and agarose-gel electrophoresis were carried out using published techniques [16]. To prepare a DNA probe for Southern hybridization, the 1644-base pair (bp) NotI insert of recombinant plasmid pVE3 was isolated by electrotransfer onto a NA45 DEAE-membrane as described earlier [11]. The DNA fragment was labeled with digoxigenin-11-dUTP as recommended by the supplier (Boehringer-Mannheim, Germany) using a non-radioactive DNA labeling and detection kit. Genomic DNAs were treated with HindIII and electrophoresed in 0.8% agarose mini-gels at 80 V for 2 h. Treatment of the gels and alkali transfer of DNA fragments onto Hybond N nylon membranes were done according to a standard protocol (Amersham, England). Hybridization under stringent conditions and immunological detection were performed according to the instructions of Boehringer-Mannheim using the same kit. HindIII fragments of lambda DNA served as DNA size markers.

A 1240-bp fragment was amplified by PCR from pVE3 and genomic DNAs of the Breinl, Madrid E, and EVir strains using 20-nucleotide upstream 5´-TACAGTACGATCATTAGGAG-3´ and downstream 5´-TACAAAGGCAAGAATCTATA-3´ primers of the MOMP gene. The PCR products were digested consecutively with MboII and SspI and subjected to PAGE on 6% gels (PCR-RFLP). The gels were stained with silver nitrate according to the Silver sequence manual (Promega, USA) or with ethidium bromide.
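The fragment sizes expected from such a digest can be predicted from the sequence. The following is a minimal sketch of that idea, not the procedure used in this work: it cuts a linear sequence at every occurrence of a recognition site and returns the fragment lengths. The function name `digest` and the sequence `toy` are hypothetical (the string is made up and is not the plp amplicon). Only SspI (AAT^ATT, blunt ends) is modeled; MboII is a type IIS enzyme that cuts outside its recognition site and would need strand-aware handling that is omitted here.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the number of bases of the recognition site that remain
    on the left-hand fragment (3 for SspI, AAT^ATT). The site is assumed to
    be palindromic, so scanning one strand is sufficient.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical toy sequence with two SspI sites (not real plp sequence).
toy = "GGCC" + "AATATT" + "CCGGTTCC" + "AATATT" + "GG"
ssp1_fragments = digest(toy, "AATATT", 3)
print(ssp1_fragments)
```

Comparing such predicted lengths with the bands observed on the gel is what allows a single-nucleotide length difference between strains to be assigned to a particular fragment.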

The majority of the overlapping fragments of the initial 1644-bp NotI insert were subcloned from pVE3 into pBluescript II K/S+ vector. The nucleotide sequence was determined by the dideoxynucleotide chain termination method [17] using Taq DNA polymerase (Biomaster, Moscow) and plasmid DNA as templates. All the overlapping regions were sequenced on both strands using universal and reverse M13 primers, Bluescript KS and SK primers, and specific primers. The nucleotide sequence reported in this paper was deposited in the EMBL/GenBank/DDBJ database (accession number X89470).

Fractionation of E. coli cells. TES/lysozyme treatment methodology was applied to localize heterologous MOMP in the E. coli cell. Most experiments were performed with the use of a controlled osmotic shock procedure essentially as described by Randall and Hardy [18]. E. coli JM109 cells containing recombinant pVE3 plasmid were grown to A550 = 0.6 and resuspended in 1/20 volume of hyperosmotic 2× TES buffer (40 mM Tris-HCl, pH 7.8, 0.5 M sucrose, 4 mM EDTA) followed by addition of 0.15 mg/ml lysozyme and an equal volume of cold water (controlled osmotic shock). Formation of spheroplasts (5 to 10 min) was monitored by phase contrast microscopy. About 80% of the cells were converted to spheroplasts within 5 min, then MgSO4 was added to 20 mM to stabilize the latter. After saving the periplasmic fraction (the shock fluid clarified by ultracentrifugation after pelleting of the spheroplasts by centrifugation), the spheroplasts were thoroughly washed with TMS (20 mM Tris-HCl, pH 7.8, 10 mM MgSO4, 0.25 M sucrose), resuspended in the minimal volume, and broken by addition of 20 volumes of ice-cold TE (cold osmotic shock) followed by isolation of the cytoplasmic and membrane fractions [4, 18]. Enzymatic activities of periplasmic alkaline and acid phosphatase and of cytoplasmic malate dehydrogenase were assayed in the soluble fractions exactly as described [19].

Overexpression of the R. prowazekii plp gene. Genomic DNAs of the Breinl, Madrid E, and EVir strains were amplified using the 27-nucleotide primer 5´-GCGGATCCGACGAAGATAAAGTAGTAG-3´, which contains a BamHI site and encodes the N terminus of the putative mature part of Plp, and the previously mentioned 20-mer downstream of plp. PCR products were treated with BamHI and cloned into the in-frame expression vector pQE-30 (QIAGEN, USA), which encodes an N-terminal His-tag, cleaved by BamHI and Ecl136II, an isoschizomer of SstI generating blunt ends (Fermentas). This resulted in plasmids pAEB, pAEE, and pAEV (corresponding to strains Breinl, Madrid E, and EVir, respectively). E. coli K-12 M15 [pREP4] cells transformed with these plasmids were grown at 37°C overnight in LB broth containing appropriate antibiotics, diluted 1:30, grown for 2 h, and induced with 3 mM IPTG for 4 h. Polypeptides of whole cells and soluble cellular proteins were analyzed by SDS-PAGE and immunoblotting.
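The design of the upstream primer can be checked with a few lines of code. The sketch below is illustrative only: it confirms the primer length, locates the BamHI site (GGATCC), and translates the primer starting at the first base of that site. Reading the frame from the first G of GGATCC is an assumption made here for illustration; the actual frame is set by the vector. The names `translate`, `PRIMER`, and `in_frame` are hypothetical; the codon table is the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of dna; trailing 1-2 bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


PRIMER = "GCGGATCCGACGAAGATAAAGTAGTAG"  # upstream primer from the text
BAMHI = "GGATCC"

site_start = PRIMER.find(BAMHI)          # 0-based position of the BamHI site
in_frame = translate(PRIMER[site_start:])  # assumed frame: GGA TCC ...
print(len(PRIMER), site_start, in_frame)
```

Under this assumed frame, the residues that follow the Gly-Ser encoded by the BamHI site are Asp-Glu-Asp-Lys-Val-Val, i.e., three acidic residues within the first three positions, as expected for the N terminus of the putative mature protein.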

Protein electrophoresis and Western blot analysis. Polypeptides of whole cells and cell subfractions were separated by SDS-PAGE according to Laemmli as described earlier [4]. The polypeptides were transferred onto nitrocellulose membranes in 48 mM Tris, 39 mM glycine, 0.037% SDS, and 20% (v/v) methanol at 60 mA for 1.5 h as recommended by the supplier using a semi-dry transfer apparatus (BioControl, Moscow). Anti-Plp antibody was diluted 1:1000. Nitrocellulose sheets were processed exactly as described previously [11] using anti-rabbit IgG--alkaline phosphatase conjugates and chromogenic substrates from Bio-Rad (USA).

Computer analysis. DNA sequence and deduced protein sequences were thoroughly analyzed with the PC/Gene software package (IntelliGenetics Inc., Switzerland). A database search for similar protein sequences based on an ungapped alignment algorithm was carried out using the BLASTX E-mail server [20] at the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health (Bethesda, MD, USA). All the six potential translation frames were used as query sequences to screen non-redundant PDB, SWISS-PROT, SPupdate, PIR, GenPept, and GPupdate databases. To spread alignment onto complete polypeptide sequences which appeared the most homologous, the PALIGN program was applied (PC/Gene) that allows the gaps to be introduced. Groups of similar amino acid residues were considered to be the following: Ile, Leu, Val, Met, Ala, Pro; Gln, Asn; Glu, Asp; Lys, Arg, His; Phe, Tyr, Trp; Cys, Ser, Thr, Gly.
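The way identity and similarity percentages follow from these residue groups can be sketched as below. This is a minimal illustration, not the PALIGN implementation: the function name, the toy aligned sequences, and the choice of denominator (columns without a gap) are assumptions, since the text does not state how gapped positions were counted.

```python
# Groups of similar residues as listed in the text.
SIMILARITY_GROUPS = ["ILVMAP", "QN", "ED", "KRH", "FYW", "CSTG"]
GROUP_OF = {aa: idx for idx, group in enumerate(SIMILARITY_GROUPS) for aa in group}


def identity_similarity(aligned_a, aligned_b, gap="-"):
    """Return (% identical, % similar) over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are excluded
        compared += 1
        if x == y:
            identical += 1
        elif x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y):
            similar += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical toy alignment (not taken from the paper).
ident, simil = identity_similarity("MKKL-DE", "MRKIADQ")
print(round(ident, 1), round(simil, 1), round(ident + simil, 1))
```

The sum of the two values corresponds to what the text calls overall homology including conservative substitutions.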


RESULTS

Nucleotide sequence of the R. prowazekii MOMP gene. Genomic DNA of the virulent Breinl strain of R. prowazekii has been previously cloned by Emelyanov in the bacteriophage lambda gt11 expression vector using EcoRI/NotI adaptor [11]. DNA inserts of three recombinant clones expressing 29.5-kD antigen as a full-length polypeptide in the absence (lambda gt11B3) and presence (lambda gt11B5) of IPTG and as a beta-galactosidase fusion polypeptide (lambda gt11A2-4) were oriented relative to lacZ (restriction analysis of EcoRI and HindIII digests and double digests) followed by recloning into the NotI site of pBluescript and detailed physical mapping (Fig. 1). Because the insert of the first clone was oriented oppositely to the inserts of the two other clones, this allowed us to predict the direction of transcription of the cloned gene and the approximate localization of the promoter operating in E. coli. In the present study, a 1644-bp DNA insert of recombinant plasmid pVE3 presumably containing the complete MOMP gene (initial clone lambda gt11B3) was sequenced. The 71.8% A + T content of this cloned DNA fragment is in good agreement with the R. prowazekii genomic A + T content of 72% [21]. Within the sequenced region, an ORF of 846 bp that begins with an ATG triplet (nucleotides 462 to 464) and terminates with a TAA stop codon (nucleotides 1308 to 1310) was identified. Relatedness of this translation frame to MOMP was confirmed by partial sequencing of a DNA insert from a clone (lambda gt11A2-4) expressing beta-galactosidase fusion protein. At least two structures resembling the E. coli sigma70 promoter consensus sequence [22] and an inverted repeat potentially forming a hairpin structure with free energy of -48.6 kJ/mole were found upstream and downstream (nucleotides 1386 to 1415), respectively, of the ORF. Codon usage analysis revealed a preference for A and T in the third base position (84.3%), this being typical of A + T rich bacteria including rickettsiae [21]. 
Codon frequencies in the R. prowazekii genes encoding two MOMPs (the 29.5-kD protein and SPA) and the inner membrane protein ATP/ADP translocase were shown to be very similar but substantially different from those of the highly expressed E. coli genes for porins [23, 24]. The deduced protein sequence begins with two Lys residues followed by a long stretch of mainly apolar residues, i.e., a structure similar to a prepeptide with a predicted proteolytic processing site between Gly-20 and Asp-21 [25, 26]. The putative MOMP precursor has a calculated molecular weight of 31549 daltons. Arg and Cys residues are absent from the amino acid sequence. The molecular weight of 29408 daltons for the mature MOMP form coincides with that determined by SDS-PAGE [11].
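The coordinates and composition figures quoted above can be cross-checked with simple arithmetic. The sketch below first confirms that an ATG at nucleotides 462 to 464 and a TAA ending at nucleotide 1310 give 846 coding nucleotides and 282 residues, and then shows how A + T content and third-position A/T preference would be computed. The helper names and the toy ORF are hypothetical; whether the stop codon was included in the reported 84.3% is not stated, so the helper simply counts every codon it is given.

```python
ORF_START = 462   # first base of the ATG (from the text)
STOP_END = 1310   # last base of the TAA stop codon (from the text)

span_with_stop = STOP_END - ORF_START + 1  # nucleotides including the stop codon
coding_nt = span_with_stop - 3             # coding nucleotides without the stop
residues = coding_nt // 3                  # encoded amino acid residues


def at_fraction(seq):
    """Fraction of A and T bases in a sequence."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def third_position_at_fraction(orf):
    """Fraction of complete codons whose third base is A or T."""
    orf = orf.upper()
    thirds = [orf[i + 2] for i in range(0, len(orf) - len(orf) % 3, 3)]
    return sum(base in "AT" for base in thirds) / len(thirds)


# Hypothetical toy ORF (not the plp sequence): ATG AAA TTA GGC GAT TAA
toy_orf = "ATGAAATTAGGCGATTAA"
print(coding_nt, residues)
print(round(at_fraction(toy_orf), 3), round(third_position_at_fraction(toy_orf), 3))
```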

Figure 1

Fig. 1. Physical map of the fragments of recombinant plasmids bearing the complete gene for 29.5-kD MOMP (pVE3), the gene with a presumably shortened promoter region (pVE5), and an incomplete ORF of the gene (pVE4). The ORF is marked by the long horizontal arrow. Short arrows indicate the direction of lacZ transcription on corresponding lambda gt11-recombinants. Restriction endonucleases that cleave cloned inserts are EcoRI (RI), HindIII (III), EcoRV (RV), PstI (P), ClaI (C), and HincII (HII). Numbers at the insert right termini show right coordinates according to insert sizes. The scale is in kilobase pairs.

Homologies and comparison of the R. prowazekii MOMP with other bacterial envelope proteins. Comparison of the complete sequence of the 1644-bp R. prowazekii DNA fragment against 147297 sequences of GenBank databases (see "Materials and Methods") revealed substantial homology of the 29.5-kD polypeptide to an E. coli PPIase (EC 5.2.1.8) of the parvulin family [27], Cbf2 of Campylobacter jejuni [28], PrsA of Bacillus subtilis involved in a late stage of protein export [29], and PrtM of Lactococcus lactis required for maturation of an extracellular serine proteinase [30]. PPIase, PrsA, and PrtM were known to be homologous [27]. The statistical significance of the homology between R. prowazekii Plp and C. jejuni Cbf2 determined by the Monte-Carlo test using the PCOMPARE program (PC/Gene) was 8.9 SD, whereas any value above 3.0 SD indicates substantial similarity. Alignment of the most similar regions of several of the above-mentioned polypeptide sequences is shown in Fig. 2. Given the high homology of R. prowazekii MOMP with Cbf2 and PrsA (48.3 and 48.9% overall homology including conservative substitutions, respectively) and the similar sizes of these proteins, similarity of their properties would not be unexpected. In particular, alpha-helical conformation [31] and hydrophobicity [32] plots of the three proteins made using the GARNIER and SOAP programs (PC/Gene), respectively, are very similar (not shown), although PrsA has a significantly lower average hydropathy index (-8.23 against -3.68 and -4.62 for Plp and Cbf2). The PPIase of E. coli was not compared to the other polypeptides because of its much smaller size. Nevertheless, the 29.5-kD polypeptide has the highest homology, 58.7%, with parvulin (Fig. 2). Some properties of the homologous proteins inferred from primary structure were compared to those of the R. prowazekii 31-kD polypeptide presumably arising from the C terminus of an SPA precursor [8] and the E. coli porins LamB [23] and OmpF [24], which share many features [33]. The N terminus of the 31-kD MOMP was chosen by homology with the sequenced N terminus of the corresponding polypeptide from Rickettsia rickettsii [2]. While porins are known to be acidic proteins, Plp of R. prowazekii is almost neutral (predicted pI of 7.62) and the others are basic proteins. More than one third of the amino acid residues are apolar in all the proteins under investigation except for the relatively hydrophilic PrsA. None of the polypeptides has long hydrophobic sequences that could span the membrane. It should be pointed out that no homologies between these OMPs, including E. coli OmpA, were found other than those described above. In addition, none of the porin-specific features described by Nikaido and Wu [33] were found in the DNA-inferred Plp sequence. Plp, Cbf2, and PrsA are predominantly in alpha-helical conformation, whereas the E. coli porins and the R. prowazekii 31-kD protein have a relatively high content of beta-sheet structure as computed by both the AACOMP program (PC/Gene), based on amino acid composition, and the GARNIER program.
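A significance value expressed in SD units, like the 8.9 SD quoted above, comes from comparing the real alignment score with scores obtained after shuffling one of the sequences. The sketch below illustrates that general Monte-Carlo idea only; it is not PCOMPARE. It uses a plain ungapped count of identical positions instead of a scoring matrix, and the function names, the number of shuffles, the seed, and the toy sequence are all assumptions.

```python
import random
import statistics


def ungapped_identity_score(a, b):
    """Count positions at which two sequences carry the same residue."""
    return sum(x == y for x, y in zip(a, b))


def shuffle_z_score(a, b, n_shuffles=200, seed=0):
    """Score of the real pair in SD units above the mean of shuffled pairs."""
    rng = random.Random(seed)
    real = ungapped_identity_score(a, b)
    residues = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # keeps composition, destroys residue order
        shuffled_scores.append(ungapped_identity_score(a, residues))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores show no variation")
    return (real - mean) / sd


# Hypothetical toy sequence compared with itself (not data from the paper).
toy = "MKKLIAVSFALGDEDKVVATYNGQTW"
z = shuffle_z_score(toy, toy)
print(z > 3.0)
```

Because shuffling preserves amino acid composition, a high value indicates that the similarity depends on residue order rather than on composition alone.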

Figure 2

Fig. 2. Amino acid sequence alignment of R. prowazekii MOMP (R.p.), C. jejuni Cbf2 (C.j.), B. subtilis PrsA (B.s.), and E. coli parvulin (E.c.). The L. lactis PrtM sequence was not included for simplicity. The R. prowazekii polypeptide sequence was initially compared to each other sequence in pairwise fashion using PALIGN, then the individual alignments were combined. Dashes show gaps introduced during pairwise comparisons, and asterisks show gaps included subsequently to keep maximum homology. Identical amino acid residues are in frames, and similar residues are in boldface type. The following groups of similar residues were chosen: H, K, R; D, E; N, Q; C, G, S, T; F, W, Y; I, L, V, M, A, P. The pairs of numbers at bottom right corner show percentage of identity and similarity, respectively, of each sequence at the end of which they are located compared with the R. prowazekii Plp sequence.

Localization of the cloned 29.5-kD protein in E. coli cells. It was of particular interest to determine the localization of the cloned R. prowazekii MOMP in the E. coli cell. Cells were fractionated using approaches based on the conversion of gram-negative bacterial cells into spheroplasts with lysozyme in osmotically stabilized buffer. The distribution of malate dehydrogenase (marker for cytoplasm) and alkaline and acid phosphatase (markers for periplasm) activities was measured to check cross-contamination of the soluble fractions. Of the total malate dehydrogenase activity, 10-15% was found in the periplasmic fraction, and 3-5% of the total alkaline or acid phosphatase activity was found in the cytoplasmic fraction. The results of the cell fractionation experiments, presented as a Western blot analysis of subcellular fractions, are shown in Fig. 3a. An immunoreactive polypeptide with electrophoretic mobility close to that of the 29.5-kD antigen from the R. prowazekii outer membrane is localized mainly in the periplasmic space. A minute amount of Plp and a polypeptide with the apparent molecular weight of the Plp precursor are detected in the whole envelopes. We infer from these data that both export into the periplasm and processing of the R. prowazekii MOMP precursor proceed in E. coli.

Figure 3

Fig. 3. a) Immunoblotting of E. coli cells and cellular subfractions. E. coli JM109 containing plasmid pVE3 was fractionated as described in "Materials and Methods". Proteins were subjected to 10% SDS-PAGE followed by transfer onto a nitrocellulose membrane: 1) E. coli carrying vector plasmid; 2) E. coli [pVE3]; 3) cytoplasmic fraction; 4) membrane fraction; 5) periplasmic fraction of E. coli; 6) membrane vesicles of R. prowazekii [4]. The probable precursor form (filled arrow) and mature form (empty arrow) of Plp are shown. b) 10% SDS-PAGE and silver nitrate staining of recombinant and native Plp. E. coli cells carrying pQE-30 (1), pAEB (2), pAEE (3), and pAEV (4) were grown in the presence of 3 mM IPTG. Whole cells of R. prowazekii Breinl (5), Madrid E (6), and EVir (7). R. prowazekii 29.5-kD antigens are indicated (empty arrow). Before electrophoresis all the samples were boiled for 4 min.

Analysis of amplification products and Southern hybridization analysis using a plp gene probe. PCR-RFLP analysis of a DNA region encompassing the MOMP ORF was performed to detect the expected differences between the genes of the avirulent Madrid E and virulent Breinl R. prowazekii strains. PCR products were digested with MboII and SspI and then separated by 6% PAGE. Observed fragment sizes were in perfect agreement with those calculated from the sequence. Surprisingly, all the fragments of amplified DNA from pVE3 (not shown) and genomic DNA of the Breinl, Madrid E, and EVir strains coincided, except that one fragment arising from the extreme 3´-terminal region (beyond the ORF) was 1 nucleotide longer in the latter two strains than the corresponding 107-bp fragment in Breinl (Fig. 4a). Likewise, only this one difference was revealed using other frequently cutting restriction endonucleases. These results were confirmed by Southern hybridization analysis of the same digests (V. Emelyanov, unpublished observations). As shown in Fig. 4b, HindIII fragments of the genomic DNA of the three R. prowazekii strains with sizes of 3.1, 1.6, and 0.54 kb gave hybridization signals with the 1.644-kb NotI insert of the pVE3 plasmid labeled with digoxigenin-11-dUTP.

Figure 4

Fig. 4. a) Silver-stained 6% polyacrylamide gel of amplification products from genomic DNA of Breinl (B), Madrid E (E), and EVir (V) strains of R. prowazekii and of E. coli RY1090 (C) digested by MboII and SspI. Fragment sizes (in bp) of pUC19 cleaved by MspI (M) are on the left (markers). b) Southern hybridization analysis of the pVE3 plasmid (P) and genomic DNA (marked as above) digested by HindIII, subjected to electrophoresis in 0.8% agarose gel, transferred to Hybond N membrane, and hybridized with digoxigenin-labeled 1644-bp probe. Fragment sizes (in kilobase pairs) are indicated on the right.

Comparison of Plp cloned from different R. prowazekii strains. PCR products amplified from the DNA region of the Breinl, Madrid E, and EVir strains that codes for mature Plp were cloned into the in-frame expression vector pQE-30. The high expression level (50-100 µg per ml of cell suspension) allowed us to compare Plp directly by SDS-PAGE (Fig. 3b). While the native 29.5-kD MOMP of the Breinl strain has slightly lower electrophoretic mobility than the proteins of the Madrid E and EVir strains, the recombinant polypeptides have identical mobility. This finding agrees with the absence of deletions in plp shown by PCR-RFLP. Inserts of various clones were excised by BamHI and SmaI and compared by PCR-RFLP analysis as above. Again, only the above-mentioned difference between Breinl and both Madrid E and EVir, located beyond the ORF, was revealed. It should be noted that both denatured (SDS-PAGE) and native (dot-blot) cloned 29.5-kD polypeptides reacted with monospecific antibody to Plp, but neither reacted with convalescent sera (data not shown).

SecA preprotein translocase homolog in R. prowazekii. In the course of the search for protein similarities using the BLAST mail server, the sequence of 79 amino acid residues encoded by translation frame -1 of the 1644-bp NotI fragment was found to be highly homologous to known members of the SecA subunit family of preprotein translocase including those of E. coli [34], B. subtilis (accession number P28366), Listeria monocytogenes (accession number L32090), chloroplasts (accession number X82404), and others. This rickettsial sequence is encoded by the 5' terminus of an ORF that begins with the ATG start codon at position 237 to 235 on the other DNA strand relative to plp, and proceeds continuously in the opposite direction. Figure 5 represents alignments of nucleotide and deduced amino acid sequences of a rickettsial segment with those of the respective 5'-terminal part of the most homologous E. coli secA. The best match could be achieved by deleting the third codon from the R. prowazekii sequence. Homology of the nucleotide sequences was shown to be 51.25% (NALIGN program, PC/Gene). Alignment of amino acid sequences was completed using the PALIGN program. The R. prowazekii and E. coli polypeptide sequences share 48.1% of identical and 19% of 'similar' residues, hence, overall homology approaches 67.1%. Like plp, the incomplete rickettsial secA ORF displayed obvious preference for A and T in the wobble position (75%). At the same amino acid positions and even for the same amino acid residues R. prowazekii uses triplets rich in A and T much more frequently. It is striking, for example, that out of 10 basic (Arg and Lys) amino acids in the same position, four conservative replacements of Arg for Lys (AAA) are found in the rickettsial sequence.
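Coordinates quoted in descending order, such as an ATG at position 237 to 235, describe a codon read on the strand complementary to the one that carries plp. The sketch below shows how such coordinates map onto the reverse complement and checks the arithmetic of the quoted overall homology. The function names and the toy sequence are hypothetical; no real rickettsial sequence is used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def minus_strand_index(top_pos, length):
    """Map a 1-based top-strand position to its 1-based position
    in the reverse-complemented sequence."""
    return length - top_pos + 1


FRAGMENT_LENGTH = 1644  # size of the NotI insert (from the text)
start_on_minus = minus_strand_index(237, FRAGMENT_LENGTH)

# Toy example: CAT at top-strand positions 3..5 reads as ATG on the
# opposite strand, i.e. an ATG "at position 5 to 3".
toy_top = "GGCATTTT"
toy_minus = reverse_complement(toy_top)
atg_pos_on_minus = toy_minus.find("ATG") + 1  # 1-based

# Overall homology as the sum of identical and similar residues (from the text).
overall_homology = 48.1 + 19.0
print(start_on_minus, toy_minus, atg_pos_on_minus, round(overall_homology, 1))
```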

Figure 5

Fig. 5. Alignment of the inferred amino acid sequence and non-coding nucleotide sequence of the opposite (relative to plp) strand of the 1644-bp DNA fragment (from nucleotide 237 to 1) with the 5'(N)-terminal segment of E. coli secA. Symbol 'plus' indicates conservative amino acid residues. Groups of similarity are the same as in Fig. 2.


DISCUSSION

The MOMPs of R. prowazekii with molecular weights of 134 (SPA), 31, and 29.5 kD are assumed to be largely involved in rickettsia--host cell interaction and immune response [1, 3]. Based on observed interstrain variability, it was concluded that the 29.5-kD antigen may be an important virulence factor of R. prowazekii [3]. Cloning and expression of the plp gene in E. coli provided the opportunity to characterize this gene and to deduce the primary structure of the 29.5-kD MOMP. In this paper we report the nucleotide sequence of plp and features of the gene product. It was found that the N-terminal region of the protein sequence (282 amino acid residues) deduced from the DNA has all the characteristic features of a signal peptide. Specifically, the extreme N terminus contains two positively charged Lys residues followed by 16 uncharged residues. The tripeptide Ala-Phe-Gly at the C terminus of the putative prepeptide conforms to a consensus sequence of the leader peptidase I cleavage site [25, 26]. The deduced mature protein sequence begins with a group of three negatively charged residues that can further proper orientation of the signal sequence in a polarized cytoplasmic membrane as proposed by the 'loop' model [25]. Initiation of translation from any other ATG within the corresponding ORF is thought to be less probable considering that a polypeptide with either of two other downstream Met residues at the extreme N terminus would be devoid of a leader peptide. In addition, despite the fact that the ATGAG stretch six nucleotides before the putative translation initiation codon has little homology with the ribosome-binding site consensus sequence AGGAGGT [35], there are no structures resembling a ribosome-binding site upstream from other ATG triplets. Cell fractionation experiments show that the protein is secreted into the periplasm of E. coli and processed, and only a small amount of Plp precursor remains associated with the membranes (Fig. 3a). 
We suppose that early steps of preprotein export are similar in E. coli and R. prowazekii and even use homologous factors. Improper localization of R. prowazekii MOMP in the E. coli cell may be due to distinctive properties of this rickettsial protein (see below). Furthermore, late stages of protein export and insertion into the outer membrane may differ in E. coli and R. prowazekii. Different microorganisms have evolved special machineries for export of extracellular proteins and subunits of cell appendages. Mechanisms for export of OMPs and their insertion into the outer membrane in gram-negative bacteria remain, however, an enigma. Different models have been proposed which assume specific topogenic sequences in the mature part of the OMP, co-export of proteins and LPS, and movement to the outer membrane through adhesion zones [36]. It was previously suggested that the rickettsial 29.5-kD antigen, but not the antigen synthesized in E. coli, reacts with immunoglobulins of human convalescent sera because of its possible linkage with LPS of R. prowazekii [11]. One can suppose that some signals (topogenic sequences) in mature Plp target it to the R. prowazekii outer membrane, perhaps via specific interaction with LPS. Whether or not the periplasmic space of R. prowazekii is a normal route of protein traffic to the outer membrane is unknown. It is of interest that another rickettsial OMP, the 17-kD lipoprotein of Rickettsia rickettsii, has been found in the outer membrane of E. coli using sarcosyl solubilization of the cytoplasmic membrane [37].

Given the heat-modifiable character, relative size similarity, and abundance of the 31- and 29.5-kD MOMPs in the cell, we could expect them to share at least some properties with MOMPs of other gram-negative bacteria. It was shown, however, that Plp differs significantly from porins and OmpA of E. coli in both primary and predicted secondary structure. The 31-kD polypeptide may resemble porins in secondary structure, but it is a strongly basic polypeptide. Thus, the obligate intracellular bacterium R. prowazekii is unlikely to have proteins similar to the MOMPs of, at least, free-living E. coli. When expressed in E. coli, Plp appears to resemble soluble polypeptides, which agrees, in particular, with its localization in the periplasm. Interestingly, it lacks the heat-modifiability inherent to the native protein (data not shown). Heat-modifiability is thought to reflect differences in the amount of SDS bound to beta-structured regions of OMP at different temperatures [33, 38]. Because Plp was predicted to adopt a mainly alpha-helical conformation, in contrast with E. coli OMP, its heat-modifiability may depend on linkage to LPS [5, 11]. It is known, for example, that LPSs facilitate conversion of E. coli OmpA to a heat-modifiable form [38]. Indeed, it was recently shown that the 29.5-kD MOMP forms complexes with carbohydrates in the heat-modified but not the unmodified state (V. Emelyanov, unpublished results). It should be noted that Plp has substantially higher electrophoretic mobility in the heat-unmodified state than in the heat-modified state (this is a characteristic of heat-modifiable envelope proteins). We suggest that MOMP undergoes some post-translational modification(s) in R. prowazekii (see also below), resulting in enhanced mobility of its heat-unmodified LPS-free form, whereas boiling before SDS-PAGE leads to its close association with LPS and to the appearance of a heat-modified form nearly coinciding (occasionally) in mobility with its recombinant form. It should be pointed out that the theoretical pI value of mature Plp (7.62) differs from the value of 5.0 determined experimentally by Oaks et al. [7]. Various modifications of the protein such as phosphorylation, deamidation, and tight linkage to carbohydrates might account for this difference.
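To illustrate how a theoretical pI is obtained from a sequence and why a modification such as deamidation would lower it, the following is a minimal sketch only. The pKa values are approximate illustrative values (an assumption; published scales differ by several tenths of a pH unit), the peptide is a hypothetical toy sequence rather than Plp, and the program the authors used to obtain 7.62 is not specified in the text.

```python
# Approximate side-chain and terminal pKa values. These are ASSUMED,
# illustrative numbers; other published scales give somewhat different pI.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 2.0


def net_charge(seq: str, ph: float) -> float:
    """Henderson-Hasselbalch net charge of an unmodified polypeptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))   # free N terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))  # free C terminus
    for aa in seq.upper():
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def theoretical_pi(seq: str, tol: float = 1e-4) -> float:
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical toy peptide (NOT the Plp sequence).
toy = "MKKLNNAGRNQKEV"
# Deamidation converts Asn (N) to Asp (D), adding acidic groups.
toy_deamidated = toy.replace("N", "D")

pi_native = theoretical_pi(toy)
pi_deamidated = theoretical_pi(toy_deamidated)
print(f"toy peptide pI: {pi_native:.2f}; after deamidation: {pi_deamidated:.2f}")
```

Each added acidic group shifts the computed pI downward. This is the direction of the discrepancy between the theoretical and experimental values discussed above, although the sketch says nothing about which modification actually occurs.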

A global search for similarity showed that the 29.5-kD protein has a high level of homology with the parvulin family PPIase of E. coli [27] and related proteins (Fig. 2). These are Cbf2 of C. jejuni [28], PrsA of B. subtilis, and PrtM of L. lactis. The latter two proteins are engaged in late stages of protein excretion and maturation in the corresponding gram-positive bacteria [29, 30]. Compared to parvulin, the parvulin-like polypeptides have long terminal extensions which may be involved in protein traffic and arrangement in the outer membrane. Interestingly, despite the rather high similarity of the above proteins, the bacteria themselves belong to distant bacterial phyla. R. prowazekii, E. coli, and C. jejuni are from the alpha, gamma, and delta subdivisions of Proteobacteria, respectively, while B. subtilis and L. lactis are gram-positive bacteria with low G + C content [39]. The homology of Plp and Cbf2 of the two pathogenic gram-negative microorganisms, which approaches 47.1 and 48.3% at the DNA and protein levels, respectively, is most intriguing (Fig. 2). Cbf2, previously designated PEB4, is not the main adherence factor but facilitates adhesion of C. jejuni to eukaryotic cells [40]. If the 29.5-kD protein and Cbf2 indeed possess PPIase activity, this raises the possibility that both these factors act to change the conformation of some bacterial or host cell surface components, thereby assisting adherence. Fischer et al. have shown that the macrophage infectivity potentiator (Mip) of Legionella pneumophila is a PPIase belonging to another family, the FK506-binding proteins [41]. The main biological feature distinguishing virulent R. prowazekii strains from the vaccine Madrid E strain is high infectivity for macrophages and macrophage-like cells [7]. One can suppose that the low virulence of the Madrid E strain depends on altered activity of Plp based on its structural difference from the protein of the Breinl strain. It should be noted, however, that no differences in the 29.5-kD protein of the Madrid E strain and its virulent derivative EVir were revealed by SDS-PAGE at any pretreatment temperature (V. Emelyanov, unpublished data; see also Fig. 3b, lanes 6 and 7). Surprisingly, despite the clear difference in mobility of the 29.5-kD antigens from the Madrid E and Breinl strains corresponding to a molecular mass difference of ~0.5 kD (Fig. 3b, lanes 5 and 6), restriction analysis of amplification products (Fig. 4a) revealed no differences in the ORF encoding this polypeptide which could account for variability in protein size. Moreover, we have shown that the Plp proteins cloned in E. coli from three R. prowazekii strains coincide in SDS-PAGE mobility, in contrast to their counterparts from rickettsiae (Fig. 3b, lanes 2-4 versus 5-7). We suggest, therefore, that various post-translational modifications are responsible for the apparent difference in the Plp of Breinl and Madrid E, and perhaps for a less drastic difference (if any) in the Plp of the latter and EVir. Mutational substitution(s) of amino acid residues also cannot be excluded. Comparative studies of the 29.5-kD polypeptide structure and putative PPIase activity are needed to answer these questions.
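The reason identical restriction patterns do not exclude amino acid substitutions can be shown with a small in silico digest. This is a sketch under stated assumptions: the two amplicons are hypothetical toy sequences, and the recognition sites of HindIII (A^AGCTT) and ClaI (AT^CGAT) serve only as examples, since the enzymes applied to the PCR products are not named in this section.

```python
# Recognition site and top-strand cut offset for two example enzymes.
ENZYMES = {"HindIII": ("AAGCTT", 1), "ClaI": ("ATCGAT", 2)}


def fragment_lengths(seq: str, enzyme: str) -> list:
    """Sorted fragment lengths from a complete digest of a linear DNA molecule."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))


# Hypothetical toy amplicons differing by one base (T -> C) outside both sites.
amplicon_a = "GGATCCAAGCTTGCTAGCTAGGATCGATTTACGGCATGC"
amplicon_b = "GGATCCAAGCTTGCCAGCTAGGATCGATTTACGGCATGC"

for name in ENZYMES:
    print(name, fragment_lengths(amplicon_a, name), fragment_lengths(amplicon_b, name))
```

The two toy sequences differ, yet both digests give the same fragment lengths. A PCR-RFLP comparison therefore detects only changes that create or destroy a recognition site or alter fragment length, and sequencing is needed to rule out point substitutions.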

Southern hybridization analysis using HindIII (Fig. 4b) and ClaI (not shown) revealed that plp is present in one copy in the R. prowazekii genome. The R. prowazekii plp gene evidently consists of one cistron. First, a hairpin-like structure with characteristic features of a transcription terminator [42] lies immediately downstream from the ORF. Second, there is no ORF in the vicinity of the one encoding the 29.5-kD polypeptide. A DNA insert from a clone expressing the full-length polypeptide only upon IPTG induction (lambda gt11B5) was sequenced at the corresponding terminus in an effort to localize the transcription initiation site operating in E. coli. Surprisingly, two promoter-like structures upstream from the ORF were also present in this fragment, while the 1644-bp fragment from a clone expressing Plp constitutively (lambda gt11B3) contains in addition only a potential TATA-box (nucleotides 290-295). Which structure(s) act as promoter(s) in R. prowazekii and E. coli remains an open question. It should be noted, however, that the origin of another ORF was revealed on the opposite DNA strand, from position 237 to the end of the 1644-bp insert. The nucleotide sequences 310-305 and 287-282, located upstream of this occasionally interrupted ORF and overlapping with the putative TATA-box, possess substantial homology to the -35 and -10 consensus sequences, respectively. We suggest that the intergenic region may contain a 'divergent' promoter that acts in both directions. In this connection, the prtP gene of L. lactis encoding a cell envelope-attached proteinase and the prtM gene encoding the above-mentioned parvulin-like protein involved in the maturation of PrtP are transcribed from a single promoter region in opposite directions [30].
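Coordinates written in descending order, such as 310-305, denote hexamers read 5' to 3' on the opposite strand. The sketch below shows how such a hexamer can be extracted and scored against the E. coli sigma-70 consensus sequences TTGACA (-35) and TATAAT (-10). The top-strand sequence and its coordinates are hypothetical (chosen to keep a 17-bp spacer, as between 305 and 287), and the match count is an illustrative measure, not the authors' criterion for "substantial homology".

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
CONSENSUS_35 = "TTGACA"  # E. coli sigma-70 -35 consensus
CONSENSUS_10 = "TATAAT"  # E. coli sigma-70 -10 consensus


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def opposite_strand_segment(top: str, high: int, low: int) -> str:
    """Segment spanning top-strand positions high..low (1-based), read on the opposite strand."""
    return reverse_complement(top[low - 1:high])


def matches(hexamer: str, consensus: str) -> int:
    """Number of positions at which the hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


# Hypothetical 45-nt top strand: positions 12-17 and 35-40 carry the
# reverse complements of a -10-like and a -35-like hexamer, 17 bp apart.
toy_top = "G" * 11 + "ATTATA" + "C" * 17 + "TATCAA" + "G" * 5

minus35 = opposite_strand_segment(toy_top, 40, 35)
minus10 = opposite_strand_segment(toy_top, 17, 12)
print(minus35, matches(minus35, CONSENSUS_35))
print(minus10, matches(minus10, CONSENSUS_10))
```

Because the hexamers lie on the opposite strand, a promoter read this way drives transcription away from plp, which is what a divergent promoter arrangement requires.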

A database search for similarity showed that the protein sequence encoded by the above-mentioned ORF is highly homologous to the corresponding N-terminal parts of members of the SecA subunit superfamily of the preprotein translocase complex (Fig. 5). The striking homology (67.1%) with the N terminus of E. coli SecA exceeds the level of DNA homology, which evidently reflects the different base composition of the R. prowazekii and E. coli genomes. We assume that the SecA subunit is under strong selective pressure. It should be pointed out that the ORF coding for SecA of E. coli is preceded by another ORF within the same operon [34]. Interestingly, the plp gene encoding a possible PPIase that may function in protein export [36] and the secA gene are positioned together on the R. prowazekii chromosome or even constitute a divergently transcribed operon. Identification of the regulatory elements of both the plp and secA genes functioning in intracellular bacteria, as well as cloning and characterization of the R. prowazekii secA, await further investigation.
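For readers who wish to reproduce percentage figures of this kind, the following sketch computes percent identity over the aligned (non-gap) columns of a pairwise alignment. The text does not state whether the reported values count identical residues only or also conservative replacements, so the identity-only definition here is an assumption, and the two aligned fragments are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical residues as a percentage of columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * identical / compared


# Hypothetical aligned fragments ('-' marks a gap); not the real SecA sequences.
frag_a = "MLSK-LAT"
frag_b = "MISKVLGT"
print(f"{percent_identity(frag_a, frag_b):.1f}% identity")
```

The same function applies to aligned DNA. Protein identity can exceed DNA identity for the same region because synonymous codons, favored differently in genomes of different base composition, change the nucleotides without changing the encoded residue.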

In conclusion, we determined the nucleotide sequence of the gene of the virulent R. prowazekii Breinl strain for a putative virulence factor, the 29.5-kD OMP, but PCR-RFLP revealed no expected difference from the gene of the attenuated Madrid E strain. Most importantly, we have found it to code for a polypeptide with extensive homology to peptidyl-prolyl cis/trans isomerase and related proteins that may catalyze a change in conformation of proteins localized on cell surfaces. These results may contribute to an understanding of the molecular basis for obligate intracellular parasitism and virulence.1

We thank A. O. Zhigoulin for performing PCR. We are also grateful to G. Fischer and J.-U. Rahfeld from Max-Planck-Gesellschaft (Halle, Germany) for helpful discussions.

1 After this work was sent to Biochemistry (Moscow), an article by S. G. E. Andersson et al. appeared in Nature (396, 133-140) on the sequencing of the complete genome of the R. prowazekii Madrid E strain. The reported nucleotide sequence is available in databases. The only difference of this sequence from the sequence of the 1644-bp fragment (Breinl strain) we determined is an additional nucleotide (adenine) within a transcription terminator loop (see "Analysis of amplification products" in "Results").


REFERENCES

1.Dasch, G. A., Burans, J. P., Dobson, M. E., Jaffe, R. I., and Sewell, W. G. (1985) in Rickettsiae and Rickettsial Diseases (Kazar, J., ed.) Slovak Academy of Science, Bratislava, pp. 54-61.
2.Hackstadt, T., Messer, R., Cieplak, W., and Peacock, W. G. (1992) Infect. Immun., 60, 159-165.
3.Oaks, E. V. (1983) Dissert. Abstr. Int., 44, 1012-B.
4.Emelyanov, V. V. (1992) Biokhimiya, 57, 1196-1205.
5.Osterman, J. V., and Eisemann, C. S. (1978) Infect. Immun., 21, 866-873.
6.Smith, D. K., and Winkler, H. H. (1979) J. Bacteriol., 137, 963-971.
7.Oaks, E. V., Wisseman, C. L., Jr., and Smith, J. F. (1981) in Rickettsiae and Rickettsial Diseases (Anacker, R. L., and Burgdorfer, W., eds.) Academic Press Inc., N. Y., pp. 461-472.
8.Carl, M., Dobson, M. E., Ching, W.-M., and Dasch, G. A. (1990) Proc. Natl. Acad. Sci. USA, 87, 8237-8241.
9.Emelyanov, V. V. (1995) Abst. 7th Europ. Congr. of Clinical Microbiology and Infectious Diseases, Vienna, p. 92.
10.Smith, D. K., and Winkler, H. H. (1980) Infect. Immun., 29, 831-834.
11.Emelyanov, V. V. (1993) Microb. Pathogenesis, 15, 7-16.
12.Young, R. A., and Davis, R. W. (1983) Science, 222, 778-782.
13.Yanisch-Perron, C., Vieira, J., and Messing, J. (1985) Gene, 33, 103-119.
14.Rodionov, A. V., Eremeeva, M. E., and Balayeva, N. M. (1991) Acta Virol., 35, 557-565.
15.Silhavy, T. J., Berman, M. L., and Enquist, L. W. (1984) Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N. Y.
16.Maniatis, T., Fritsch, E. F., and Sambrook, J. (1984) Molecular Cloning [Russian translation], Mir, Moscow.
17.Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA, 74, 5463-5467.
18.Randall, L. L., and Hardy, S. J. C. (1986) Cell, 46, 921-928.
19.Bergmeyer, H. U., Gawehn, K., and Grassl, M. (1974) in Methods of Enzymatic Analysis, Vol. 1 (Bergmeyer, H. U., ed.) Verlag Chemie, Weinheim, pp. 425-522.
20.Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990) J. Mol. Biol., 215, 403-410.
21.Williamson, L. R., Plano, G. V., Winkler, H. H., Krause, D. C., and Wood, D. O. (1989) Gene, 80, 269-278.
22.McClure, W. R. (1985) Annu. Rev. Biochem., 54, 171-204.
23.Clement, J. M., and Hofnung, M. (1980) Cell, 27, 507-514.
24.Inokuchi, K., Mutoh, K., Matsuyama, S.-i., and Mizushima, S. (1982) Nucleic Acids Res., 10, 6957-6967.
25.Perlman, D., and Halvorson, H. O. (1983) J. Mol. Biol., 167, 391-409.
26.Von Heijne, G. (1983) Eur. J. Biochem., 133, 17-21.
27.Rahfeld, J.-U., Rucknagel, K. P., Schelbert, B., Ludwig, B., Hacker, J., Mann, K., and Fischer, G. (1994) FEBS Lett., 352, 180-184.
28.Burucoa, C., Fremaux, C., Pei, Z., Murali, T., Blaser, M., and Fauchere, J. L. (1995) EMBL/GenBank/DDBJ Database. Accession No. X 84703.
29.Kontinen, V. P., Saris, P., and Sarvas, M. (1991) Mol. Microbiol., 5, 1273-1283.
30.Vos, P., van Asseldonk, M., van Jeveren, F., Siezen, R., Simons, G., and de Vos, M. (1989) J. Bacteriol., 171, 2795-2802.
31.Garnier, J., Osguthorpe, D., and Robson, B. (1978) J. Mol. Biol., 120, 97-120.
32.Kyte, J., and Doolittle, R. F. (1982) J. Mol. Biol., 157, 105-132.
33.Nikaido, H., and Wu, H. C. P. (1984) Proc. Natl. Acad. Sci. USA, 81, 1048-1052.
34.Schmidt, M. G., Rollo, E. E., Grodberg, J., and Oliver, D. B. (1988) J. Bacteriol., 170, 3404-3414.
35.Shine, J., and Dalgarno, L. (1974) Proc. Natl. Acad. Sci. USA, 71, 1342-1346.
36.Pugsley, A. P. (1993) Microbiol. Rev., 57, 50-108.
37.Anderson, B. E., Regnery, R. L., Carlone, G. M., Tzianabos, T., MacDade, J. E., Fu, Z. Y., and Bellini, W. J. (1987) J. Bacteriol., 169, 2385-2390.
38.Freudl, R., Schwarz, H., Stierhof, Y.-D., Gamon, K., Hindennach, I., and Henning, U. (1986) J. Biol. Chem., 261, 11355-11361.
39.Olsen, G. J., Woese, C. R., and Overbeek, R. (1994) J. Bacteriol., 176, 1-6.
40.Kervella, M., Page, J. M., Pei, Z., Grollier, G., Blaser, M. J., and Fauchere, J. L. (1993) Infect. Immun., 61, 3440-3448.
41.Fischer, G., Bang, H., Ludwig, B., Mann, K., and Hacker, J. (1992) Mol. Microbiol., 6, 1375-1383.
42.Platt, T. (1981) Cell, 24, 10-23.