Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes.

PubWeight™: 66.87 | Rank: Top 0.01% | All-Time Top 1000

PMID: 11152613

Published in J Mol Biol on January 19, 2001


A Krogh (1), B Larsson, G von Heijne, E L Sonnhammer

Author Affiliations

1: Center for Biological Sequence Analysis, Technical University of Denmark, Building 208, 2800 Lyngby, Denmark.

Articles citing this

(truncated to the top 100)

The Pfam protein families database. Nucleic Acids Res (2002) 51.34

Human non-synonymous SNPs: server and survey. Nucleic Acids Res (2002) 50.45

Genome sequence of the human malaria parasite Plasmodium falciparum. Nature (2002) 37.89

Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res (2002) 25.06

The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res (2003) 24.72

NetAffx: Affymetrix probesets and annotations. Nucleic Acids Res (2003) 14.37

The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol (2007) 13.99

MaGe: a microbial genome annotation system supported by synteny results. Nucleic Acids Res (2006) 12.48

Insights on evolution of virulence and resistance from the complete genome analysis of an early methicillin-resistant Staphylococcus aureus strain and a biofilm-producing methicillin-resistant Staphylococcus epidermidis strain. J Bacteriol (2005) 8.72

InterProScan 5: genome-scale protein function classification. Bioinformatics (2014) 8.52

Subcellular localization of the yeast proteome. Genes Dev (2002) 7.93

Escherichia coli K-12: a cooperatively developed annotation snapshot--2005. Nucleic Acids Res (2006) 7.93

A forty-kilodalton protein of the inner membrane is the mitochondrial calcium uniporter. Nature (2011) 7.44

Protein structure prediction and analysis using the Robetta server. Nucleic Acids Res (2004) 7.39

Prediction of lipoprotein signal peptides in Gram-negative bacteria. Protein Sci (2003) 6.85

WormBase: better software, richer content. Nucleic Acids Res (2006) 6.78

PlasmoDB: a functional genomic database for malaria parasites. Nucleic Acids Res (2008) 6.73

Proteomic analysis of a eukaryotic cilium. J Cell Biol (2005) 6.54

Integrative genomics identifies MCU as an essential component of the mitochondrial calcium uniporter. Nature (2011) 6.48

Variant ionotropic glutamate receptors as chemosensory receptors in Drosophila. Cell (2009) 6.16

Biochemical and genetic analysis of the yeast proteome with a movable ORF collection. Genes Dev (2005) 6.14

The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res (2014) 6.00

Atypical membrane topology and heteromeric function of Drosophila odorant receptors in vivo. PLoS Biol (2006) 5.91

The Ensembl analysis pipeline. Genome Res (2004) 5.90

Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. PLoS Biol (2006) 5.44

Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server. Nucleic Acids Res (2007) 5.29

Whole genome comparisons of serotype 4b and 1/2a strains of the food-borne pathogen Listeria monocytogenes reveal new insights into the core genome components of this species. Nucleic Acids Res (2004) 4.97

Transmembrane protein topology prediction using support vector machines. BMC Bioinformatics (2009) 4.67

Complete genome sequence of the oral pathogenic bacterium Porphyromonas gingivalis strain W83. J Bacteriol (2003) 4.41

Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics (2009) 4.31

Unique features revealed by the genome sequence of Acinetobacter sp. ADP1, a versatile and naturally transformation competent bacterium. Nucleic Acids Res (2004) 4.19

Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release. BMC Biol (2005) 4.18

Protein interaction mapping: a Drosophila case study. Genome Res (2005) 4.15

Assembling the marine metagenome, one cell at a time. PLoS One (2009) 4.10

The Institute for Genomic Research Osa1 rice genome annotation database. Plant Physiol (2005) 3.96

Protein abundance profiling of the Escherichia coli cytosol. BMC Genomics (2008) 3.78

The Lgr5 intestinal stem cell signature: robust expression of proposed quiescent '+4' cell markers. EMBO J (2012) 3.70

Transmembrane topology and signal peptide prediction using dynamic bayesian networks. PLoS Comput Biol (2008) 3.56

Genomic analysis of the necrotrophic fungal pathogens Sclerotinia sclerotiorum and Botrytis cinerea. PLoS Genet (2011) 3.52

The PEDANT genome database. Nucleic Acids Res (2003) 3.47

Gramene: a bird's eye view of cereal genomes. Nucleic Acids Res (2006) 3.47

TOPCONS: consensus prediction of membrane protein topology. Nucleic Acids Res (2009) 3.45

Complete genome sequence of the prototype lactic acid bacterium Lactococcus lactis subsp. cremoris MG1363. J Bacteriol (2007) 3.43

The architecture of respiratory complex I. Nature (2010) 3.43

The vegetative vacuole proteome of Arabidopsis thaliana reveals predicted and unexpected proteins. Plant Cell (2004) 3.41

Genome sequence of Avery's virulent serotype 2 strain D39 of Streptococcus pneumoniae and comparison with that of unencapsulated laboratory strain R6. J Bacteriol (2006) 3.40

Best alpha-helical transmembrane protein topology predictions are achieved using hidden Markov models and evolutionary information. Protein Sci (2004) 3.40

A census of membrane-bound and intracellular signal transduction proteins in bacteria: bacterial IQ, extroverts and introverts. BMC Microbiol (2005) 3.38

Widespread protein aggregation as an inherent part of aging in C. elegans. PLoS Biol (2010) 3.36

Evolution of sensory complexity recorded in a myxobacterial genome. Proc Natl Acad Sci U S A (2006) 3.35

Expanded protein information at SGD: new pages and proteome browser. Nucleic Acids Res (2006) 3.31

Genome sequence of Babesia bovis and comparative analysis of apicomplexan hemoprotozoa. PLoS Pathog (2007) 3.27

Non-classical protein secretion in bacteria. BMC Microbiol (2005) 3.17

Insights into genome plasticity and pathogenicity of the plant pathogenic bacterium Xanthomonas campestris pv. vesicatoria revealed by the complete genome sequence. J Bacteriol (2005) 3.12

The TIGR rice genome annotation resource: annotating the rice genome and creating resources for plant biologists. Nucleic Acids Res (2003) 3.12

TransportDB: a comprehensive database resource for cytoplasmic membrane transport systems and outer membrane channels. Nucleic Acids Res (2006) 3.10

Dissecting the bacterial type VI secretion system by a genome wide in silico analysis: what can be learned from available microbial genomic resources? BMC Genomics (2009) 3.08

An ancestral oomycete locus contains late blight avirulence gene Avr3a, encoding a protein that is recognized in the host cytoplasm. Proc Natl Acad Sci U S A (2005) 3.06

Functional proteomics mapping of a human signaling pathway. Genome Res (2004) 3.04

Ancient protostome origin of chemosensory ionotropic glutamate receptors and the evolution of insect taste and olfaction. PLoS Genet (2010) 3.03

The genome of the simian and human malaria parasite Plasmodium knowlesi. Nature (2008) 3.02

The genome of Rhizobium leguminosarum has recognizable core and accessory components. Genome Biol (2006) 2.93

Whole-genome sequence of Schistosoma haematobium. Nat Genet (2012) 2.91

Complete genome sequence of the N2-fixing broad host range endophyte Klebsiella pneumoniae 342 and virulence predictions verified in mice. PLoS Genet (2008) 2.90

Structure of class B GPCR corticotropin-releasing factor receptor 1. Nature (2013) 2.87

Sequence and genetic map of Meloidogyne hapla: A compact nematode genome for plant parasitism. Proc Natl Acad Sci U S A (2008) 2.85

The complete genome sequence of Haloferax volcanii DS2, a model archaeon. PLoS One (2010) 2.84

Tung tree DGAT1 and DGAT2 have nonredundant functions in triacylglycerol biosynthesis and are localized to different subdomains of the endoplasmic reticulum. Plant Cell (2006) 2.81

PDB_TM: selection and membrane localization of transmembrane proteins in the protein data bank. Nucleic Acids Res (2005) 2.80

Multipass membrane protein structure prediction using Rosetta. Proteins (2006) 2.79

Identification and characterization of the Arabidopsis PHO1 gene involved in phosphate loading to the xylem. Plant Cell (2002) 2.79

Principles of ER cotranslational translocation revealed by proximity-specific ribosome profiling. Science (2014) 2.78

De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis. PLoS Genet (2010) 2.77

Mutations in STRA6 cause a broad spectrum of malformations including anophthalmia, congenital heart defects, diaphragmatic hernia, alveolar capillary dysplasia, lung hypoplasia, and mental retardation. Am J Hum Genet (2007) 2.74

Identification of ciliary localization sequences within the third intracellular loop of G protein-coupled receptors. Mol Biol Cell (2008) 2.73

The genome of Burkholderia cenocepacia J2315, an epidemic pathogen of cystic fibrosis patients. J Bacteriol (2008) 2.71

Mass-spectrometric identification and relative quantification of N-linked cell surface glycoproteins. Nat Biotechnol (2009) 2.70

Insights into the genome of large sulfur bacteria revealed by analysis of single filaments. PLoS Biol (2007) 2.69

Mouse proteome analysis. Genome Res (2003) 2.69

CryptoDB: a Cryptosporidium bioinformatics resource update. Nucleic Acids Res (2006) 2.68

Diverse lifestyles and strategies of plant pathogenesis encoded in the genomes of eighteen Dothideomycetes fungi. PLoS Pathog (2012) 2.66

A computational genomics pipeline for prokaryotic sequencing projects. Bioinformatics (2010) 2.63

A complete sequence of the T. tengcongensis genome. Genome Res (2002) 2.60

MycoCosm portal: gearing up for 1000 fungal genomes. Nucleic Acids Res (2013) 2.60

PhosphoPep--a phosphoproteome resource for systems biology research in Drosophila Kc167 cells. Mol Syst Biol (2007) 2.56

The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4). Stand Genomic Sci (2015) 2.55

Prediction of membrane-protein topology from first principles. Proc Natl Acad Sci U S A (2008) 2.55

The genome of S-PM2, a "photosynthetic" T4-type bacteriophage that infects marine Synechococcus strains. J Bacteriol (2005) 2.52

Exome-wide association study identifies a TM6SF2 variant that confers susceptibility to nonalcoholic fatty liver disease. Nat Genet (2014) 2.49

Central functions of the lumenal and peripheral thylakoid proteome of Arabidopsis determined by experimentation and genome-wide prediction. Plant Cell (2002) 2.48

Transcript annotation in FANTOM3: mouse gene catalog based on physical cDNAs. PLoS Genet (2006) 2.48

Lymphopenia in the BB rat model of type 1 diabetes is due to a mutation in a novel immune-associated nucleotide (Ian)-related gene. Genome Res (2002) 2.45

The genome of deep-sea vent chemolithoautotroph Thiomicrospira crunogena XCL-2. PLoS Biol (2006) 2.43

The genome of Salinibacter ruber: convergence and gene exchange among hyperhalophilic bacteria and archaea. Proc Natl Acad Sci U S A (2005) 2.43

Local slowdown of translation by nonoptimal codons promotes nascent-chain recognition by SRP in vivo. Nat Struct Mol Biol (2014) 2.43

PairsDB atlas of protein sequence space. Nucleic Acids Res (2007) 2.42

Complete genome sequence of the industrial bacterium Bacillus licheniformis and comparisons with closely related Bacillus species. Genome Biol (2004) 2.41

Identification of the prokaryotic ligand-gated ion channels and their implications for the mechanisms and origins of animal Cys-loop ion channels. Genome Biol (2004) 2.39

Rice phosphate transporters include an evolutionarily divergent gene specifically activated in arbuscular mycorrhizal symbiosis. Proc Natl Acad Sci U S A (2002) 2.39

EMRE is an essential component of the mitochondrial calcium uniporter complex. Science (2013) 2.39

Articles by these authors

(truncated to the top 100)

The Pfam protein families database. Nucleic Acids Res (2000) 42.28

Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng (1997) 38.38

A new method for predicting signal sequence cleavage sites. Nucleic Acids Res (1986) 37.19

Patterns of amino acids near signal-sequence cleavage sites. Eur J Biochem (1983) 26.68

Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol (2000) 22.77

Signal sequences. The limits of variation. J Mol Biol (1985) 21.24

Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J Mol Biol (2001) 16.47

A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol (1998) 14.18

Volume changes in protein evolution. J Mol Biol (1994) 12.07

Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins. Nucleic Acids Res (1999) 11.64

How signal sequences maintain cleavage specificity. J Mol Biol (1984) 11.03

TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci (1994) 10.03

ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Protein Sci (1999) 9.82

Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Eng (1997) 8.25

Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci (1998) 7.88

A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Int J Neural Syst (1999) 7.40

Mitochondrial targeting sequences may form amphiphilic helices. EMBO J (1986) 7.11

Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng (1999) 6.72

Abdominal adipose tissue distribution, obesity, and risk of cardiovascular disease and death: 13 year follow up of participants in the study of men born in 1913. Br Med J (Clin Res Ed) (1984) 5.79

Sequence determinants of cytosolic N-terminal protein processing. Eur J Biochem (1986) 5.59

Distribution of adipose tissue and risk of cardiovascular disease and death: a 12 year follow up of participants in the population study of women in Gothenburg, Sweden. Br Med J (Clin Res Ed) (1984) 5.06

Fibrinogen as a risk factor for stroke and myocardial infarction. N Engl J Med (1984) 4.51

A conserved cleavage-site motif in chloroplast transit peptides. FEBS Lett (1990) 4.25

The influence of body fat distribution on the incidence of diabetes mellitus. 13.5 years of follow-up of the participants in the study of men born in 1913. Diabetes (1985) 4.23

Sequence differences between glycosylated and non-glycosylated Asn-X-Thr/Ser acceptor sites: implications for protein engineering. Protein Eng (1990) 4.17

How proteins adapt to a membrane-water interface. Trends Biochem Sci (2000) 3.76

NIFAS: visual analysis of domain evolution in proteins. Bioinformatics (2001) 3.60

Cardiac and pulmonary causes of dyspnoea--validation of a scoring test for clinical-epidemiological use: the Study of Men Born in 1913. Eur Heart J (1987) 3.15

Trans-membrane translocation of proteins. The direct transfer model. Eur J Biochem (1979) 3.13

Determination of the distance between the oligosaccharyltransferase active site and the endoplasmic reticulum membrane. J Biol Chem (1993) 3.02

Predicting the topology of eukaryotic membrane proteins. Eur J Biochem (1993) 2.73

YidC, the Escherichia coli homologue of mitochondrial Oxa1p, is a component of the Sec translocase. EMBO J (2000) 2.70

Membrane proteins: the amino acid composition of membrane-penetrating segments. Eur J Biochem (1981) 2.52

Analysis of the distribution of charged residues in the N-terminal region of signal sequences: implications for protein export in prokaryotic and eukaryotic cells. EMBO J (1984) 2.43

Cleavage-site motifs in mitochondrial targeting peptides. Protein Eng (1990) 2.38

The Escherichia coli SRP and SecB targeting pathways converge at the translocon. EMBO J (1998) 2.35

A receptor component of the chloroplast protein translocation machinery. Science (1994) 2.34

Green fluorescent protein as an indicator to monitor membrane protein overexpression in Escherichia coli. FEBS Lett (2001) 2.28

Prospective study of social influences on mortality. The study of men born in 1913 and 1923. Lancet (1985) 2.26

Fine-tuning the topology of a polytopic membrane protein: role of positively and negatively charged amino acids. Cell (1990) 2.24

The health consequences of moderate obesity. Int J Obes (1981) 2.21

Analysis of risk factors for stroke in a cohort of men born in 1913. N Engl J Med (1987) 2.08

Sequence of the human immunoglobulin diversity (D) segment locus: a systematic analysis provides no evidence for the use of DIR segments, inverted D segments, "minor" D segments or D-D recombination. J Mol Biol (1997) 2.03

On the hydrophobic nature of signal sequences. Eur J Biochem (1981) 2.01

FAT: a novel domain in PIK-related kinases. Trends Biochem Sci (2000) 1.96

Competition between Sec- and TAT-dependent protein translocation in Escherichia coli. EMBO J (1999) 1.95

Prediction of organellar targeting signals. Biochim Biophys Acta (2001) 1.93

Structures of N-terminally acetylated proteins. Eur J Biochem (1985) 1.92

Assembly of a cytoplasmic membrane protein in Escherichia coli is dependent on the signal recognition particle. FEBS Lett (1996) 1.89

Topology, subcellular localization, and sequence diversity of the Mlo family in plants. J Biol Chem (1999) 1.88

A new concept for isokinetic hamstring: quadriceps muscle strength ratio. Am J Sports Med (1998) 1.85

Nascent membrane and presecretory proteins synthesized in Escherichia coli associate with signal recognition particle and trigger factor. Mol Microbiol (1997) 1.85

Net N-C charge imbalance may be important for signal sequence function in bacteria. J Mol Biol (1986) 1.85

Topological rules for membrane protein assembly in eukaryotic cells. J Biol Chem (1997) 1.83

Amino acid distributions around O-linked glycosylation sites. Biochem J (1991) 1.82

Mortality associated with body fat, fat-free mass and body mass index among 60-year-old Swedish men--a 22-year follow-up. The study of men born in 1913. Int J Obes Relat Metab Disord (2000) 1.78

Anionic phospholipids are determinants of membrane protein topology. EMBO J (1997) 1.77

Swedish obese subjects (SOS)--an intervention study of obesity. Baseline evaluation of health and psychosocial functioning in the first 1743 subjects examined. Int J Obes Relat Metab Disord (1993) 1.75

Molecular mechanism of membrane protein integration into the endoplasmic reticulum. Cell (1997) 1.70

Differential use of the signal recognition particle translocase targeting pathway for inner membrane protein assembly in Escherichia coli. Proc Natl Acad Sci U S A (1998) 1.62

Self-esteem, depression and anxiety among Swedish children and adolescents on and off cancer treatment. Acta Paediatr (2000) 1.59

Topological "frustration" in multispanning E. coli inner membrane proteins. Cell (1994) 1.59

Swedish obese subjects (SOS). Recruitment for an intervention study and a selected description of the obese state. Int J Obes Relat Metab Disord (1992) 1.55

Consensus predictions of membrane protein topology. FEBS Lett (2000) 1.55

Membrane protein topology: effects of delta mu H+ on the translocation of charged residues explain the 'positive inside' rule. EMBO J (1994) 1.53

Risk factors for type 2 (non-insulin-dependent) diabetes mellitus. Thirteen and one-half years of follow-up of the participants in a study of Swedish men born in 1913. Diabetologia (1988) 1.53

Translation rate modification by preferential codon usage: intragenic position effects. J Theor Biol (1987) 1.51

A nascent secretory protein may traverse the ribosome/endoplasmic reticulum translocase complex as an extended chain. J Biol Chem (1996) 1.51

The aromatic residues Trp and Phe have different effects on the positioning of a transmembrane helix in the microsomal membrane. Biochemistry (1999) 1.49

The role of human papillomavirus in cervical adenocarcinoma carcinogenesis. Eur J Cancer (2001) 1.48

Risk factors for heart failure in the general population: the study of men born in 1913. Eur Heart J (1989) 1.47

Towards a comparative anatomy of N-terminal topogenic protein sequences. J Mol Biol (1986) 1.46

Chloroplast transit peptides from the green alga Chlamydomonas reinhardtii share features with both mitochondrial and higher plant chloroplast presequences. FEBS Lett (1990) 1.45

Widespread eukaryotic sequences, highly similar to bacterial DNA polymerase I, looking for functions. Curr Biol (1997) 1.45

The glucose uptake of human adipose tissue in obesity. Eur J Clin Invest (1971) 1.44

Prevalence of headache in Swedish schoolchildren, with a focus on tension-type headache. Cephalalgia (2004) 1.44

A 30-residue-long "export initiation domain" adjacent to the signal sequence is critical for protein translocation across the inner membrane of Escherichia coli. Proc Natl Acad Sci U S A (1991) 1.43

The distribution of charged amino acids in mitochondrial inner-membrane proteins suggests different modes of membrane integration for nuclearly and mitochondrially encoded proteins. Eur J Biochem (1992) 1.42

Sec dependent and sec independent assembly of E. coli inner membrane proteins: the topological rules depend on chain length. EMBO J (1993) 1.41

Feature-extraction from endopeptidase cleavage sites in mitochondrial targeting peptides. Proteins (1998) 1.41

The 'positive-inside rule' applies to thylakoid membrane proteins. FEBS Lett (1991) 1.40

[A complication in Budd-Chiari syndrome. Heparin therapy caused thrombocytopenia]. Lakartidningen (1994) 1.39

The COOH-terminal ends of internal signal and signal-anchor sequences are positioned differently in the ER translocase. J Cell Biol (1994) 1.39

Architecture of helix bundle membrane proteins: an analysis of cytochrome c oxidase from bovine mitochondria. Protein Sci (1997) 1.38

Determination of the border between the transmembrane and cytoplasmic domains of human integrin subunits. J Biol Chem (1999) 1.37

A turn propensity scale for transmembrane helices. J Mol Biol (1999) 1.37

Signal sequences are not uniformly hydrophobic. J Mol Biol (1982) 1.37

Turns in transmembrane helices: determination of the minimal length of a "helical hairpin" and derivation of a fine-grained turn propensity scale. J Mol Biol (1999) 1.36

Life and death of a signal peptide. Nature (1998) 1.35

Stop-transfer function of pseudo-random amino acid segments during translocation across prokaryotic and eukaryotic membranes. Eur J Biochem (1998) 1.34

Different conformations of nascent polypeptides during translocation across the ER membrane. BMC Cell Biol (2000) 1.34

Comparison of oxybutynin and its active metabolite, N-desethyl-oxybutynin, in the human detrusor and parotid gland. J Urol (1997) 1.34

Social network and activities in relation to mortality from cardiovascular diseases, cancer and other causes: a 12 year follow up of the study of men born in 1913 and 1923. J Epidemiol Community Health (1992) 1.32

Defining a similarity threshold for a functional protein sequence pattern: the signal peptide cleavage site. Proteins (1996) 1.31

[456,000 Swedes may have urinary incontinence. Only every fourth person seeks help for the disorder]. Lakartidningen (1985) 1.29

Positively and negatively charged residues have different effects on the position in the membrane of a model transmembrane helix. J Mol Biol (1998) 1.27

CHD in Sweden: mortality, incidence and risk factors over 20 years in Gothenburg. Int J Epidemiol (1989) 1.26

Proline-induced disruption of a transmembrane alpha-helix in its natural environment. J Mol Biol (1998) 1.26

A 12-residue-long polyleucine tail is sufficient to anchor synaptobrevin to the endoplasmic reticulum membrane. J Biol Chem (1996) 1.25

Glycosylation efficiency of Asn-Xaa-Thr sequons depends both on the distance from the C terminus and on the presence of a downstream transmembrane segment. J Biol Chem (2000) 1.25