A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites.

PubWeight™: 7.40 | Rank: Top 0.1%

PMID: 10065837

Published in Int J Neural Syst on May 07, 1999

Authors

H Nielsen (1), J Engelbrecht, S Brunak, G von Heijne

Author Affiliations

1: Department of Biotechnology, The Technical University of Denmark, Lyngby.

Articles citing this

(truncated to the top 100)

The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res (2003) 24.72

The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res (2014) 6.00

The Ensembl analysis pipeline. Genome Res (2004) 5.90

Selection in the evolution of gene duplications. Genome Biol (2002) 5.58

Advantages of combined transmembrane topology and signal peptide prediction--the Phobius web server. Nucleic Acids Res (2007) 5.29

PSORT-B: Improving protein subcellular localization prediction for Gram-negative bacteria. Nucleic Acids Res (2003) 4.25

Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering. Genome Biol (2002) 4.24

ARAMEMNON, a novel database for Arabidopsis integral membrane proteins. Plant Physiol (2003) 4.15

Protein interaction mapping: a Drosophila case study. Genome Res (2005) 4.15

The genome sequence of the probiotic intestinal bacterium Lactobacillus johnsonii NCC 533. Proc Natl Acad Sci U S A (2004) 3.88

The complete genome of hyperthermophile Methanopyrus kandleri AV19 and monophyly of archaeal methanogens. Proc Natl Acad Sci U S A (2002) 3.72

The psychrophilic lifestyle as revealed by the genome sequence of Colwellia psychrerythraea 34H through genomic and proteomic analyses. Proc Natl Acad Sci U S A (2005) 3.10

Functional proteomics mapping of a human signaling pathway. Genome Res (2004) 3.04

Prediction of twin-arginine signal peptides. BMC Bioinformatics (2005) 3.03

Evolutionary history, structural features and biochemical diversity of the NlpC/P60 superfamily of enzymes. Genome Biol (2003) 3.02

Adaptive divergence in experimental populations of Pseudomonas fluorescens. I. Genetic and phenotypic bases of wrinkly spreader fitness. Genetics (2002) 2.91

PhosphoPep--a phosphoproteome resource for systems biology research in Drosophila Kc167 cells. Mol Syst Biol (2007) 2.56

Central functions of the lumenal and peripheral thylakoid proteome of Arabidopsis determined by experimentation and genome-wide prediction. Plant Cell (2002) 2.48

The genome sequence of Mycoplasma mycoides subsp. mycoides SC type strain PG1T, the causative agent of contagious bovine pleuropneumonia (CBPP). Genome Res (2004) 2.33

The genome of Sulfolobus acidocaldarius, a model organism of the Crenarchaeota. J Bacteriol (2005) 2.31

Transmembrane helix predictions revisited. Protein Sci (2002) 2.28

Mineralized tissue and vertebrate evolution: the secretory calcium-binding phosphoprotein gene cluster. Proc Natl Acad Sci U S A (2003) 2.27

Identification of neuropeptide-like protein gene families in Caenorhabditis elegans and other species. Proc Natl Acad Sci U S A (2001) 2.19

A model for carbohydrate metabolism in the diatom Phaeodactylum tricornutum deduced from comparative whole genome analysis. PLoS One (2008) 2.17

The Chlamydophila abortus genome sequence reveals an array of variable proteins that contribute to interspecies variation. Genome Res (2005) 2.12

Comparative genomic analysis of the thermophilic biomass-degrading fungi Myceliophthora thermophila and Thielavia terrestris. Nat Biotechnol (2011) 2.11

A comparative genome analysis identifies distinct sorting pathways in gram-positive bacteria. Infect Immun (2004) 2.04

Deep-sea vent epsilon-proteobacterial genomes provide insights into emergence of pathogens. Proc Natl Acad Sci U S A (2007) 2.00

Evolutionary expressed sequence tag analysis of Drosophila female reproductive tracts identifies genes subjected to positive selection. Genetics (2004) 1.98

Signal peptide prediction based on analysis of experimentally verified cleavage sites. Protein Sci (2004) 1.80

Genome-wide detection and analysis of cell wall-bound proteins with LPxTG-like sorting motifs. J Bacteriol (2005) 1.75

pyramus and thisbe: FGF genes that pattern the mesoderm of Drosophila embryos. Genes Dev (2004) 1.73

Identification of a novel human nuclear-encoded mitochondrial poly(A) polymerase. Nucleic Acids Res (2004) 1.73

Glycosylphosphatidylinositol lipid anchoring of plant proteins. Sensitive prediction from sequence- and genome-wide studies for Arabidopsis and rice. Plant Physiol (2003) 1.72

The genome of the heartwater agent Ehrlichia ruminantium contains multiple tandem repeats of actively variable copy number. Proc Natl Acad Sci U S A (2005) 1.72

A Phytophthora infestans cystatin-like protein targets a novel tomato papain-like apoplastic protease. Plant Physiol (2006) 1.67

EchoBASE: an integrated post-genomic database for Escherichia coli. Nucleic Acids Res (2005) 1.67

Helicobacter pylori exploits a unique repertoire of type IV secretion system components for pilus assembly at the bacteria-host cell interface. PLoS Pathog (2011) 1.67

Lactobacillus plantarum gene clusters encoding putative cell-surface protein complexes for carbohydrate utilization are conserved in specific gram-positive bacteria. BMC Genomics (2006) 1.65

K15 protein of Kaposi's sarcoma-associated herpesvirus is latently expressed and binds to HAX-1, a protein with antiapoptotic function. J Virol (2002) 1.65

Arsenite oxidase aox genes from a metal-resistant beta-proteobacterium. J Bacteriol (2003) 1.61

Comparative salivary gland transcriptomics of sandfly vectors of visceral leishmaniasis. BMC Genomics (2006) 1.60

Identification of the iron-responsive genes of Neisseria gonorrhoeae by microarray analysis in defined medium. J Bacteriol (2005) 1.58

Application of comparative genomics in the identification and analysis of novel families of membrane-associated receptors in bacteria. BMC Genomics (2003) 1.56

Genome-scale genotype-phenotype matching of two Lactococcus lactis isolates from plants identifies mechanisms of adaptation to the plant niche. Appl Environ Microbiol (2007) 1.55

Cytomegalovirus-encoded beta chemokine promotes monocyte-associated viremia in the host. Proc Natl Acad Sci U S A (1999) 1.53

Wolbachia interferes with ferritin expression and iron metabolism in insects. PLoS Pathog (2009) 1.49

The GOLD domain, a novel protein module involved in Golgi function and secretion. Genome Biol (2002) 1.48

A single intramuscular injection of recombinant plasmid DNA induces protective immunity and prevents Japanese encephalitis in mice. J Virol (2000) 1.46

The signaling helix: a common functional theme in diverse signaling proteins. Biol Direct (2006) 1.46

DBSubLoc: database of protein subcellular localization. Nucleic Acids Res (2004) 1.43

UniPep--a database for human N-linked glycosites: a resource for biomarker discovery. Genome Biol (2006) 1.43

Phytome: a platform for plant comparative genomics. Nucleic Acids Res (2006) 1.35

Complete genome sequence, lifestyle, and multi-drug resistance of the human pathogen Corynebacterium resistens DSM 45100 isolated from blood samples of a leukemia patient. BMC Genomics (2012) 1.35

Genome sequences of two closely related Vibrio parahaemolyticus phages, VP16T and VP16C. J Bacteriol (2003) 1.32

Quod erat demonstrandum? The mystery of experimental validation of apparently erroneous computational analyses of protein sequences. Genome Biol (2001) 1.30

Export pathway selectivity of Escherichia coli twin arginine translocation signal peptides. J Biol Chem (2007) 1.28

An approach for identifying cytokines based on a novel ensemble classifier. Biomed Res Int (2013) 1.27

Complete genomic sequence of the human ABCA1 gene: analysis of the human and mouse ATP-binding cassette A promoter. Proc Natl Acad Sci U S A (2000) 1.26

More than 1,001 problems with protein domain databases: transmembrane regions, signal peptides and the issue of sequence homology. PLoS Comput Biol (2010) 1.26

Search for potential vaccine candidate open reading frames in the Bacillus anthracis virulence plasmid pXO1: in silico and in vitro screening. Infect Immun (2002) 1.26

Utility of the Trypanosoma cruzi sequence database for identification of potential vaccine candidates by in silico and in vitro screening. Infect Immun (2004) 1.25

Whole-genome comparison of leucine-rich repeat extensins in Arabidopsis and rice. A conserved family of cell wall proteins form a vegetative and a reproductive clade. Plant Physiol (2003) 1.24

Signal peptide cleavage of a type I membrane protein, HCMV US11, is dependent on its membrane anchor. EMBO J (2001) 1.24

EBR-1, a novel Ambler subclass B1 beta-lactamase from Empedobacter brevis. Antimicrob Agents Chemother (2002) 1.24

Complete genome sequence of Corynebacterium variabile DSM 44702 isolated from the surface of smear-ripened cheeses and insights into cheese ripening and flavor generation. BMC Genomics (2011) 1.23

Purification and characterization of enzymes exhibiting beta-D-xylosidase activities in stem tissues of Arabidopsis. Plant Physiol (2004) 1.22

Mutations in a novel CLN6-encoded transmembrane protein cause variant neuronal ceroid lipofuscinosis in man and mouse. Am J Hum Genet (2001) 1.21

A gene cluster for chlorate metabolism in Ideonella dechloratans. Appl Environ Microbiol (2003) 1.21

PredSL: a tool for the N-terminal sequence-based prediction of protein subcellular localization. Genomics Proteomics Bioinformatics (2006) 1.20

Juvenile hormone regulation of male accessory gland activity in the red flour beetle, Tribolium castaneum. Mech Dev (2009) 1.19

Molecular evolution and population genetic analysis of candidate female reproductive genes in Drosophila. Genetics (2006) 1.19

Secreted protein prediction system combining CJ-SPHMM, TMHMM, and PSORT. Mamm Genome (2003) 1.19

Gene expression profiles of microdissected pancreatic ductal adenocarcinoma. Virchows Arch (2003) 1.18

MFP1 is a thylakoid-associated, nucleoid-binding protein with a coiled-coil structure. Nucleic Acids Res (2003) 1.18

Ac23, an envelope fusion protein homolog in the baculovirus Autographa californica multicapsid nucleopolyhedrovirus, is a viral pathogenicity factor. J Virol (2003) 1.17

Differential expression of anterior gradient gene AGR2 in prostate cancer. BMC Cancer (2010) 1.16

Expressed sequence tags from the oomycete fish pathogen Saprolegnia parasitica reveal putative virulence factors. BMC Microbiol (2005) 1.15

The genome sequence of the fish pathogen Aliivibrio salmonicida strain LFI1238 shows extensive evidence of gene decay. BMC Genomics (2008) 1.14

The HtrA protease of Campylobacter jejuni is required for heat and oxygen tolerance and for optimal interaction with human epithelial cells. Appl Environ Microbiol (2005) 1.14

Protein-protein interactions among Helicobacter pylori cag proteins. J Bacteriol (2006) 1.14

Anatomical and physiological evidence for involvement of tuberoinfundibular peptide of 39 residues in nociception. Proc Natl Acad Sci U S A (2002) 1.14

Insight into the sialome of the castor bean tick, Ixodes ricinus. BMC Genomics (2008) 1.13

OsTDL1A binds to the LRR domain of rice receptor kinase MSP1, and is required to limit sporocyte numbers. Plant J (2008) 1.12

The Pleurotus ostreatus laccase multi-gene family: isolation and heterologous expression of new family members. Curr Genet (2008) 1.11

Whole genome identification of Mycobacterium tuberculosis vaccine candidates by comprehensive data mining and bioinformatic analyses. BMC Med Genomics (2008) 1.10

Origin and evolution of a chimeric fusion gene in Drosophila subobscura, D. madeirensis and D. guanche. Genetics (2005) 1.10

Analysis of curated and predicted plastid subproteomes of Arabidopsis. Subcellular compartmentalization leads to distinctive proteome properties. Plant Physiol (2004) 1.09

Properties of a novel intracellular poly(3-hydroxybutyrate) depolymerase with high specific activity (PhaZd) in Wautersia eutropha H16. J Bacteriol (2005) 1.09

Comparative genomics uncovers novel structural and functional features of the heterotrimeric GTPase signaling system. Gene (2010) 1.08

De novo sequence assembly of Albugo candida reveals a small genome relative to other biotrophic oomycetes. BMC Genomics (2011) 1.08

Assessing the precision of high-throughput computational and laboratory approaches for the genome-wide identification of protein subcellular localization in bacteria. BMC Genomics (2005) 1.08

Characterization of an endoglucanase belonging to a new subfamily of glycoside hydrolase family 45 of the basidiomycete Phanerochaete chrysosporium. Appl Environ Microbiol (2008) 1.08

Chloroplast transit peptide prediction: a peek inside the black box. Nucleic Acids Res (2001) 1.07

Identification of candidates for a subunit vaccine against extraintestinal pathogenic Escherichia coli. Infect Immun (2006) 1.07

Neuronal ceroid lipofuscinoses are connected at molecular level: interaction of CLN5 protein with CLN2 and CLN3. Mol Biol Cell (2002) 1.05

Chlamydomonas reinhardtii has multiple prolyl 4-hydroxylases, one of which is essential for proper cell wall assembly. Plant Cell (2007) 1.04

Genetic and biochemical characterization of CGB-1, an Ambler class B carbapenem-hydrolyzing beta-lactamase from Chryseobacterium gleum. Antimicrob Agents Chemother (2002) 1.03

vLIP, a viral lipase homologue, is a virulence factor of Marek's disease virus. J Virol (2005) 1.03

Signal peptidase I: cleaving the way to mature proteins. Protein Sci (2011) 1.03

Articles by these authors

Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol (2001) 66.87

Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng (1997) 38.38

A new method for predicting signal sequence cleavage sites. Nucleic Acids Res (1986) 37.19

Patterns of amino acids near signal-sequence cleavage sites. Eur J Biochem (1983) 26.68

Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol (2000) 22.77

Signal sequences. The limits of variation. J Mol Biol (1985) 21.24

Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol (1999) 15.63

A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol (1998) 14.18

Multiple alignment using simulated annealing: branch point definition in human mRNA splicing. Nucleic Acids Res (1992) 12.08

Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics (2000) 11.75

How signal sequences maintain cleavage specificity. J Mol Biol (1984) 11.03

TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci (1994) 10.03

ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Protein Sci (1999) 9.82

Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Eng (1997) 8.25

Prediction of human mRNA donor and acceptor sites from the DNA sequence. J Mol Biol (1991) 7.93

Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci (1998) 7.88

Mitochondrial targeting sequences may form amphiphilic helices. EMBO J (1986) 7.11

Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng (1999) 6.72

Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res (1996) 6.13

A DNA structural atlas for Escherichia coli. J Mol Biol (2000) 5.72

Sequence determinants of cytosolic N-terminal protein processing. Eur J Biochem (1986) 5.59

Displaying the information contents of structural RNA alignments: the structure logos. Comput Appl Biosci (1997) 5.51

A conserved cleavage-site motif in chloroplast transit peptides. FEBS Lett (1990) 4.25

Sequence differences between glycosylated and non-glycosylated Asn-X-Thr/Ser acceptor sites: implications for protein engineering. Protein Eng (1990) 4.17

How proteins adapt to a membrane-water interface. Trends Biochem Sci (2000) 3.76

NetOglyc: prediction of mucin type O-glycosylation sites based on sequence context and surface accessibility. Glycoconj J (1998) 3.19

Trans-membrane translocation of proteins. The direct transfer model. Eur J Biochem (1979) 3.13

Determination of the distance between the oligosaccharyltransferase active site and the endoplasmic reticulum membrane. J Biol Chem (1993) 3.02

Sensitive quantitative predictions of peptide-MHC binding by a 'Query by Committee' artificial neural network approach. Tissue Antigens (2003) 2.93

On the total number of genes and their length distribution in complete microbial genomes. Trends Genet (2001) 2.80

Predicting the topology of eukaryotic membrane proteins. Eur J Biochem (1993) 2.73

YidC, the Escherichia coli homologue of mitochondrial Oxa1p, is a component of the Sec translocase. EMBO J (2000) 2.70

PhosphoBase, a database of phosphorylation sites: release 2.0. Nucleic Acids Res (1999) 2.61

Prediction of human protein function according to Gene Ontology categories. Bioinformatics (2003) 2.54

Membrane proteins: the amino acid composition of membrane-penetrating segments. Eur J Biochem (1981) 2.52

Analysis of the distribution of charged residues in the N-terminal region of signal sequences: implications for protein export in prokaryotic and eukaryotic cells. EMBO J (1984) 2.43

Cleavage-site motifs in mitochondrial targeting peptides. Protein Eng (1990) 2.38

The Escherichia coli SRP and SecB targeting pathways converge at the translocon. EMBO J (1998) 2.35

A receptor component of the chloroplast protein translocation machinery. Science (1994) 2.34

Green fluorescent protein as an indicator to monitor membrane protein overexpression in Escherichia coli. FEBS Lett (2001) 2.28

Fine-tuning the topology of a polytopic membrane protein: role of positively and negatively charged amino acids. Cell (1990) 2.24

env sequences of simian immunodeficiency viruses from chimpanzees in Cameroon are strongly related to those of human immunodeficiency virus group N from the same geographic area. J Virol (2000) 2.17

Cleavage site analysis in picornaviral polyproteins: discovering cellular targets by neural networks. Protein Sci (1996) 2.12

Protein distance constraints predicted by neural networks and probability density functions. Protein Eng (1997) 2.07

Prediction of human protein function from post-translational modifications and localization features. J Mol Biol (2002) 2.05

On the hydrophobic nature of signal sequences. Eur J Biochem (1981) 2.01

Competition between Sec- and TAT-dependent protein translocation in Escherichia coli. EMBO J (1999) 1.95

The biology of eukaryotic promoter prediction--a review. Comput Chem (1999) 1.93

Prediction of organellar targeting signals. Biochim Biophys Acta (2001) 1.93

Structures of N-terminally acetylated proteins. Eur J Biochem (1985) 1.92

Assembly of a cytoplasmic membrane protein in Escherichia coli is dependent on the signal recognition particle. FEBS Lett (1996) 1.89

Topology, subcellular localization, and sequence diversity of the Mlo family in plants. J Biol Chem (1999) 1.88

Nascent membrane and presecretory proteins synthesized in Escherichia coli associate with signal recognition particle and trigger factor. Mol Microbiol (1997) 1.85

Net N-C charge imbalance may be important for signal sequence function in bacteria. J Mol Biol (1986) 1.85

Topological rules for membrane protein assembly in eukaryotic cells. J Biol Chem (1997) 1.83

Amino acid distributions around O-linked glycosylation sites. Biochem J (1991) 1.82

Exploiting the past and the future in protein secondary structure prediction. Bioinformatics (1999) 1.82

O-GLYCBASE version 4.0: a revised database of O-glycosylated proteins. Nucleic Acids Res (1999) 1.81

Prediction of O-glycosylation of mammalian proteins: specificity patterns of UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase. Biochem J (1995) 1.81

Anionic phospholipids are determinants of membrane protein topology. EMBO J (1997) 1.77

Molecular mechanism of membrane protein integration into the endoplasmic reticulum. Cell (1997) 1.70

Generating genome-scale candidate gene lists for pharmacogenomics. Clin Pharmacol Ther (2009) 1.66

Prediction of protein secondary structure at 80% accuracy. Proteins (2000) 1.65

Kissing loops hide premature termination codons in pre-mRNA of selenoprotein genes and in genes containing programmed ribosomal frameshifts. RNA (1997) 1.64

Differential use of the signal recognition particle translocase targeting pathway for inner membrane protein assembly in Escherichia coli. Proc Natl Acad Sci U S A (1998) 1.62

Cleaning the GenBank Arabidopsis thaliana data set. Nucleic Acids Res (1996) 1.62

Topological "frustration" in multispanning E. coli inner membrane proteins. Cell (1994) 1.59

G+C-rich tract in 5' end of human introns. J Mol Biol (1992) 1.57

Consensus predictions of membrane protein topology. FEBS Lett (2000) 1.55

Membrane protein topology: effects of delta mu H+ on the translocation of charged residues explain the 'positive inside' rule. EMBO J (1994) 1.53

Statistical analysis of protein kinase specificity determinants. FEBS Lett (1998) 1.53

SARS CTL vaccine candidates; HLA supertype-, genome-wide scanning and biochemical validation. Tissue Antigens (2004) 1.52

Translation rate modification by preferential codon usage: intragenic position effects. J Theor Biol (1987) 1.51

A nascent secretory protein may traverse the ribosome/endoplasmic reticulum translocase complex as an extended chain. J Biol Chem (1996) 1.51

Protein secondary structure and homology by neural networks. The alpha-helices in rhodopsin. FEBS Lett (1988) 1.50

The aromatic residues Trp and Phe have different effects on the positioning of a transmembrane helix in the microsomal membrane. Biochemistry (1999) 1.49

DNA structure in human RNA polymerase II promoters. J Mol Biol (1998) 1.46

Towards a comparative anatomy of N-terminal topogenic protein sequences. J Mol Biol (1986) 1.46

Chloroplast transit peptides from the green alga Chlamydomonas reinhardtii share features with both mitochondrial and higher plant chloroplast presequences. FEBS Lett (1990) 1.45

A 30-residue-long "export initiation domain" adjacent to the signal sequence is critical for protein translocation across the inner membrane of Escherichia coli. Proc Natl Acad Sci U S A (1991) 1.43

The distribution of charged amino acids in mitochondrial inner-membrane proteins suggests different modes of membrane integration for nuclearly and mitochondrially encoded proteins. Eur J Biochem (1992) 1.42

Sec dependent and sec independent assembly of E. coli inner membrane proteins: the topological rules depend on chain length. EMBO J (1993) 1.41

Analysis of the secondary structure of the human immunodeficiency virus (HIV) proteins p17, gp120, and gp41 by computer modeling based on neural network methods. J Acquir Immune Defic Syndr (1990) 1.41

Feature-extraction from endopeptidase cleavage sites in mitochondrial targeting peptides. Proteins (1998) 1.41

The 'positive-inside rule' applies to thylakoid membrane proteins. FEBS Lett (1991) 1.40

The COOH-terminal ends of internal signal and signal-anchor sequences are positioned differently in the ER translocase. J Cell Biol (1994) 1.39

Architecture of helix bundle membrane proteins: an analysis of cytochrome c oxidase from bovine mitochondria. Protein Sci (1997) 1.38

MatrixPlot: visualizing sequence constraints. Bioinformatics (1999) 1.38

Naturally occurring nucleosome positioning signals in human exons and introns. J Mol Biol (1996) 1.38

Genome organisation and chromatin structure in Escherichia coli. Biochimie (2001) 1.37

Determination of the border between the transmembrane and cytoplasmic domains of human integrin subunits. J Biol Chem (1999) 1.37

A turn propensity scale for transmembrane helices. J Mol Biol (1999) 1.37

Signal sequences are not uniformly hydrophobic. J Mol Biol (1982) 1.37

Turns in transmembrane helices: determination of the minimal length of a "helical hairpin" and derivation of a fine-grained turn propensity scale. J Mol Biol (1999) 1.36