RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data.

PubWeight™: 1.42 | Rank: Top 5%

Article: PMC 3062170

Published in RNA on February 28, 2011

Authors

Stefan Washietl¹, Sven Findeiss, Stephan A Müller, Stefan Kalkhof, Martin von Bergen, Ivo L Hofacker, Peter F Stadler, Nick Goldman

Author Affiliations

1: EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom. wash@mit.edu

Articles citing this

lincRNAs: genomics, evolution, and mechanisms. Cell (2013) 6.50

Long non-coding RNAs and cancer: a new frontier of translational research? Oncogene (2012) 3.83

CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model. Nucleic Acids Res (2013) 2.88

Evolutionary dynamics and tissue specificity of human long noncoding RNAs in six mammals. Genome Res (2014) 1.69

Long noncoding RNAs in C. elegans. Genome Res (2012) 1.69

An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions. Nat Genet (2013) 1.68

Genome-wide transcriptome analysis of the plant pathogen Xanthomonas identifies sRNAs with putative virulence functions. Nucleic Acids Res (2011) 1.25

The developmental transcriptome of the mosquito Aedes aegypti, an invasive species and major arbovirus vector. G3 (Bethesda) (2013) 1.10

Computational analysis of conserved RNA secondary structure in transcriptomes and genomes. Annu Rev Biophys (2014) 1.08

Computational analysis of noncoding RNAs. Wiley Interdiscip Rev RNA (2012) 1.01

Molecular mechanisms and function prediction of long noncoding RNA. ScientificWorldJournal (2012) 0.98

The Escherichia coli CydX protein is a member of the CydAB cytochrome bd oxidase complex and is required for cytochrome bd oxidase activity. J Bacteriol (2013) 0.97

The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101. Sci Rep (2014) 0.94

Long non-coding RNAs differentially expressed between normal versus primary breast tumor tissues disclose converse changes to breast cancer-related protein-coding genes. PLoS One (2014) 0.92

Investigating endogenous peptides and peptidases using peptidomics. Biochemistry (2011) 0.91

Evolutionary analysis across mammals reveals distinct classes of long non-coding RNAs. Genome Biol (2016) 0.90

Cell cycle, oncogenic and tumor suppressor pathways regulate numerous long and macro non-protein-coding RNAs. Genome Biol (2014) 0.88

Non-coding RNAs in marine Synechococcus and their regulation under environmentally relevant stress conditions. ISME J (2012) 0.88

Robust identification of noncoding RNA from transcriptomes requires phylogenetically-informed sampling. PLoS Comput Biol (2014) 0.88

Molecular Functions of Long Non-Coding RNAs in Plants. Genes (Basel) (2012) 0.88

Optimization of parameters for coverage of low molecular weight proteins. Anal Bioanal Chem (2010) 0.87

The Streptococcus suis transcriptional landscape reveals adaptation mechanisms in pig blood and cerebrospinal fluid. RNA (2014) 0.87

Dissecting the genetics of the human transcriptome identifies novel trait-related trans-eQTLs and corroborates the regulatory relevance of non-protein coding loci. Hum Mol Genet (2015) 0.86

Comparison of splice sites reveals that long noncoding RNAs are evolutionarily well conserved. RNA (2015) 0.85

Noncoding RNA and colorectal cancer: its epigenetic role. J Hum Genet (2016) 0.84

Computational methods to detect conserved non-genic elements in phylogenetically isolated genomes: application to zebrafish. Nucleic Acids Res (2013) 0.83

Identification of non-coding RNAs with a new composite feature in the Hybrid Random Forest Ensemble algorithm. Nucleic Acids Res (2014) 0.80

RNAlien - Unsupervised RNA family model construction. Nucleic Acids Res (2016) 0.78

Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain. BMC Genomics (2012) 0.78

A common set of distinct features that characterize noncoding RNAs across multiple species. Nucleic Acids Res (2014) 0.78

Long non-coding RNAs display higher natural expression variation than protein-coding genes in healthy humans. Genome Biol (2016) 0.77

LncRNAs in vertebrates: advances and challenges. Biochimie (2015) 0.77

Bioinformatics of prokaryotic RNAs. RNA Biol (2014) 0.77

LncRNApred: Classification of Long Non-Coding RNAs and Protein-Coding Transcripts by the Ensemble Algorithm with a New Hybrid Feature. PLoS One (2016) 0.76

COME: a robust coding potential calculation tool for lncRNA identification and characterization based on multiple features. Nucleic Acids Res (2016) 0.76

A Fleeting Glimpse Inside microRNA, Epigenetics, and Micropeptidomics. Adv Exp Med Biol (2015) 0.76

Evolution of the unspliced transcriptome. BMC Evol Biol (2015) 0.76

Comparative Expression Dynamics of Intergenic Long Noncoding RNAs in the Genus Drosophila. Genome Biol Evol (2016) 0.76

LncRNAs: From Basic Research to Medical Application. Int J Biol Sci (2017) 0.75

Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation. Nucleic Acids Res (2016) 0.75

Identification and expression patterns of novel long non-coding RNAs in neural progenitors of the developing mammalian cortex. Neurogenesis (Austin) (2015) 0.75

Differentiation of ncRNAs from small mRNAs in Escherichia coli O157:H7 EDL933 (EHEC) by combined RNAseq and RIBOseq - ryhB encodes the regulatory RNA RyhB and a peptide, RyhP. BMC Genomics (2017) 0.75

The Transcriptome of Exophiala dermatitidis during Ex-vivo Skin Model Infection. Front Cell Infect Microbiol (2016) 0.75

Computational Detection of piRNA in Human Using Support Vector Machine. Avicenna J Med Biotechnol (2016) 0.75

A Genomic Analysis of Factors Driving lincRNA Diversification: Lessons from Plants. G3 (Bethesda) (2016) 0.75

Identification of Conserved and Potentially Regulatory Small RNAs in Heterocystous Cyanobacteria. Front Microbiol (2016) 0.75

Identification of long non-coding transcripts with feature selection: a comparative study. BMC Bioinformatics (2017) 0.75

Small proteins in cyanobacteria provide a paradigm for the functional analysis of the bacterial micro-proteome. BMC Microbiol (2016) 0.75

Identification of 15 candidate structured noncoding RNA motifs in fungi by comparative genomics. BMC Genomics (2017) 0.75

Evolinc: A Tool for the Identification and Evolutionary Comparison of Long Intergenic Non-coding RNAs. Front Genet (2017) 0.75

Coding sequence density estimation via topological pressure. J Math Biol (2014) 0.75

Articles cited by this

A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol (2003) 102.57

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (2007) 75.09

EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet (2000) 69.26

Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A (1992) 61.33

Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol (1985) 47.78

Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis (1999) 45.01

Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res (2005) 44.08

The transcriptional landscape of the mammalian genome. Science (2005) 37.63

Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal Chem (2002) 35.30

A statistical model for identifying proteins by tandem mass spectrometry. Anal Chem (2003) 29.58

Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature (2003) 29.16

Mass spectrometry-based proteomics. Nature (2003) 28.94

Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res (2004) 24.52

The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res (2009) 19.70

TANDEM: matching proteins with tandem mass spectra. Bioinformatics (2004) 17.41

Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput Appl Biosci (1997) 17.32

Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol (2000) 13.01

CRITICA: coding region identification tool invoking comparative analysis. Mol Biol Evol (1999) 12.64

A high-resolution map of transcription in the yeast genome. Proc Natl Acad Sci U S A (2006) 11.81

Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature (2007) 11.66

Rfam: updates to the RNA families database. Nucleic Acids Res (2008) 11.61

The UCSC Genome Browser Database: update 2009. Nucleic Acids Res (2008) 10.31

Method to correlate tandem mass spectra of modified peptides to amino acid sequences in the protein database. Anal Chem (1995) 9.39

Fast and reliable prediction of noncoding RNAs. Proc Natl Acad Sci U S A (2005) 8.15

Distinguishing protein-coding and noncoding genes in the human genome. Proc Natl Acad Sci U S A (2007) 8.00

Tricine-SDS-PAGE. Nat Protoc (2006) 7.89

Finding the genes in genomic DNA. Curr Opin Struct Biol (1998) 7.31

Noncoding RNA gene detection using comparative sequence analysis. BMC Bioinformatics (2001) 7.07

EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biol (2006) 7.06

Differentiating protein-coding and noncoding RNA: challenges and ambiguities. PLoS Comput Biol (2008) 3.66

Small peptides switch the transcriptional activity of Shavenbaby during Drosophila embryogenesis. Science (2010) 3.62

Small peptide regulators of actin-based cell morphogenesis encoded by a polycistronic mRNA. Nat Cell Biol (2007) 3.04

FlyBase: a database for the Drosophila research community. Methods Mol Biol (2008) 3.02

Staphylococcus aureus RNAIII coordinately represses the synthesis of virulence factors and the transcription regulator Rot by an antisense mechanism. Genes Dev (2007) 2.86

The transcription unit architecture of the Escherichia coli genome. Nat Biotechnol (2009) 2.85

Metatranscriptomics reveals unique microbial small RNAs in the ocean's water column. Nature (2009) 2.72

Performance and scalability of discriminative metrics for comparative gene identification in 12 Drosophila genomes. PLoS Comput Biol (2008) 2.70

Structured RNAs in the ENCODE selected regions of the human genome. Genome Res (2007) 2.69

The UCSC Archaeal Genome Browser. Nucleic Acids Res (2006) 2.65

Peptides encoded by short ORFs control development and define a new eukaryotic gene family. PLoS Biol (2007) 2.54

Using multiple alignments to improve gene prediction. J Comput Biol (2006) 2.48

Transcriptome analysis of Escherichia coli using high-density oligonucleotide probe arrays. Nucleic Acids Res (2002) 2.40

Methods in comparative genomics: genome correspondence, gene identification and regulatory motif discovery. J Comput Biol (2004) 2.20

Steady progress and recent breakthroughs in the accuracy of automated genome annotation. Nat Rev Genet (2008) 2.08

Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics (1998) 2.05

CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction. Genome Biol (2007) 2.01

RNAz 2.0: improved noncoding RNA detection. Pac Symp Biocomput (2010) 1.87

Small membrane proteins found by comparative genomics and ribosome binding site models. Mol Microbiol (2008) 1.84

Discrimination of non-protein-coding transcripts from protein-coding mRNA. RNA Biol (2006) 1.82

nGASP--the nematode genome annotation assessment project. BMC Bioinformatics (2008) 1.52

The small untranslated RNA SR1 from the Bacillus subtilis genome is involved in the regulation of arginine catabolism. Mol Microbiol (2006) 1.38

Genome-wide discovery and verification of novel structured RNAs in Plasmodium falciparum. Genome Res (2007) 1.36

In vitro analysis of the interaction between the small RNA SR1 and its primary target ahrC mRNA. Nucleic Acids Res (2007) 1.36

Variation in evolutionary processes at different codon positions. Mol Biol Evol (2006) 1.26

Dinucleotide controlled null models for comparative RNA gene prediction. BMC Bioinformatics (2008) 1.23

Identification of candidate structured RNAs in the marine organism 'Candidatus Pelagibacter ubique'. BMC Genomics (2009) 1.21

Computational identification of protein coding potential of conserved sequence tags through cross-species evolutionary analysis. Nucleic Acids Res (2003) 1.21

Implication of CcpN in the regulation of a novel untranslated RNA (SR1) in Bacillus subtilis. Mol Microbiol (2005) 1.07

The low molecular weight proteome of Halobacterium salinarum. J Proteome Res (2007) 1.03

A dual-function sRNA from B. subtilis: SR1 acts as a peptide encoding mRNA on the gapA operon. Mol Microbiol (2010) 1.02

ETOPE: Evolutionary test of predicted exons. Nucleic Acids Res (2003) 1.00

Low-molecular-weight human serum proteome using ultrafiltration, isoelectric focusing, and mass spectrometry. Electrophoresis (2004) 0.99

Detection and identification of low-mass peptides and proteins from solvent suspensions of Escherichia coli by high performance liquid chromatography fractionation and matrix-assisted laser desorption/ionization mass spectrometry. Rapid Commun Mass Spectrom (1999) 0.94

Optimization of parameters for coverage of low molecular weight proteins. Anal Bioanal Chem (2010) 0.87

Gene prediction: compare and CONTRAST. Genome Biol (2007) 0.85

Molecular biology. Hiding in plain sight. Science (2010) 0.78

Articles by these authors

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (2007) 75.09

RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science (2007) 18.59

Evolution of genes and genomes on the Drosophila phylogeny. Nature (2007) 18.01

Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res (2003) 11.12

ViennaRNA Package 2.0. Algorithms Mol Biol (2011) 9.43

The Vienna RNA websuite. Nucleic Acids Res (2008) 9.02

Secondary structure prediction for aligned RNA sequences. J Mol Biol (2002) 8.80

A high-resolution map of human evolutionary constraint using 29 mammals. Nature (2011) 8.67

Fast and reliable prediction of noncoding RNAs. Proc Natl Acad Sci U S A (2005) 8.15

The primary transcriptome of the major human pathogen Helicobacter pylori. Nature (2010) 7.70

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res (2007) 7.05

Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science (2008) 6.35

An algorithm for progressive multiple alignment of sequences with insertions. Proc Natl Acad Sci U S A (2005) 6.26

Molecular evolution of a microRNA cluster. J Mol Biol (2004) 5.94

Sex differences in the gut microbiome drive hormone-dependent regulation of autoimmunity. Science (2013) 5.75

Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. PLoS Biol (2010) 5.39

Interleukin-6 dependent survival of multiple myeloma cells involves the Stat3-mediated induction of microRNA-21 through a highly conserved enhancer. Blood (2007) 5.14

tRNAdb 2009: compilation of tRNA sequences and tRNA genes. Nucleic Acids Res (2008) 4.88

Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome. Nat Biotechnol (2005) 4.68

Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering. PLoS Comput Biol (2007) 4.28

Fast mapping of short sequences with mismatches, insertions and deletions using index structures. PLoS Comput Biol (2009) 4.24

Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites. Genetics (2004) 4.22

Proteinortho: detection of (co-)orthologs in large-scale analysis. BMC Bioinformatics (2011) 4.15

Insights into hominid evolution from the gorilla genome sequence. Nature (2012) 4.12

miRNAMap: genomic maps of microRNA genes and their target genes in mammalian genomes. Nucleic Acids Res (2006) 3.59

MITOS: improved de novo metazoan mitochondrial genome annotation. Mol Phylogenet Evol (2012) 3.41

The reality of pervasive transcription. PLoS Biol (2011) 3.41

Detecting amino acid sites under positive selection and purifying selection. Genetics (2005) 3.35

RNAalifold: improved consensus structure prediction for RNA alignments. BMC Bioinformatics (2008) 3.29

Thermodynamics of RNA-RNA binding. Bioinformatics (2006) 3.10

Systematic evaluation of spliced alignment programs for RNA-seq data. Nat Methods (2013) 2.92

The expansion of the metazoan microRNA repertoire. BMC Genomics (2006) 2.80

The African coelacanth genome provides insights into tetrapod evolution. Nature (2013) 2.78

Memory efficient folding algorithms for circular RNA secondary structures. Bioinformatics (2006) 2.77

Partition function and base pairing probabilities of RNA heterodimers. Algorithms Mol Biol (2006) 2.74

Structured RNAs in the ENCODE selected regions of the human genome. Genome Res (2007) 2.69

The impact of target site accessibility on the design of effective siRNAs. Nat Biotechnol (2008) 2.51

Consensus folding of aligned sequences as a new measure for the detection of functional RNAs by comparative genomics. J Mol Biol (2004) 2.49

Alignment of RNA base pairing probability matrices. Bioinformatics (2004) 2.42

Evolution. Transitions from nonliving to living matter. Science (2004) 2.26

Different versions of the Dayhoff rate matrix. Mol Biol Evol (2004) 2.23

An empirical codon model for protein sequence evolution. Mol Biol Evol (2007) 2.20

Local RNA base pairing probabilities in large sequences. Bioinformatics (2005) 2.17

Protein evolution with dependence among codons due to tertiary structure. Mol Biol Evol (2003) 2.16

Hairpins in a Haystack: recognizing microRNA precursors in comparative genomics data. Bioinformatics (2006) 2.13

Gut microbiota disturbance during antibiotic therapy: a multi-omic approach. Gut (2012) 2.10

Small ncRNA transcriptome analysis from Aspergillus fumigatus suggests a novel mechanism for regulation of protein synthesis. Nucleic Acids Res (2008) 2.10

Recurrent mutation of the ID3 gene in Burkitt lymphoma identified by integrated genome, exome and transcriptome sequencing. Nat Genet (2012) 2.07

Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm. Bioinformatics (2012) 2.07

webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser. BMC Bioinformatics (2010) 1.99

Genomic DNA k-mer spectra: models and modalities. Genome Biol (2009) 1.92

Variations on RNA folding and alignment: lessons from Benasque. J Math Biol (2007) 1.91

The RNAz web server: prediction of thermodynamically stable and evolutionarily conserved RNA structures. Nucleic Acids Res (2007) 1.88

RNAz 2.0: improved noncoding RNA detection. Pac Symp Biocomput (2010) 1.87

A survey of nematode SmY RNAs. RNA Biol (2009) 1.85

Conserved RNA secondary structures in Flaviviridae genomes. J Gen Virol (2004) 1.81

RNAplex: a fast tool for RNA-RNA interaction search. Bioinformatics (2008) 1.80

Inducible expression of Tau repeat domain in cell models of tauopathy: aggregation is toxic to cells but can be reversed by inhibitor drugs. J Biol Chem (2005) 1.79

Evolution of microRNAs located within Hox gene clusters. J Exp Zool B Mol Dev Evol (2005) 1.77

Centers of complex networks. J Theor Biol (2003) 1.74

RNAcentral: A vision for an international database of RNA sequences. RNA (2011) 1.73

Conservation and divergence in Toll-like receptor 4-regulated gene expression in primary human versus mouse macrophages. Proc Natl Acad Sci U S A (2012) 1.72

Global hairpin folding of tau in solution. Biochemistry (2006) 1.69

Pandit: a database of protein and associated nucleotide domains with inferred trees. Bioinformatics (2003) 1.69

Anthraquinones inhibit tau aggregation and dissolve Alzheimer's paired helical filaments in vitro and in cells. J Biol Chem (2004) 1.62

A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection. Genome Biol (2014) 1.60

AREsite: a database for the comprehensive investigation of AU-rich elements. Nucleic Acids Res (2010) 1.58

PANDIT: an evolution-centric database of protein and associated nucleotide domains with inferred trees. Nucleic Acids Res (2006) 1.58

The effects of alignment error and alignment filtering on the sitewise detection of positive selection. Mol Biol Evol (2011) 1.55

Evidence for human microRNA-offset RNAs in small RNA sequencing data. Bioinformatics (2009) 1.55

Sites of tau important for aggregation populate β-structure and bind to microtubules and polyanions. J Biol Chem (2005) 1.54

Two-dimensional proteome reference map of Prototheca zopfii revealed reduced metabolism and enhanced signal transduction as adaptation to an infectious life style. Proteomics (2013) 1.54

The "fish-specific" Hox cluster duplication is coincident with the origin of teleosts. Mol Biol Evol (2005) 1.54

SnoReport: computational identification of snoRNAs with unknown targets. Bioinformatics (2007) 1.51

LocARNA-P: accurate boundary prediction and improved detection of structural RNAs. RNA (2012) 1.51

Evolutionary patterns of non-coding RNAs. Theory Biosci (2005) 1.49