Computational Identification of Novel Genes: Current and Future Perspectives.

PubWeight™: 0.75

PMID: 27493475

Published in Bioinform Biol Insights on August 01, 2016

Authors

Steffen Klasberg (1), Tristan Bitard-Feildel (1), Ludovic Mallet (1)

Author Affiliations

1: Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany.

Articles cited by this

(truncated to the top 100)

Basic local alignment search tool. J Mol Biol (1990) 659.07

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet (2000) 336.52

EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet (2000) 69.26

Prediction of complete gene structures in human genomic DNA. J Mol Biol (1997) 58.76

Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics (2007) 47.63

The Pfam protein families database. Nucleic Acids Res (2011) 33.46

TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol (2013) 32.42

Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics (2005) 24.54

Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science (2009) 20.75

Elevated recombination rates in transcriptionally active DNA. Cell (1989) 17.42

Spliced segments at the 5' terminus of adenovirus 2 late mRNA. Proc Natl Acad Sci U S A (1977) 14.34

Ensembl 2014. Nucleic Acids Res (2013) 12.62

Non-coding RNA. Hum Mol Genet (2006) 10.57

Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell (2011) 10.56

An amazing sequence arrangement at the 5' ends of adenovirus 2 messenger RNA. Cell (1977) 9.44

Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat Methods (2010) 9.09

RNA sequencing: advances, challenges and opportunities. Nat Rev Genet (2010) 8.96

Horizontal gene transfer in prokaryotes: quantification and classification. Annu Rev Microbiol (2001) 8.14

Using GeneWise in the Drosophila annotation experiment. Genome Res (2000) 7.50

GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses. Nucleic Acids Res (2005) 6.98

Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res (1996) 6.13

Comparative functional genomics of the fission yeasts. Science (2011) 6.00

CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res (2007) 5.88

An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region. Genetics (1999) 5.87

Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics (2008) 5.46

MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics (2011) 5.35

What is a gene, post-ENCODE? History and updated definition. Genome Res (2007) 4.96

Hydrophobic cluster analysis: an efficient new way to compare and analyse amino acid sequences. FEBS Lett (1987) 4.71

PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics (2011) 4.50

FragGeneScan: predicting genes in short and error-prone reads. Nucleic Acids Res (2010) 4.16

Proteinortho: detection of (co-)orthologs in large-scale analysis. BMC Bioinformatics (2011) 4.15

Single-cell sequencing-based technologies will revolutionize whole-organism science. Nat Rev Genet (2013) 4.03

Computational prediction of eukaryotic protein-coding genes. Nat Rev Genet (2002) 3.97

Vertebrate pseudogenes. FEBS Lett (2000) 3.93

Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res (2006) 3.62

The yeast genome project: what did we learn? Trends Genet (1996) 3.53

Discovery and revision of Arabidopsis genes by proteogenomics. Proc Natl Acad Sci U S A (2008) 3.45

Proto-genes and de novo gene birth. Nature (2012) 3.43

Lateral transfer of genes from fungi underlies carotenoid production in aphids. Science (2010) 3.28

Current methods of gene prediction, their strengths and weaknesses. Nucleic Acids Res (2002) 3.12

The evolutionary origin of orphan genes. Nat Rev Genet (2011) 2.98

Origins, evolution, and phenotypic impact of new genes. Genome Res (2010) 2.98

Systematic evaluation of spliced alignment programs for RNA-seq data. Nat Methods (2013) 2.92

CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model. Nucleic Acids Res (2013) 2.88

PseudoPipe: an automated pseudogene identification pipeline. Bioinformatics (2006) 2.85

Deciphering protein sequence information through hydrophobic cluster analysis (HCA): current status and perspectives. Cell Mol Life Sci (1997) 2.83

OrfPredictor: predicting protein-coding regions in EST-derived sequences. Nucleic Acids Res (2005) 2.82

Birth of a chimeric primate gene by capture of the transposase gene from a mobile element. Proc Natl Acad Sci U S A (2006) 2.81

Inverse relationship between evolutionary rate and age of mammalian genes. Mol Biol Evol (2004) 2.79

A beginner's guide to eukaryotic genome annotation. Nat Rev Genet (2012) 2.67

An evolutionary analysis of orphan genes in Drosophila. Genome Res (2003) 2.52

Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation. EMBO J (2014) 2.47

More than just orphans: are taxonomically-restricted genes important in evolution? Trends Genet (2009) 2.46

New genes in Drosophila quickly become essential. Science (2010) 2.39

Using OrthoMCL to assign proteins to OrthoMCL-DB groups or to cluster proteomes into new ortholog groups. Curr Protoc Bioinformatics (2011) 2.15

Evolutionary deterioration of the vomeronasal pheromone transduction pathway in catarrhine primates. Proc Natl Acad Sci U S A (2003) 2.14

Analysis of donor splice sites in different eukaryotic organisms. J Mol Evol (1997) 2.09

Origins of genes: "big bang" or continuous creation? Proc Natl Acad Sci U S A (1992) 2.07

A phylostratigraphy approach to uncover the genomic history of major adaptations in metazoan lineages. Trends Genet (2007) 2.06

Ribosome profiling: new views of translation, from single codons to genome scale. Nat Rev Genet (2014) 2.04

CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction. Genome Biol (2007) 2.01

Contribution of horizontal gene transfer to the evolution of Saccharomyces cerevisiae. Eukaryot Cell (2005) 1.95

The life cycle of Drosophila orphan genes. Elife (2014) 1.94

Dosage compensation via transposable element mediated rewiring of a regulatory network. Science (2013) 1.91

Arrangements in the modular evolution of proteins. Trends Biochem Sci (2008) 1.78

Mammalian overlapping genes: the comparative perspective. Genome Res (2004) 1.74

Proteogenomics to discover the full coding content of genomes: a computational perspective. J Proteomics (2010) 1.71

Improving genome assemblies by sequencing PCR products with PacBio. Biotechniques (2012) 1.70

HiRIEF LC-MS enables deep proteome coverage and unbiased proteogenomics. Nat Methods (2013) 1.55

Proteogenomics: concepts, applications and computational strategies. Nat Methods (2014) 1.54

New genes as drivers of phenotypic evolution. Nat Rev Genet (2013) 1.49

Characterization of the human ESC transcriptome by hybrid sequencing. Proc Natl Acad Sci U S A (2013) 1.48

De novo origin of human protein-coding genes. PLoS Genet (2011) 1.48

Approaches to Fungal Genome Annotation. Mycology (2011) 1.47

Gene structure conservation aids similarity based gene prediction. Nucleic Acids Res (2004) 1.43

KISSPLICE: de-novo calling alternative splicing events from RNA-seq data. BMC Bioinformatics (2012) 1.42

Quantification of the elevated rate of domain rearrangements in metazoa. J Mol Biol (2007) 1.39

Orphans as taxonomically restricted and ecologically important genes. Microbiology (2005) 1.33

Quantifying the mechanisms of domain gain in animal proteins. Genome Biol (2010) 1.32

On homology searches by protein Blast and the characterization of the age of genes. BMC Evol Biol (2007) 1.28

Phylogenetic patterns of emergence of new genes support a model of frequent de novo evolution. BMC Genomics (2013) 1.23

Origin and spread of de novo genes in Drosophila melanogaster populations. Science (2014) 1.22

Mass spectrometry at the interface of proteomics and genomics. Mol Biosyst (2010) 1.22

Novel genes exhibit distinct patterns of function acquisition and network integration. Genome Biol (2010) 1.19

Neofunctionalization of young duplicate genes in Drosophila. Proc Natl Acad Sci U S A (2013) 1.14

Identifying and quantifying orphan protein sequences in fungi. J Mol Biol (2009) 1.11

The prediction of exons through an analysis of spliceable open reading frames. Nucleic Acids Res (1992) 1.09

De novo ORFs in Drosophila are important to organismal fitness and evolved rapidly from previously non-coding sequences. PLoS Genet (2013) 1.06

Ribosome profiling: a Hi-Def monitor for protein synthesis at the genome-wide scale. Wiley Interdiscip Rev RNA (2013) 1.02

New genes expressed in human brains: implications for annotating evolving genomes. Bioessays (2012) 1.01

Dynamics and adaptive benefits of protein domain emergence and arrangements during plant genome evolution. Genome Biol Evol (2012) 0.99

ProteinHistorian: tools for the comparative analysis of eukaryote protein origin. PLoS Comput Biol (2012) 0.97

The dynamics and evolutionary potential of domain loss and emergence. Mol Biol Evol (2011) 0.95

Dynamics and adaptive benefits of modular protein evolution. Curr Opin Struct Biol (2013) 0.94

Non-model organisms, a species endangered by proteogenomics. J Proteomics (2014) 0.94

A comparative analysis of relative occurrence of transcription factor binding sites in vertebrate genomes and gene promoter areas. Bioinformatics (2005) 0.93

Polycistronic peptide coding genes in eukaryotes--how widespread are they? Brief Funct Genomic Proteomic (2008) 0.92

Evolutionary simulations to detect functional lineage-specific genes. Bioinformatics (2006) 0.90

FineSplice, enhanced splice junction detection and quantification: a novel pipeline based on the assessment of diverse RNA-Seq alignment solutions. Nucleic Acids Res (2014) 0.89

Using N-SCAN or TWINSCAN to predict gene structures in genomic DNA sequences. Curr Protoc Bioinformatics (2007) 0.89