Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources.

PubWeight™: 5.23 | Rank: Top 1%

Article: PMC 1409804

Published in BMC Bioinformatics on February 09, 2006

Authors

Mario Stanke (1), Oliver Schöffmann, Burkhard Morgenstern, Stephan Waack

Author Affiliations

1: Institut für Mikrobiologie und Genetik, Universität Göttingen, Göttingen, Germany. mstanke@gwdg.de

Articles citing this

(truncated to the top 100)

Resequencing of 31 wild and cultivated soybean genomes identifies patterns of genetic diversity and selection. Nat Genet (2010) 5.20

Resequencing 50 accessions of cultivated and wild rice yields markers for identifying agronomically important genes. Nat Biotechnol (2011) 4.84

AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res (2006) 4.11

Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol (2008) 3.73

Deep RNA sequencing at single base-pair resolution reveals high complexity of the rice transcriptome. Genome Res (2010) 3.34

Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models. BMC Bioinformatics (2006) 3.15

De novo genome sequence assembly of a filamentous fungus using Sanger, 454 and Illumina sequence data. Genome Biol (2009) 2.99

De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis. PLoS Genet (2010) 2.77

A beginner's guide to eukaryotic genome annotation. Nat Rev Genet (2012) 2.67

Genome sequence of the recombinant protein production host Pichia pastoris. Nat Biotechnol (2009) 2.67

AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biol (2006) 2.39

Biogeography and individuality shape function in the human skin metagenome. Nature (2014) 2.27

DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. Algorithms Mol Biol (2008) 2.07

Function of alternative splicing. Gene (2012) 2.00

Proteogenomics to discover the full coding content of genomes: a computational perspective. J Proteomics (2010) 1.71

Web Apollo: a web-based genomic annotation editing platform. Genome Biol (2013) 1.69

The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system. Proc Natl Acad Sci U S A (2013) 1.68

mGene: accurate SVM-based gene finding with an application to nematode genomes. Genome Res (2009) 1.67

Analysis of the genome and transcriptome of Cryptococcus neoformans var. grubii reveals complex RNA expression and microevolution leading to virulence attenuation. PLoS Genet (2014) 1.62

Whole-genome sequencing reveals untapped genetic potential in Africa's indigenous cereal crop sorghum. Nat Commun (2013) 1.60

AphidBase: a centralized bioinformatic resource for annotation of the pea aphid genome. Insect Mol Biol (2010) 1.57

The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode. Genome Biol (2014) 1.51

Approaches to Fungal Genome Annotation. Mycology (2011) 1.47

Finding the missing honey bee genes: lessons learned from a genome upgrade. BMC Genomics (2014) 1.46

Comparative genomics of the apicomplexan parasites Toxoplasma gondii and Neospora caninum: Coccidia differing in host range and transmission strategy. PLoS Pathog (2012) 1.44

Selection of Orthologous Genes for Construction of a Highly Resolved Phylogenetic Tree and Clarification of the Phylogeny of Trichosporonales Species. PLoS One (2015) 1.40

Splice site identification using probabilistic parameters and SVM classification. BMC Bioinformatics (2006) 1.32

Using ESTs to improve the accuracy of de novo gene prediction. BMC Bioinformatics (2006) 1.24

Genome sequencing of the lizard parasite Leishmania tarentolae reveals loss of genes associated to the intracellular stage of human pathogenic species. Nucleic Acids Res (2011) 1.23

The genomes of two key bumblebee species with primitive eusocial organization. Genome Biol (2015) 1.21

QC-Chain: fast and holistic quality control method for next-generation sequencing data. PLoS One (2013) 1.20

The genomic and phenotypic diversity of Schizosaccharomyces pombe. Nat Genet (2015) 1.19

The genome of the anaerobic fungus Orpinomyces sp. strain C1A reveals the unique evolutionary history of a remarkable plant biomass degrader. Appl Environ Microbiol (2013) 1.16

Comparative genomics suggests that the fungal pathogen pneumocystis is an obligate parasite scavenging amino acids from its host's lungs. PLoS One (2010) 1.14

A lover and a fighter: the genome sequence of an entomopathogenic nematode Heterorhabditis bacteriophora. PLoS One (2013) 1.11

Ln is a key regulator of leaflet shape and number of seeds per pod in soybean. Plant Cell (2012) 1.10

Comparative genome structure, secondary metabolite, and effector coding capacity across Cochliobolus pathogens. PLoS Genet (2013) 1.09

CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts. BMC Genomics (2015) 1.08

Whole-genome sequencing of the efficient industrial fuel-ethanol fermentative Saccharomyces cerevisiae strain CAT-1. Mol Genet Genomics (2012) 1.08

A new family of receptor tyrosine kinases with a venus flytrap binding domain in insects and other invertebrates activated by aminoacids. PLoS One (2009) 1.07

The completed genome sequence of the pathogenic ascomycete fungus Fusarium graminearum. BMC Genomics (2015) 1.07

mGene.web: a web service for accurate computational gene finding. Nucleic Acids Res (2009) 1.04

The genome sequence and effector complement of the flax rust pathogen Melampsora lini. Front Plant Sci (2014) 1.03

Whipworm genome and dual-species transcriptome analyses provide molecular insights into an intimate host-parasite interaction. Nat Genet (2014) 1.02

Saccharina genomes provide novel insight into kelp biology. Nat Commun (2015) 1.02

Leucoagaricus gongylophorus produces diverse enzymes for the degradation of recalcitrant plant polymers in leaf-cutter ant fungus gardens. Appl Environ Microbiol (2013) 1.01

Comprehensive analysis of RNA-seq data reveals the complexity of the transcriptome in Brassica rapa. BMC Genomics (2013) 1.00

Genome reannotation of the lizard Anolis carolinensis based on 14 adult and embryonic deep transcriptomes. BMC Genomics (2013) 0.97

Genetic basis of the highly efficient yeast Kluyveromyces marxianus: complete genome sequence and transcriptome analyses. Biotechnol Biofuels (2015) 0.96

Gene expression defines natural changes in mammalian lifespan. Aging Cell (2015) 0.96

Comparative genome sequencing reveals genomic signature of extreme desiccation tolerance in the anhydrobiotic midge. Nat Commun (2014) 0.96

Genome sequence of Anopheles sinensis provides insight into genetics basis of mosquito competence for malaria parasites. BMC Genomics (2014) 0.95

An automated proteogenomic method uses mass spectrometry to reveal novel genes in Zea mays. Mol Cell Proteomics (2013) 0.95

Structure of two melon regions reveals high microsynteny with sequenced plant species. Mol Genet Genomics (2007) 0.95

Genus-Wide Comparative Genomics of Malassezia Delineates Its Phylogeny, Physiology, and Niche Adaptation on Human Skin. PLoS Genet (2015) 0.93

Genome sequencing of the plant pathogen Taphrina deformans, the causal agent of peach leaf curl. MBio (2013) 0.93

A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure. Genome Biol (2011) 0.93

The genome and development-dependent transcriptomes of Pyronema confluens: a window into fungal evolution. PLoS Genet (2013) 0.93

Annotation and comparative analysis of the glycoside hydrolase genes in Brachypodium distachyon. BMC Genomics (2010) 0.93

Gene loss rather than gene gain is associated with a host jump from monocots to dicots in the Smut Fungus Melanopsichium pennsylvanicum. Genome Biol Evol (2014) 0.93

MetWAMer: eukaryotic translation initiation site prediction. BMC Bioinformatics (2008) 0.92

The venus kinase receptor (VKR) family: structure and evolution. BMC Genomics (2013) 0.91

Exploring the rice dispensable genome using a metagenome-like assembly strategy. Genome Biol (2015) 0.91

WormBase: Annotating many nematode genomes. Worm (2012) 0.91

Genome Sequence and Transcriptome Analyses of Chrysochromulina tobin: Metabolic Tools for Enhanced Algal Fitness in the Prominent Order Prymnesiales (Haptophyceae). PLoS Genet (2015) 0.91

Improvement of whole-genome annotation of cereals through comparative analyses. Genome Res (2007) 0.90

Insights on the evolution of mycoparasitism from the genome of Clonostachys rosea. Genome Biol Evol (2015) 0.90

Companion: a web server for annotation and analysis of parasite genomes. Nucleic Acids Res (2016) 0.90

The genome and transcriptome of the enteric parasite Entamoeba invadens, a model for encystation. Genome Biol (2013) 0.90

Genomic organization, transcriptomic analysis, and functional characterization of avian α- and β-keratins in diverse feather forms. Genome Biol Evol (2014) 0.89

Two patched protein subtypes and a conserved domain of group I proteins that regulates turnover. J Biol Chem (2008) 0.89

Evidence for suppression of immunity as a driver for genomic introgressions and host range expansion in races of Albugo candida, a generalist parasite. Elife (2015) 0.89

Translation in giant viruses: a unique mixture of bacterial and eukaryotic termination schemes. PLoS Genet (2012) 0.89

Draft genome sequencing and comparative analysis of Aspergillus sojae NBRC4239. DNA Res (2011) 0.88

Reranking candidate gene models with cross-species comparison for improved gene prediction. BMC Bioinformatics (2008) 0.88

In silico identification of opossum cytokine genes suggests the complexity of the marsupial immune system rivals that of eutherian mammals. Immunome Res (2006) 0.87

The Schistosoma mansoni transcriptome: an update. Exp Parasitol (2007) 0.87

HMMCONVERTER 1.0: a toolbox for hidden Markov models. Nucleic Acids Res (2009) 0.87

Complete genomes of Hairstreak butterflies, their speciation, and nucleo-mitochondrial incongruence. Sci Rep (2016) 0.87

Finding genes in Schistosoma japonicum: annotating novel genomes with help of extrinsic evidence. Nucleic Acids Res (2009) 0.87

The diversification of the LIM superclass at the base of the metazoa increased subcellular complexity and promoted multicellular specialization. PLoS One (2012) 0.86

Speciation in Cloudless Sulphurs Gleaned from Complete Genomes. Genome Biol Evol (2016) 0.86

Comprehensive evaluation of Toxoplasma gondii VEG and Neospora caninum LIV genomes with tachyzoite stage transcriptome and proteome defines novel transcript features. PLoS One (2015) 0.86

Re-annotation of the woodland strawberry (Fragaria vesca) genome. BMC Genomics (2015) 0.86

IMA Genome-F 4: Draft genome sequences of Chrysoporthe austroafricana, Diplodia scrobiculata, Fusarium nygamai, Leptographium lundbergii, Limonomyces culmigenus, Stagonosporopsis tanaceti, and Thielaviopsis punctulata. IMA Fungus (2015) 0.85

LAceP: lysine acetylation site prediction using logistic regression classifiers. PLoS One (2014) 0.84

The genome of the basal agaricomycete Xanthophyllomyces dendrorhous provides insights into the organization of its acetyl-CoA derived pathways and the evolution of Agaricomycotina. BMC Genomics (2015) 0.84

OMIGA: Optimized Maker-Based Insect Genome Annotation. Mol Genet Genomics (2014) 0.83

First Draft Genome Sequence of a UK Strain (UK99) of Fusarium culmorum. Genome Announc (2016) 0.83

Skipper genome sheds light on unique phenotypic traits and phylogeny. BMC Genomics (2015) 0.83

Draft Genome Sequence of the Yeast Pseudozyma antarctica Type Strain JCM10317, a Producer of the Glycolipid Biosurfactants, Mannosylerythritol Lipids. Genome Announc (2014) 0.83

Genomic Sequence of the Yeast Kluyveromyces marxianus CCT 7735 (UFV-3), a Highly Lactose-Fermenting Yeast Isolated from the Brazilian Dairy Industry. Genome Announc (2014) 0.82

A whole transcriptomal linkage analysis of gene co-regulation in insecticide resistant house flies, Musca domestica. BMC Genomics (2013) 0.81

Genus-Wide Comparative Genome Analyses of Colletotrichum Species Reveal Specific Gene Family Losses and Gains during Adaptation to Specific Infection Lifestyles. Genome Biol Evol (2016) 0.81

StochHMM: a flexible hidden Markov model tool and C++ library. Bioinformatics (2014) 0.81

A set of genes conserved in sequence and expression traces back the establishment of multicellularity in social amoebae. BMC Genomics (2016) 0.81

Complex modular architecture around a simple toolkit of wing pattern genes. Nat Ecol Evol (2017) 0.80

Genomics of Volvocine Algae. Adv Bot Res (2015) 0.80

IPred - integrating ab initio and evidence based gene predictions to improve prediction accuracy. BMC Genomics (2015) 0.80

Draft Genome Sequence of Sorghum Grain Mold Fungus Epicoccum sorghinum, a Producer of Tenuazonic Acid. Genome Announc (2017) 0.80

Articles cited by this

Identification of protein coding regions by database similarity search. Nat Genet (1993) 21.58

GeneWise and Genomewise. Genome Res (2004) 17.87

Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics (2003) 8.92

Integrating genomic homology into gene structure prediction. Bioinformatics (2001) 7.92

Computational inference of homologous gene structures in the human genome. Genome Res (2001) 6.96

GeneID in Drosophila. Genome Res (2000) 6.61

Multiple DNA and protein sequence alignment based on segment-to-segment comparison. Proc Natl Acad Sci U S A (1996) 6.02

AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res (2004) 5.53

An assessment of gene prediction accuracy in large DNA sequences. Genome Res (2000) 4.71

Two methods for improving performance of an HMM and their application for gene finding. Proc Int Conf Intell Syst Mol Biol (1997) 4.55

SLAM: cross-species gene finding and alignment with a generalized pair hidden Markov model. Genome Res (2003) 4.17

Comparative gene prediction in human and mouse. Genome Res (2003) 4.13

Reevaluating human gene annotation: a second-generation analysis of chromosome 22. Genome Res (2003) 3.03

Comparative ab initio prediction of gene structures using pair HMMs. Bioinformatics (2002) 2.99

Fast and sensitive multiple alignment of large genomic sequences. BMC Bioinformatics (2003) 2.94

Computational gene prediction using multiple sources of evidence. Genome Res (2004) 2.58

Using database matches with for HMMGene for automated gene detection in Drosophila. Genome Res (2000) 2.52

Gene finding with a hidden Markov model of genome structure and evolution. Bioinformatics (2003) 2.21

Recent advances in gene structure prediction. Curr Opin Struct Biol (2004) 1.90

ExonHunter: a comprehensive approach to gene finding. Bioinformatics (2005) 1.55

AGenDA: gene prediction by cross-species sequence comparison. Nucleic Acids Res (2004) 1.00

Articles by these authors

Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics (2003) 8.92

The genome of the model beetle and pest Tribolium castaneum. Nature (2008) 6.50

Phylogenomics revives traditional views on deep animal relationships. Curr Biol (2009) 5.89

AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res (2004) 5.53

Comparative analysis of the complete genome sequence of the plant growth-promoting bacterium Bacillus amyloliquefaciens FZB42. Nat Biotechnol (2007) 4.66

AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res (2005) 4.13

AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res (2006) 4.11

TICO: a tool for improving predictions of prokaryotic translation initiation sites. Bioinformatics (2005) 3.03

Fast and sensitive multiple alignment of large genomic sequences. BMC Bioinformatics (2003) 2.94

Fast and sensitive alignment of large genomic sequences. Proc IEEE Comput Soc Bioinform Conf (2002) 2.84

AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biol (2006) 2.39

A jumping profile Hidden Markov Model and applications to recombination sites in HIV and HCV genomes. BMC Bioinformatics (2006) 2.25

DIALIGN-TX: greedy and progressive approaches for segment-based multiple sequence alignment. Algorithms Mol Biol (2008) 2.07

Rhizobium sp. strain NGR234 possesses a remarkable number of secretion systems. Appl Environ Microbiol (2009) 2.07

The CHAOS/DIALIGN WWW server for multiple alignment of genomic sequences. Nucleic Acids Res (2004) 1.88

DIALIGN-T: an improved algorithm for segment-based multiple sequence alignment. BMC Bioinformatics (2005) 1.87

The role of recombination in the emergence of a complex and dynamic HIV epidemic. Retrovirology (2010) 1.69

Gene prediction in metagenomic fragments: a large scale machine learning approach. BMC Bioinformatics (2008) 1.64

Scipio: using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species. BMC Bioinformatics (2008) 1.63

jpHMM: improving the reliability of recombination prediction in HIV-1. Nucleic Acids Res (2009) 1.61

A novel hybrid gene prediction method employing protein multiple sequence alignments. Bioinformatics (2011) 1.60

jpHMM at GOBICS: a web server to detect genomic recombinations in HIV-1. Nucleic Acids Res (2006) 1.58

Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites. BMC Bioinformatics (2004) 1.42

OrthoSelect: a protocol for selecting orthologous groups in phylogenomics. BMC Bioinformatics (2009) 1.29

AGenDA: gene prediction by comparative sequence analysis. In Silico Biol (2002) 1.26

Identification of novel plant peroxisomal targeting signals by a combination of machine learning methods and in vivo subcellular targeting analyses. Plant Cell (2011) 1.09

AGenDA: homology-based gene prediction. Bioinformatics (2003) 1.08

MolabIS--an integrated information system for storing and managing molecular genetics data. BMC Bioinformatics (2011) 1.05

WebScipio: an online tool for the determination of gene structures using protein sequences. BMC Genomics (2008) 1.01

AGenDA: gene prediction by cross-species sequence comparison. Nucleic Acids Res (2004) 1.00

jpHMM: recombination analysis in viruses with circular genomes such as the hepatitis B virus. Nucleic Acids Res (2012) 0.99

Divide-and-conquer multiple alignment with segment-based constraints. Bioinformatics (2003) 0.96

GenePainter: a fast tool for aligning gene structures of eukaryotic protein families, visualizing the alignments and mapping gene structures onto protein structures. BMC Bioinformatics (2013) 0.94

Metabolite-based clustering and visualization of mass spectrometry data using one-dimensional self-organizing maps. Algorithms Mol Biol (2008) 0.94

DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors. BMC Bioinformatics (2004) 0.93

Cross-species protein sequence and gene structure prediction with fine-tuned Webscipio 2.0 and Scipio. BMC Res Notes (2011) 0.90

DIALIGN-TX and multiple protein alignment using secondary structure information at GOBICS. Nucleic Acids Res (2010) 0.89

A min-cut algorithm for the consistency problem in multiple sequence alignment. Bioinformatics (2010) 0.88

DIALIGN at GOBICS--multiple sequence alignment using various sources of external information. Nucleic Acids Res (2013) 0.87

Stability of multiple alignments and phylogenetic trees: an analysis of ABC-transporter proteins family. Algorithms Mol Biol (2008) 0.81

HIV classification using the coalescent theory. Bioinformatics (2010) 0.81

Improved coverage of cDNA-AFLP by sequential digestion of immobilized cDNA. BMC Genomics (2008) 0.80

Multiple alignment of genomic sequences using CHAOS, DIALIGN and ABC. Nucleic Acids Res (2005) 0.79

OrthoSelect: a web server for selecting orthologous gene alignments from EST sequences. Nucleic Acids Res (2009) 0.79

Coupled mutation finder: a new entropy-based method quantifying phylogenetic noise for the detection of compensatory mutations. BMC Bioinformatics (2012) 0.78

Detection of viral sequence fragments of HIV-1 subfamilies yet unknown. BMC Bioinformatics (2011) 0.78

Combining features in a graphical model to predict protein binding sites. Proteins (2015) 0.76

5'TRU: identification and analysis of translationally regulative 5'untranslated regions in amino acid starved yeast cells. Mol Cell Proteomics (2011) 0.76

Quantum coupled mutation finder: predicting functionally or structurally important sites in proteins using quantum Jensen-Shannon divergence and CUDA programming. BMC Bioinformatics (2014) 0.75

P-value based visualization of codon usage data. Algorithms Mol Biol (2006) 0.75

New journal: Algorithms for Molecular Biology. Algorithms Mol Biol (2006) 0.75