SLAM: cross-species gene finding and alignment with a generalized pair hidden Markov model.

PubWeight™: 4.17 | Rank: Top 1%

Article: PMC 430255

Published in Genome Res on March 01, 2003

Authors

Marina Alexandersson (1), Simon Cawley, Lior Pachter

Author Affiliations

1: Department of Statistics, University of California, Berkeley, California 94720, USA.

Articles citing this

GeneWise and Genomewise. Genome Res (2004) 17.87

Evolution's cauldron: duplication, deletion, and rearrangement in the mouse and human genomes. Proc Natl Acad Sci U S A (2003) 16.58

Identification and characterization of multi-species conserved sequences. Genome Res (2003) 10.18

EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biol (2006) 7.06

Discovery of tissue-specific exons using comprehensive human exon microarrays. Genome Biol (2007) 5.53

Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics (2006) 5.23

Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res (2008) 5.12

AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res (2006) 4.11

Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol (2008) 3.73

Genome-wide identification of DNaseI hypersensitive sites using active chromatin sequence libraries. Proc Natl Acad Sci U S A (2004) 2.94

Graemlin: general and robust alignment of multiple large interaction networks. Genome Res (2006) 2.92

Performance and scalability of discriminative metrics for comparative gene identification in 12 Drosophila genomes. PLoS Comput Biol (2008) 2.70

Comparison of human chromosome 21 conserved nongenic sequences (CNGs) with the mouse and dog genomes shows that their selective constraint is independent of their genic environment. Genome Res (2004) 2.58

CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction. Genome Biol (2007) 2.01

Comparative genomics. PLoS Biol (2003) 1.97

Recognition of unknown conserved alternatively spliced exons. PLoS Comput Biol (2005) 1.77

Approaches to Fungal Genome Annotation. Mycology (2011) 1.47

Gene structure conservation aids similarity based gene prediction. Nucleic Acids Res (2004) 1.43

XRate: a fast prototyping, training and annotation tool for phylo-grammars. BMC Bioinformatics (2006) 1.29

Reference based annotation with GeneMapper. Genome Biol (2006) 1.24

Hidden Markov Models and their Applications in Biological Sequence Analysis. Curr Genomics (2009) 1.22

Gene finding in the chicken genome. BMC Bioinformatics (2005) 1.20

Iterative gene prediction and pseudogene removal improves genome annotation. Genome Res (2006) 1.20

SLAM web server for comparative gene finding and alignment. Nucleic Acids Res (2003) 1.15

Parametric inference for biological sequence analysis. Proc Natl Acad Sci U S A (2004) 1.12

Integrating alternative splicing detection into gene prediction. BMC Bioinformatics (2005) 1.09

Efficient decoding algorithms for generalized hidden Markov model gene finders. BMC Bioinformatics (2005) 1.05

Visualization of multiple genome annotations and alignments with the K-BROWSER. Genome Res (2004) 1.05

Accurate identification of novel human genes through simultaneous gene prediction in human, mouse, and rat. Genome Res (2004) 1.03

GeneAlign: a coding exon prediction tool based on phylogenetical comparisons. Nucleic Acids Res (2006) 1.01

Vertebrate gene finding from multiple-species alignments using a two-level strategy. Genome Biol (2006) 0.99

Prediction of small, noncoding RNAs in bacteria using heterogeneous data. J Math Biol (2007) 0.93

Experimental-confirmation and functional-annotation of predicted proteins in the chicken genome. BMC Genomics (2007) 0.92

An empirical analysis of training protocols for probabilistic gene finders. BMC Bioinformatics (2004) 0.91

Automatic generation of gene finders for eukaryotic species. BMC Bioinformatics (2006) 0.90

Genome majority vote improves gene predictions. PLoS Comput Biol (2011) 0.90

The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes. Plant Methods (2009) 0.89

Reranking candidate gene models with cross-species comparison for improved gene prediction. BMC Bioinformatics (2008) 0.88

In silico identification of opossum cytokine genes suggests the complexity of the marsupial immune system rivals that of eutherian mammals. Immunome Res (2006) 0.87

Using several pair-wise informant sequences for de novo prediction of alternatively spliced transcripts. Genome Biol (2006) 0.85

A unified model for yeast transcript definition. Genome Res (2013) 0.84

Bioinformatics resources for pollen. Plant Reprod (2016) 0.83

A phylogenetic generalized hidden Markov model for predicting alternatively spliced exons. Algorithms Mol Biol (2006) 0.78

Recent applications of Hidden Markov Models in computational biology. Genomics Proteomics Bioinformatics (2004) 0.76

Coding exon-structure aware realigner (CESAR) utilizes genome alignments for accurate comparative gene annotation. Nucleic Acids Res (2016) 0.75

Articles cited by this

Basic local alignment search tool. J Mol Biol (1990) 659.07

A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol (1970) 155.96

Prediction of complete gene structures in human genomic DNA. J Mol Biol (1997) 58.76

Alignment of whole genomes. Nucleic Acids Res (1999) 20.02

Conservation, regulation, synteny, and introns in a large-scale C. briggsae-C. elegans genomic alignment. Genome Res (2000) 10.40

AVID: A global alignment program. Genome Res (2003) 10.06

An apolipoprotein influencing triglycerides in humans and mice revealed by comparative sequencing. Science (2001) 8.73

Evaluation of gene structure prediction programs. Genomics (1996) 8.57

Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome. Genome Res (1997) 8.49

Human and mouse gene structure: comparative analysis and application to exon prediction. Genome Res (2000) 8.28

Integrating genomic homology into gene structure prediction. Bioinformatics (2001) 7.92

Using GeneWise in the Drosophila annotation experiment. Genome Res (2000) 7.50

Genie--gene finding in Drosophila melanogaster. Genome Res (2000) 7.47

Computational inference of homologous gene structures in the human genome. Genome Res (2001) 6.96

Gene recognition via spliced sequence alignment. Proc Natl Acad Sci U S A (1996) 5.77

An assessment of gene prediction accuracy in large DNA sequences. Genome Res (2000) 4.71

Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences. Genome Res (2001) 4.22

Comparative analysis of 1196 orthologous mouse and human full-length mRNA and protein sequences. Genome Res (1996) 4.11

The conserved exon method for gene finding. Proc Int Conf Intell Syst Mol Biol (2000) 2.22

Applications of generalized pair hidden Markov models to alignment and gene finding problems. J Comput Biol (2002) 1.94

Articles by these authors

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

TopHat: discovering splice junctions with RNA-Seq. Bioinformatics (2009) 81.13

Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol (2010) 75.21

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (2007) 75.09

Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc (2012) 35.75

Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature (2004) 24.40

Integrated detection and population-genetic analysis of SNPs and copy number variation. Nat Genet (2008) 19.55

Evolution of genes and genomes on the Drosophila phylogeny. Nature (2007) 18.01

Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet (2008) 15.89

Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol (2012) 14.01

VISTA: computational tools for comparative genomics. Nucleic Acids Res (2004) 13.52

Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature (2007) 11.66

Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol (2011) 10.63

A genome-wide map of conserved microRNA targets in C. elegans. Curr Biol (2006) 10.14

AVID: A global alignment program. Genome Res (2003) 10.06

Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science (2003) 9.93

Genotyping over 100,000 SNPs on a pair of oligonucleotide arrays. Nat Methods (2004) 9.52

Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22. Genome Res (2004) 8.35

Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics (2011) 8.05

rVista for comparative sequence-based discovery of functional transcription factor binding sites. Genome Res (2002) 7.33

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res (2007) 7.05

Disordered microbial communities in asthmatic airways. PLoS One (2010) 6.35

Population genomics: whole-genome analysis of polymorphism and divergence in Drosophila simulans. PLoS Biol (2007) 6.18

Fast statistical alignment. PLoS Comput Biol (2009) 5.92

Viral population estimation using pyrosequencing. PLoS Comput Biol (2008) 5.89

MAVID: constrained ancestral alignment of multiple sequences. Genome Res (2004) 5.83

Comprehensive, Integrative Genomic Analysis of Diffuse Lower-Grade Gliomas. N Engl J Med (2015) 5.71

Dynamic model based algorithms for screening and genotyping over 100 K SNPs on oligonucleotide microarrays. Bioinformatics (2005) 5.01

Strategies and tools for whole-genome alignments. Genome Res (2003) 4.86

Streaming fragment assignment for real-time analysis of sequencing experiments. Nat Methods (2012) 4.43

MAVID multiple alignment server. Nucleic Acids Res (2003) 3.53

Bioinformatics for whole-genome shotgun sequencing of microbial communities. PLoS Comput Biol (2005) 3.39

Multiplexed RNA structure characterization with selective 2'-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq). Proc Natl Acad Sci U S A (2011) 3.30

Description of the data from the Collaborative Study on the Genetics of Alcoholism (COGA) and single-nucleotide polymorphism genotyping for Genetic Analysis Workshop 14. BMC Genet (2005) 3.04

Identification and correction of systematic error in high-throughput sequence data. BMC Bioinformatics (2011) 2.98

Binding site turnover produces pervasive quantitative changes in transcription factor binding between closely related Drosophila species. PLoS Biol (2010) 2.32

Next generation genome-wide association tool: design and coverage of a high-throughput European-optimized SNP array. Genomics (2011) 2.31

Applications of generalized pair hidden Markov models to alignment and gene finding problems. J Comput Biol (2002) 1.94

ANOSVA: a statistical method for detecting splice variation from expression data. Bioinformatics (2005) 1.88

Exon-level microarray analyses identify alternative splicing programs in breast cancer. Mol Cancer Res (2010) 1.84

Mapping and identification of essential gene functions on the X chromosome of Drosophila. EMBO Rep (2001) 1.83

Subtree power analysis and species selection for comparative genomics. Proc Natl Acad Sci U S A (2005) 1.69

Multiple alignment by sequence annealing. Bioinformatics (2007) 1.63

CGAL: computing genome assembly likelihoods. Genome Biol (2013) 1.53

Modeling and automation of sequencing-based characterization of RNA structure. Proc Natl Acad Sci U S A (2011) 1.49

HMM sampling and applications to gene finding and alternative splicing. Bioinformatics (2003) 1.49

Multiple-sequence functional annotation and the generalized hidden Markov phylogeny. Bioinformatics (2004) 1.46

Evolution at the nucleotide level: the problem of multiple whole-genome alignment. Hum Mol Genet (2006) 1.43

Shape-based peak identification for ChIP-Seq. BMC Bioinformatics (2011) 1.38

Analysis of epistatic interactions and fitness landscapes using a new geometric approach. BMC Evol Biol (2007) 1.30

Development of a low bias method for characterizing viral populations using next generation sequencing technology. PLoS One (2010) 1.25

Reference based annotation with GeneMapper. Genome Biol (2006) 1.24

Identification of evolutionary hotspots in the rodent genomes. Genome Res (2004) 1.21

Identification of transposable elements using multiple alignments of related genomes. Genome Res (2005) 1.17

The computational challenges of applying comparative-based computational methods to whole genomes. Brief Bioinform (2002) 1.17

Specific alignment of structured RNA: stochastic grammars and sequence annealing. Bioinformatics (2008) 1.16

Intraspecies sequence comparisons for annotating genomes. Genome Res (2004) 1.15

SLAM web server for comparative gene finding and alignment. Nucleic Acids Res (2003) 1.15

A dynamic alternative splicing program regulates gene expression during terminal erythropoiesis. Nucleic Acids Res (2014) 1.13

Parametric alignment of Drosophila genomes. PLoS Comput Biol (2006) 1.12

Genome methylation in D. melanogaster is found at specific short motifs and is independent of DNMT2 activity. Genome Res (2014) 1.12

Coverage statistics for sequence census methods. BMC Bioinformatics (2010) 1.07

SHAPE-Seq: High-Throughput RNA Structure Analysis. Curr Protoc Chem Biol (2012) 1.06

Visualization of multiple genome annotations and alignments with the K-BROWSER. Genome Res (2004) 1.05

Accurate identification of novel human genes through simultaneous gene prediction in human, mouse, and rat. Genome Res (2004) 1.03

Comparison of pattern detection methods in microarray time series of the segmentation clock. PLoS One (2008) 0.99

Interpreting the unculturable majority. Nat Methods (2007) 0.95

On the optimality of the neighbor-joining algorithm. Algorithms Mol Biol (2008) 0.92

Phyloepigenomic comparison of great apes reveals a correlation between somatic and germline methylation states. Genome Res (2011) 0.91

Combining statistical alignment and phylogenetic footprinting to detect regulatory elements. Bioinformatics (2008) 0.90

Large multiple organism gene finding by collapsed Gibbs sampling. J Comput Biol (2005) 0.89

MetMap enables genome-scale Methyltyping for determining methylation states in populations. PLoS Comput Biol (2010) 0.89

RNA-Seq and find: entering the RNA deep field. Genome Med (2011) 0.87

Beyond pairwise distances: neighbor-joining with phylogenetic diversity estimates. Mol Biol Evol (2005) 0.84

Updating RNA-Seq analyses after re-annotation. Bioinformatics (2013) 0.84

Combinatorics of least-squares trees. Proc Natl Acad Sci U S A (2008) 0.83

A closer look at RNA editing. Nat Biotechnol (2012) 0.80

Differential dropout among SNP genotypes and impacts on association tests. Hum Hered (2007) 0.80

A genome-wide linkage analysis of alcoholism on microsatellite and single-nucleotide polymorphism data, using alcohol dependence phenotypes and electroencephalogram measures. BMC Genet (2005) 0.79

The cyclohedron test for finding periodic genes in time course expression studies. Stat Appl Genet Mol Biol (2007) 0.78

Toward the human genotope. Bull Math Biol (2007) 0.78

Alternative base calling method for resequencing microarrays. Conf Proc IEEE Eng Med Biol Soc (2005) 0.78

Erratum: Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol (2016) 0.77

Tracing the most parsimonious indel history. J Comput Biol (2011) 0.76

Quantifying uniformity of mapped reads. Bioinformatics (2012) 0.76

Estimating intrinsic and extrinsic noise from single-cell gene expression measurements. Stat Appl Genet Mol Biol (2016) 0.76

Patterns of gene duplication and intron loss in the ENCODE regions suggest a confounding factor. Genomics (2007) 0.75

Exploring the genetic basis of variation in gene predictions with a synthetic association study. PLoS One (2010) 0.75

Determining coding CpG islands by identifying regions significant for pattern statistics on Markov chains. Stat Appl Genet Mol Biol (2011) 0.75

Picking alignments from (Steiner) trees. J Comput Biol (2003) 0.75