The Ensembl automatic gene annotation system.

PubWeight™: 12.24 | Rank: Top 0.1% | All-Time Top 10000

View Article (PMC 479124)

Published in Genome Res on May 01, 2004

Authors

Val Curwen (1), Eduardo Eyras, T Daniel Andrews, Laura Clarke, Emmanuel Mongin, Steven M J Searle, Michele Clamp

Author Affiliations

1: The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

Articles citing this

(truncated to the top 100)

ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res (2010) 43.51

Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics (2005) 24.54

Ensembl 2007. Nucleic Acids Res (2006) 20.10

DNA methylation profiling of human chromosomes 6, 20 and 22. Nat Genet (2006) 18.08

GeneWise and Genomewise. Genome Res (2004) 17.87

Ensembl 2005. Nucleic Acids Res (2005) 15.13

Ensembl 2012. Nucleic Acids Res (2011) 14.55

Ensembl 2013. Nucleic Acids Res (2012) 11.70

Ensembl 2006. Nucleic Acids Res (2006) 11.66

An overview of Ensembl. Genome Res (2004) 10.35

EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biol (2006) 7.06

Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Res (2010) 6.95

Ensembl BioMarts: a hub for data retrieval across taxonomic space. Database (Oxford) (2011) 6.04

The Ensembl analysis pipeline. Genome Res (2004) 5.90

Genome analysis of the platypus reveals unique signatures of evolution. Nature (2008) 5.74

Antagonism of microRNA-122 in mice by systemically administered LNA-antimiR leads to up-regulation of a large set of predicted target mRNAs in the liver. Nucleic Acids Res (2007) 5.67

Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. PLoS Biol (2010) 5.39

Anopheles gambiae immune responses to human and rodent Plasmodium parasite species. PLoS Pathog (2006) 4.67

AutoFACT: an automatic functional annotation and classification tool. BMC Bioinformatics (2005) 4.26

ESTGenes: alternative splicing from ESTs in Ensembl. Genome Res (2004) 3.85

Benchmarking ortholog identification methods using functional genomics data. Genome Biol (2006) 3.54

Orphan CpG islands identify numerous conserved promoters in the mammalian genome. PLoS Genet (2010) 3.38

Analysis of the prostate cancer cell line LNCaP transcriptome using a sequencing-by-synthesis approach. BMC Genomics (2006) 2.98

VectorBase: a home for invertebrate vectors of human pathogens. Nucleic Acids Res (2006) 2.94

Fast and systematic genome-wide discovery of conserved regulatory elements using a non-alignment based approach. Genome Biol (2005) 2.82

Comparative genomic analysis identifies an ADP-ribosylation factor-like gene as the cause of Bardet-Biedl syndrome (BBS3). Am J Hum Genet (2004) 2.67

Erythroid GATA1 function revealed by genome-wide analysis of transcription factor occupancy, histone modifications, and mRNA expression. Genome Res (2009) 2.64

wANNOVAR: annotating genetic variants for personal genomes via the web. J Med Genet (2012) 2.43

A predicted interactome for Arabidopsis. Plant Physiol (2007) 2.41

Genome sequencing and analysis of the Tasmanian devil and its transmissible cancer. Cell (2012) 2.37

A mouse atlas of gene expression: large-scale digital gene-expression profiles from precisely defined developing C57BL/6J mouse tissues and cells. Proc Natl Acad Sci U S A (2005) 2.35

Hymenoptera Genome Database: integrated community resources for insect species of the order Hymenoptera. Nucleic Acids Res (2010) 2.15

JIGSAW, GeneZilla, and GlimmerHMM: puzzling out the features of human genes in the ENCODE regions. Genome Biol (2006) 2.00

What everybody should know about the rat genome and its online resources. Nat Genet (2008) 1.94

PrimerZ: streamlined primer design for promoters, exons and human SNPs. Nucleic Acids Res (2007) 1.91

Incorporating RNA-seq data into the zebrafish Ensembl genebuild. Genome Res (2012) 1.89

A haplome alignment and reference sequence of the highly polymorphic Ciona savignyi genome. Genome Biol (2007) 1.82

Systematic identification of cis-regulatory sequences active in mouse and human embryonic stem cells. PLoS Genet (2007) 1.77

Proteogenomics to discover the full coding content of genomes: a computational perspective. J Proteomics (2010) 1.71

Web Apollo: a web-based genomic annotation editing platform. Genome Biol (2013) 1.69

A versatile computational pipeline for bacterial genome annotation improvement and comparative analysis, with Brucella as a use case. Nucleic Acids Res (2007) 1.68

The transcriptome of equine peripheral blood mononuclear cells. PLoS One (2015) 1.65

OrthoMaM: a database of orthologous genomic markers for placental mammal phylogenetics. BMC Evol Biol (2007) 1.65

Reconstruction of metabolic pathways for the cattle genome. BMC Syst Biol (2009) 1.63

Hair bundles are specialized for ATP delivery via creatine kinase. Neuron (2007) 1.60

SpBase: the sea urchin genome database and web site. Nucleic Acids Res (2008) 1.56

Fast-X on the Z: rapid evolution of sex-linked genes in birds. Genome Res (2007) 1.49

Transcriptome analyses of the human retina identify unprecedented transcript diversity and 3.5 Mb of novel transcribed sequence via significant alternative splicing and novel genes. BMC Genomics (2013) 1.46

Selective integrin endocytosis is driven by interactions between the integrin α-chain and AP2. Nat Struct Mol Biol (2016) 1.41

Dynamic usage of transcription start sites within core promoters. Genome Biol (2006) 1.41

The common marmoset genome provides insight into primate biology and evolution. Nat Genet (2014) 1.38

TICdb: a collection of gene-mapped translocation breakpoints in cancer. BMC Genomics (2007) 1.37

High-coverage sequencing and annotated assemblies of the budgerigar genome. Gigascience (2014) 1.34

The dynein regulatory complex is required for ciliary motility and otolith biogenesis in the inner ear. Nature (2008) 1.31

The Ensembl gene annotation system. Database (Oxford) (2016) 1.31

Functional and evolutionary analysis of alternatively spliced genes is consistent with an early eukaryotic origin of alternative splicing. BMC Evol Biol (2007) 1.30

Detailed analysis of a contiguous 22-Mb region of the maize genome. PLoS Genet (2009) 1.29

Generating and navigating proteome maps using mass spectrometry. Nat Rev Mol Cell Biol (2010) 1.29

Anopheles gambiae genome reannotation through synthesis of ab initio and comparative gene prediction algorithms. Genome Biol (2006) 1.27

Deep analysis of cellular transcriptomes - LongSAGE versus classic MPSS. BMC Genomics (2007) 1.21

The Ensembl regulatory build. Genome Biol (2015) 1.21

Gene finding in the chicken genome. BMC Bioinformatics (2005) 1.20

A genome-wide survey demonstrates widespread non-linear mRNA in expressed sequences from multiple species. Nucleic Acids Res (2005) 1.20

Shotgun proteomics aids discovery of novel protein-coding genes, alternative splicing, and "resurrected" pseudogenes in the mouse genome. Genome Res (2011) 1.20

Iterative gene prediction and pseudogene removal improves genome annotation. Genome Res (2006) 1.20

Molecular evolution of Phox-related regulatory subunits for NADPH oxidase enzymes. BMC Evol Biol (2007) 1.17

The Innate Immune Database (IIDB). BMC Immunol (2008) 1.11

Integrating alternative splicing detection into gene prediction. BMC Bioinformatics (2005) 1.09

Comparison of human (and other) genome browsers. Hum Genomics (2006) 1.09

GETPrime: a gene- or transcript-specific primer database for quantitative real-time PCR. Database (Oxford) (2011) 1.08

Protein coding potential of retroviruses and other transposable elements in vertebrate genomes. Nucleic Acids Res (2005) 1.06

Functional analysis of novel SNPs and mutations in human and mouse genomes. BMC Bioinformatics (2008) 1.06

Evidence-based gene predictions in plant genomes. Genome Res (2009) 1.06

Reproductive and developmental toxicity of dioxin in fish. Mol Cell Endocrinol (2011) 1.05

TriAnnot: A Versatile and High Performance Pipeline for the Automated Annotation of Plant Genomes. Front Plant Sci (2012) 1.04

Identifying significant genetic regulatory networks in the prostate cancer from microarray data based on transcription factor analysis and conditional independency. BMC Med Genomics (2009) 1.02

Ashbya Genome Database 3.0: a cross-species genome and transcriptome browser for yeast biologists. BMC Genomics (2007) 1.01

Comparative gene finding in chicken indicates that we are closing in on the set of multi-exonic widely expressed human genes. Nucleic Acids Res (2005) 1.01

fREDUCE: detection of degenerate regulatory elements using correlation with expression. BMC Bioinformatics (2007) 1.01

Analysis of in situ pre-mRNA targets of human splicing factor SF1 reveals a function in alternative splicing. Nucleic Acids Res (2010) 0.99

Refinement of primate copy number variation hotspots identifies candidate genomic regions evolving under positive selection. Genome Biol (2011) 0.99

The UniTrap resource: tools for the biologist enabling optimized use of gene trap clones. Nucleic Acids Res (2007) 0.98

Non-coding sequence retrieval system for comparative genomic analysis of gene regulatory elements. BMC Bioinformatics (2007) 0.97

Long-read sequencing of chicken transcripts and identification of new transcript isoforms. PLoS One (2014) 0.95

Genome and proteome annotation: organization, interpretation and integration. J R Soc Interface (2009) 0.95

Serious limitations of the QTL/microarray approach for QTL gene discovery. BMC Biol (2010) 0.95

Unexpected observations after mapping LongSAGE tags to the human genome. BMC Bioinformatics (2007) 0.95

The draft genome sequence of the ferret (Mustela putorius furo) facilitates study of human respiratory disease. Nat Biotechnol (2014) 0.94

A novel view of the transcriptome revealed from gene trapping in mouse embryonic stem cells. Genome Res (2007) 0.93

Differentiated evolutionary rates in alternative exons and the implications for splicing regulation. BMC Evol Biol (2006) 0.93

Pilot Anopheles gambiae full-length cDNA study: sequencing and initial characterization of 35,575 clones. Genome Biol (2005) 0.93

A cross-species alignment tool (CAT). BMC Bioinformatics (2007) 0.93

Mouse lens connexin23 (Gje1) does not form functional gap junction channels but causes enhanced ATP release from HeLa cells. Eur J Cell Biol (2008) 0.93

Experimental-confirmation and functional-annotation of predicted proteins in the chicken genome. BMC Genomics (2007) 0.92

Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation. BMC Bioinformatics (2006) 0.91

Comparative analyses of human single- and multilocus tandem repeats. Genetics (2008) 0.91

Integrating multiple genome annotation databases improves the interpretation of microarray gene expression data. BMC Genomics (2010) 0.91

Bovine Genome Database: integrated tools for genome annotation and discovery. Nucleic Acids Res (2010) 0.90

Systematic identification of pseudogenes through whole genome expression evidence profiling. Nucleic Acids Res (2006) 0.90

Revisiting the missing protein-coding gene catalog of the domestic dog. BMC Genomics (2009) 0.89

Articles cited by this

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

Basic local alignment search tool. J Mol Biol (1990) 659.07

Initial sequencing and analysis of the human genome. Nature (2001) 212.86

tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res (1997) 142.55

BLAT--the BLAST-like alignment tool. Genome Res (2002) 126.78

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

Prediction of complete gene structures in human genomic DNA. J Mol Biol (1997) 58.76

The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res (2003) 52.80

dbEST--database for "expressed sequence tags". Nat Genet (1993) 41.10

The UCSC Genome Browser Database. Nucleic Acids Res (2003) 32.84

Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature (2002) 28.79

The genome sequence of the malaria mosquito Anopheles gambiae. Science (2002) 20.36

Introducing RefSeq and LocusLink: curated human genome resources at the NCBI. Trends Genet (2000) 19.33

GeneWise and Genomewise. Genome Res (2004) 17.87

EnsMart: a generic system for fast and flexible access to biological data. Genome Res (2004) 17.64

EST_GENOME: a program to align spliced DNA sequences to unspliced genomic DNA. Comput Appl Biosci (1997) 17.51

Analysis of compositionally biased regions in sequence databases. Methods Enzymol (1996) 17.11

Evaluation of gene structure prediction programs. Genomics (1996) 8.57

Computational detection and location of transcription start sites in mammalian genomic DNA. Genome Res (2002) 7.89

The DNA sequence and comparative analysis of human chromosome 20. Nature (2002) 7.40

The Ensembl core software libraries. Genome Res (2004) 7.30

The Ensembl analysis pipeline. Genome Res (2004) 5.90

eVOC: a controlled vocabulary for unifying gene expression data. Genome Res (2003) 5.41

WormBase: a cross-species database for comparative genomics. Nucleic Acids Res (2003) 5.28

The DNA sequence and analysis of human chromosome 6. Nature (2003) 4.75

An insect molecular clock dates the origin of the insects and accords with palaeontological and biogeographic landmarks. Mol Biol Evol (2002) 4.22

ESTGenes: alternative splicing from ESTs in Ensembl. Genome Res (2004) 3.85

The DNA sequence of human chromosome 7. Nature (2003) 3.18

The Ensembl computing architecture. Genome Res (2004) 3.06

The Anopheles gambiae genome: an update. Trends Parasitol (2004) 2.10

The DNA sequence and analysis of human chromosome 14. Nature (2003) 2.02

The EMBL Nucleotide Sequence Database. Nucleic Acids Res (1997) 1.71

Identification of human gene structure using linear discriminant functions and dynamic programming. Proc Int Conf Intell Syst Mol Biol (1995) 1.49

Articles by these authors

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (2007) 75.09

Global variation in copy number in the human genome. Nature (2006) 57.50

Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature (2004) 24.40

Origins and functional impact of copy number variation in the human genome. Nature (2009) 23.63

Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature (2005) 23.04

The genome sequence of the malaria mosquito Anopheles gambiae. Science (2002) 20.36

GeneWise and Genomewise. Genome Res (2004) 17.87

A high-resolution survey of deletion polymorphism in the human genome. Nat Genet (2005) 16.99

Evolutionary and biomedical insights from the rhesus macaque genome. Science (2007) 16.21

The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics. PLoS Biol (2003) 13.32

Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls. Nature (2010) 12.27

An overview of Ensembl. Genome Res (2004) 10.35

A high-resolution map of human evolutionary constraint using 29 mammals. Nature (2011) 8.67

The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science (2009) 8.23

Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature (2007) 7.91

The Ensembl core software libraries. Genome Res (2004) 7.30

Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS Biol (2004) 7.17

EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biol (2006) 7.06

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res (2007) 7.05

The DNA sequence of the human X chromosome. Nature (2005) 6.97

The Ensembl analysis pipeline. Genome Res (2004) 5.90

Genome analysis of the platypus reveals unique signatures of evolution. Nature (2008) 5.74

Initial sequence and comparative analysis of the cat genome. Genome Res (2007) 4.67

Anopheles gambiae immune responses to human and rodent Plasmodium parasite species. PLoS Pathog (2006) 4.67

The functional spectrum of low-frequency coding variation. Genome Biol (2011) 4.42

An initial strategy for the systematic identification of functional elements in the human genome by low-redundancy comparative sequencing. Proc Natl Acad Sci U S A (2005) 4.38

Identifying novel constrained elements by exploiting biased substitution patterns. Bioinformatics (2009) 4.30

The otter annotation system. Genome Res (2004) 3.88

ESTGenes: alternative splicing from ESTs in Ensembl. Genome Res (2004) 3.85

Breaking the waves: improved detection of copy number variation from microarray-based comparative genomic hybridization. Genome Biol (2007) 3.54

Control of alternative splicing through siRNA-mediated transcriptional gene silencing. Nat Struct Mol Biol (2009) 3.47

Biopipe: a flexible framework for protocol-based bioinformatics analysis. Genome Res (2003) 3.45

Databases and tools for browsing genomes. Annu Rev Genomics Hum Genet (2002) 3.40

Molecular evolution and functional characterization of Drosophila insulin-like peptides. PLoS Genet (2010) 2.80

ASTD: The Alternative Splicing and Transcript Diversity database. Genomics (2008) 2.72

Three periods of regulatory innovation during vertebrate evolution. Science (2011) 2.09

Regulation of vertebrate nervous system alternative splicing and development by an SR-related protein. Cell (2009) 1.98

Large-scale comparative analysis of splicing signals and their corresponding splicing factors in eukaryotes. Genome Res (2007) 1.86

DGCR8 HITS-CLIP reveals novel functions for the Microprocessor. Nat Struct Mol Biol (2012) 1.81

Genome annotation techniques: new approaches and challenges. Drug Discov Today (2002) 1.65

Conserved site for neurosteroid modulation of GABA A receptors. Neuropharmacology (2008) 1.56

High-grade glioma formation results from postnatal pten loss or mutant epidermal growth factor receptor expression in a transgenic mouse glioma model. Cancer Res (2006) 1.42

SNPServer: a real-time SNP discovery tool. Nucleic Acids Res (2005) 1.42

Ventral tegmental area BDNF induces an opiate-dependent-like reward state in naive rats. Science (2009) 1.40

Coordinated international action to accelerate genome-to-phenome with FAANG, the Functional Annotation of Animal Genomes project. Genome Biol (2015) 1.40

Key contribution of CPEB4-mediated translational control to cancer progression. Nat Med (2011) 1.37

Nucleosome-driven transcription factor binding and gene regulation. Mol Cell (2012) 1.34

Genome-wide dFOXO targets and topology of the transcriptomic response to stress and insulin signalling. Mol Syst Biol (2011) 1.32

Genome-wide association between branch point properties and alternative splicing. PLoS Comput Biol (2010) 1.24

Deciphering 3'ss selection in the yeast genome reveals an RNA thermosensor that mediates alternative splicing. Mol Cell (2011) 1.22

Methods to study splicing from high-throughput RNA sequencing data. Methods Mol Biol (2014) 1.17

Pyicos: a versatile toolkit for the analysis of high-throughput sequencing data. Bioinformatics (2011) 1.13

B cell survival, surface BCR and BAFFR expression, CD74 metabolism, and CD8- dendritic cells require the intramembrane endopeptidase SPPL2A. J Exp Med (2012) 1.10

The pivotal roles of TIA proteins in 5' splice-site selection of alu exons and across evolution. PLoS Genet (2009) 1.07

The Microprocessor controls the activity of mammalian retrotransposons. Nat Struct Mol Biol (2013) 1.07

Barriers to HPV immunization for African American adolescent females. Vaccine (2012) 1.05

Direct cloning of double-stranded RNAs from RNase protection analysis reveals processing patterns of C/D box snoRNAs and provides evidence for widespread antisense transcript expression. Nucleic Acids Res (2011) 1.01

Comparative gene finding in chicken indicates that we are closing in on the set of multi-exonic widely expressed human genes. Nucleic Acids Res (2005) 1.01

Clonal neural stem cells from human embryonic stem cell colonies. J Neurosci (2012) 1.00

Exon creation and establishment in human genes. Genome Biol (2008) 0.98

Structural basis for the biological relevance of the invariant apical stem in IRES-mediated translation. Nucleic Acids Res (2011) 0.98

Co-evolution of the branch site and SR proteins in eukaryotes. Trends Genet (2008) 0.98

Hog1 bypasses stress-mediated down-regulation of transcription by RNA polymerase II redistribution and chromatin remodeling. Genome Biol (2012) 0.96

Differentiated evolutionary rates in alternative exons and the implications for splicing regulation. BMC Evol Biol (2006) 0.93

Fetal alcohol exposure leads to abnormal olfactory bulb development and impaired odor discrimination in adult mice. Mol Brain (2011) 0.92

BASC: an integrated bioinformatics system for Brassica research. Nucleic Acids Res (2006) 0.91

Rasgrp1 mutation increases naive T-cell CD44 expression and drives mTOR-dependent accumulation of Helios⁺ T cells and autoantibodies. Elife (2013) 0.90

The translational landscape of the splicing factor SRSF1 and its role in mitosis. Elife (2014) 0.89

Islands of euchromatin-like sequence and expressed polymorphic sequences within the short arm of human chromosome 21. Genome Res (2007) 0.89

RNA secondary structure mediates alternative 3'ss selection in Saccharomyces cerevisiae. RNA (2012) 0.88

The RNA-binding protein hnRNPLL induces a T cell alternative splicing program delineated by differential intron retention in polyadenylated RNA. Genome Biol (2014) 0.87

The adult retinal stem cell is a rare cell in the ciliary epithelium whose progeny can differentiate into photoreceptors. Biol Open (2012) 0.87

The 5' untranslated region of the serotonin receptor 2C pre-mRNA generates miRNAs and is expressed in non-neuronal cells. Exp Brain Res (2013) 0.86

Biological database design and implementation. Brief Bioinform (2004) 0.83

Evaluation of the chicken transcriptome by SAGE of B cells and the DT40 cell line. BMC Genomics (2004) 0.83

Zinc-finger protein ZFP318 is essential for expression of IgD, the alternatively spliced Igh product made by mature B lymphocytes. Proc Natl Acad Sci U S A (2014) 0.83

Relationship between genome and epigenome--challenges and requirements for future research. BMC Genomics (2014) 0.82

Predictive models of gene regulation from high-throughput epigenomics data. Comp Funct Genomics (2012) 0.81

Identification of a pathogenic variant in TREX1 in early-onset cerebral systemic lupus erythematosus by Whole-exome sequencing. Arthritis Rheumatol (2014) 0.80

Use of ChIP-Seq data for the design of a multiple promoter-alignment method. Nucleic Acids Res (2012) 0.79

Bone morphogenetic proteins and secreted frizzled related protein 2 maintain the quiescence of adult mammalian retinal stem cells. Stem Cells (2013) 0.78

Approaches to link RNA secondary structures with splicing regulation. Methods Mol Biol (2014) 0.76

Databases and resources for human small non-coding RNAs. Hum Genomics (2011) 0.76

Hydroxyethyl starches should not be used in critically ill patients. J R Army Med Corps (2008) 0.75

Drosha Regulates Gene Expression Independently of RNA Cleavage Function. Cell Rep (2014) 0.75