Automatic clustering of orthologs and in-paralogs from pairwise species comparisons.

PubWeight™: 16.47 | Rank: Top 0.1% | All-Time Top 10000

PMID: 11743721

Published in J Mol Biol on December 14, 2001

Authors

M Remm (1), C E Storm, E L Sonnhammer

Author Affiliations

1: Center for Genomics and Bioinformatics, Karolinska Institutet, S-17177 Stockholm, Sweden.

Articles citing this

(truncated to the top 100)

The COG database: an updated version includes eukaryotes. BMC Bioinformatics (2003) 60.98

OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res (2003) 33.03

EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res (2008) 12.72

OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Res (2006) 11.43

Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res (2011) 11.32

Inparanoid: a comprehensive database of eukaryotic orthologs. Nucleic Acids Res (2005) 9.90

Assessing performance of orthology detection strategies applied to eukaryotic genomes. PLoS One (2007) 5.62

Arthropod relationships revealed by phylogenomic analysis of nuclear protein-coding sequences. Nature (2010) 5.22

A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes. Genome Biol (2004) 4.94

A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinformatics (2006) 4.82

Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world. Nucleic Acids Res (2008) 4.78

Phylogenetic and functional assessment of orthologs inference projects and methods. PLoS Comput Biol (2009) 4.65

RIO: analyzing proteomes by automated phylogenomics using resampled inference of orthologs. BMC Bioinformatics (2002) 4.49

InParanoid 6: eukaryotic ortholog clusters with inparalogs. Nucleic Acids Res (2007) 4.15

Proteinortho: detection of (co-)orthologs in large-scale analysis. BMC Bioinformatics (2011) 4.15

Predicting disease genes using protein-protein interactions. J Med Genet (2006) 4.10

Widespread genome duplications throughout the history of flowering plants. Genome Res (2006) 4.07

WormBase: new content and better access. Nucleic Acids Res (2006) 4.01

The Pristionchus pacificus genome provides a unique perspective on nematode lifestyle and parasitism. Nat Genet (2008) 3.93

HomoMINT: an inferred human network based on orthology mapping of protein interactions discovered in model organisms. BMC Bioinformatics (2005) 3.75

A first-draft human protein-interaction map. Genome Biol (2004) 3.64

Benchmarking ortholog identification methods using functional genomics data. Genome Biol (2006) 3.54

PlnTFDB: updated content and new features of the plant transcription factor database. Nucleic Acids Res (2009) 3.12

Phylogenetic reconstruction of orthology, paralogy, and conserved synteny for dog and human. PLoS Comput Biol (2006) 3.07

PlnTFDB: an integrative plant transcription factor database. BMC Bioinformatics (2007) 3.01

Experimental analysis of the Arabidopsis mitochondrial proteome highlights signaling and regulatory components, provides assessment of targeting prediction programs, and indicates plant-specific mitochondrial proteins. Plant Cell (2003) 2.96

Deciphering protein kinase specificity through large-scale analysis of yeast phosphorylation site motifs. Sci Signal (2010) 2.95

Rational association of genes with traits using a genome-scale gene network for Arabidopsis thaliana. Nat Biotechnol (2010) 2.90

A phenotypic profile of the Candida albicans regulatory network. PLoS Genet (2009) 2.86

Conservation of core gene expression in vertebrate tissues. J Biol (2009) 2.84

Combining bioinformatics and phylogenetics to identify large sets of single-copy orthologous genes (COSII) for comparative, evolutionary and systematic studies: a test case in the euasterid plant clade. Genetics (2006) 2.76

Genome-wide analysis of Notch signalling in Drosophila by transgenic RNAi. Nature (2009) 2.68

A human MAP kinase interactome. Nat Methods (2010) 2.62

Systematic discovery of nonobvious human disease models through orthologous phenotypes. Proc Natl Acad Sci U S A (2010) 2.61

Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea. Biol Direct (2007) 2.53

Genome-wide analysis of mRNA decay rates and their determinants in Arabidopsis thaliana. Plant Cell (2007) 2.51

Arabidopsis reactome: a foundation knowledgebase for plant systems biology. Plant Cell (2008) 2.50

Systematic identification of functional orthologs based on protein network comparison. Genome Res (2006) 2.47

A predicted interactome for Arabidopsis. Plant Physiol (2007) 2.41

Algorithm of OMA for large-scale orthology inference. BMC Bioinformatics (2008) 2.37

Comparative gene expression analysis by differential clustering approach: application to the Candida albicans transcription program. PLoS Genet (2005) 2.36

HaMStR: profile hidden markov model based search for orthologs in ESTs. BMC Evol Biol (2009) 2.35

Berkeley PHOG: PhyloFacts orthology group prediction web server. Nucleic Acids Res (2009) 2.35

Orthology prediction at scalable resolution by phylogenetic tree analysis. BMC Bioinformatics (2007) 2.29

What properties characterize the hub proteins of the protein-protein interaction network of Saccharomyces cerevisiae? Genome Biol (2006) 2.27

A protein interaction framework for human mRNA degradation. Genome Res (2004) 2.25

Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information. Elife (2014) 2.23

The Candida genome database incorporates multiple Candida species: multispecies search and analysis tools with curated gene and protein information for Candida albicans and Candida glabrata. Nucleic Acids Res (2011) 2.18

Computational methods for Gene Orthology inference. Brief Bioinform (2011) 2.16

Importance of lineage-specific expansion of plant tandem duplicates in the adaptive response to environmental stimuli. Plant Physiol (2008) 2.14

MINE: Module Identification in Networks. BMC Bioinformatics (2011) 2.08

Functional and evolutionary implications of gene orthology. Nat Rev Genet (2013) 2.07

OrthoList: a compendium of C. elegans genes with human orthologs. PLoS One (2011) 2.05

Evolutionary conservation and selection of human disease gene orthologs in the rat and mouse genomes. Genome Biol (2004) 2.05

Improving the specificity of high-throughput ortholog prediction. BMC Bioinformatics (2006) 2.01

Accelerating the reconstruction of genome-scale metabolic networks. BMC Bioinformatics (2006) 1.97

FIGENIX: intelligent automation of genomic annotation: expertise integration in a new software platform. BMC Bioinformatics (2005) 1.97

Duplicated genes evolve slower than singletons despite the initial rate increase. BMC Evol Biol (2004) 1.95

The Aspergillus Genome Database, a curated comparative genomics resource for gene, protein and sequence information for the Aspergillus research community. Nucleic Acids Res (2009) 1.94

PanOCT: automated clustering of orthologs using conserved gene neighborhood for pan-genomic analysis of bacterial strains and closely related species. Nucleic Acids Res (2012) 1.91

Computationally driven, quantitative experiments discover genes required for mitochondrial biogenesis. PLoS Genet (2009) 1.90

Automated protein subfamily identification and classification. PLoS Comput Biol (2007) 1.89

Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits. Nucleic Acids Res (2006) 1.87

Evolution and topology in the yeast protein interaction network. Genome Res (2004) 1.86

ALF--a simulation framework for genome evolution. Mol Biol Evol (2011) 1.86

Birth and death of protein domains: a simple model of evolution explains power law behavior. BMC Evol Biol (2002) 1.86

Ancestral paralogs and pseudoparalogs and their role in the emergence of the eukaryotic cell. Nucleic Acids Res (2005) 1.77

The genome of Eucalyptus grandis. Nature (2014) 1.76

InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic. Nucleic Acids Res (2014) 1.75

Broad network-based predictability of Saccharomyces cerevisiae gene loss-of-function phenotypes. Genome Biol (2007) 1.73

YOGY: a web-based, integrated database to retrieve protein orthologs and associated Gene Ontology terms. Nucleic Acids Res (2006) 1.73

Comparative genomics of the social amoebae Dictyostelium discoideum and Dictyostelium purpureum. Genome Biol (2011) 1.72

Protein abundances are more conserved than mRNA abundances across diverse taxa. Proteomics (2010) 1.71

De novo assembly and validation of planaria transcriptome by massive parallel sequencing and shotgun proteomics. Genome Res (2011) 1.70

Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes. Genome Res (2008) 1.68

mGene: accurate SVM-based gene finding with an application to nematode genomes. Genome Res (2009) 1.67

A low-polynomial algorithm for assembling clusters of orthologous groups from intergenomic symmetric best matches. Bioinformatics (2010) 1.67

A practical approach to phylogenomics: the phylogeny of ray-finned fish (Actinopterygii) as a case study. BMC Evol Biol (2007) 1.66

Whole genome assembly of a natto production strain Bacillus subtilis natto from very short read data. BMC Genomics (2010) 1.66

OrthoMaM: a database of orthologous genomic markers for placental mammal phylogenetics. BMC Evol Biol (2007) 1.65

Predicting genetic modifier loci using functional gene networks. Genome Res (2010) 1.59

ChlamyCyc: an integrative systems biology database and web-portal for Chlamydomonas reinhardtii. BMC Genomics (2009) 1.58

AnoEST: toward A. gambiae functional genomics. Genome Res (2005) 1.58

MIPSPlantsDB--plant database resource for integrative and comparative plant genome research. Nucleic Acids Res (2007) 1.58

Towards complete sets of farnesylated and geranylgeranylated proteins. PLoS Comput Biol (2007) 1.57

Exploring the symbiotic pangenome of the nitrogen-fixing bacterium Sinorhizobium meliloti. BMC Genomics (2011) 1.55

Correlation between sequence conservation and the genomic context after gene duplication. Nucleic Acids Res (2005) 1.54

Inferring mouse gene functions from genomic-scale data using a combined functional network/classification strategy. Genome Biol (2008) 1.54

Global analysis of mRNA stability in the archaeon Sulfolobus. Genome Biol (2006) 1.53

A map of human protein interactions derived from co-expression of human mRNAs and their orthologs. Mol Syst Biol (2008) 1.53

Sequence resources at the Candida Genome Database. Nucleic Acids Res (2006) 1.52

Hierarchical clustering algorithm for comprehensive orthologous-domain classification in multiple genomes. Nucleic Acids Res (2006) 1.51

Global alignment of protein-protein interaction networks by graph matching methods. Bioinformatics (2009) 1.49

Comparative bacterial proteomics: analysis of the core genome concept. PLoS One (2008) 1.49

Mapping metabolic and transcript temporal switches during germination in rice highlights specific transcription factors and the role of RNA instability in the germination process. Plant Physiol (2008) 1.45

Stable evolutionary signal in a yeast protein interaction network. BMC Evol Biol (2006) 1.45

PlantMarkers--a database of predicted molecular markers from plants. Nucleic Acids Res (2005) 1.44

The rough guide to in silico function prediction, or how to use sequence and structure information to predict protein function. PLoS Comput Biol (2008) 1.44

The ANISEED database: digital representation, formalization, and elucidation of a chordate developmental program. Genome Res (2010) 1.43

GreenPhylDB: a database for plant comparative genomics. Nucleic Acids Res (2007) 1.43