Association of genes to genetically inherited diseases using data mining.

PubWeight™: 5.78 | Rank: Top 1%

PMID 12006977

Published in Nat Genet on May 13, 2002

Authors

Carolina Perez-Iratxeta (1), Peer Bork, Miguel A Andrade

Author Affiliations

1: European Molecular Biology Laboratory, Meyerhofstr.1, Heidelberg 69012, Germany.

Articles citing this

(truncated to the top 100)

The Gene Ontology (GO) project in 2006. Nucleic Acids Res (2006) 13.79

The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet (2008) 8.78

Identifying relationships among genomic disease regions: predicting genes at pathogenic SNP associations and rare deletions. PLoS Genet (2009) 8.39

Walking the interactome for prioritization of candidate disease genes. Am J Hum Genet (2008) 7.03

Genetic variation in an individual human exome. PLoS Genet (2008) 6.68

Network-based global inference of human disease genes. Mol Syst Biol (2008) 5.01

Speeding disease gene discovery by sequence based candidate prioritization. BMC Bioinformatics (2005) 2.99

POCUS: mining genomic sequence annotation to predict disease genes. Genome Biol (2003) 2.93

Computational tools for prioritizing candidate genes: boosting disease gene discovery. Nat Rev Genet (2012) 2.92

Philosophy of science. Machine science. Science (2010) 2.91

Genome-wide identification of genes likely to be involved in human genetic disease. Nucleic Acids Res (2004) 2.86

OntoBlast function: From sequence similarities directly to potential functional annotations by ontology terms. Nucleic Acids Res (2003) 2.57

Integration of text- and data-mining using ontologies successfully selects disease gene candidates. Nucleic Acids Res (2005) 2.46

Genome-wide prioritization of disease genes and identification of disease-disease associations from an integrated human functional linkage network. Genome Biol (2009) 2.43

PASBio: predicate-argument structures for event extraction in molecular biology. BMC Bioinformatics (2004) 2.41

Automatic extraction of mutations from Medline and cross-validation with OMIM. Nucleic Acids Res (2004) 2.40

Identifying gene-disease associations using centrality on a literature mined gene-interaction network. Bioinformatics (2008) 2.32

Analysis of protein sequence and interaction data for candidate disease gene prediction. Nucleic Acids Res (2006) 2.30

Systematic association of genes to phenotypes by genome and literature mining. PLoS Biol (2005) 2.25

Anni 2.0: a multipurpose text-mining tool for the life sciences. Genome Biol (2008) 2.04

Towards precise classification of cancers based on robust gene functional expression profiles. BMC Bioinformatics (2005) 2.01

Computational disease gene identification: a concert of methods prioritizes type 2 diabetes and obesity candidate genes. Nucleic Acids Res (2006) 1.95

Building disease-specific drug-protein connectivity maps from molecular interaction networks and PubMed abstracts. PLoS Comput Biol (2009) 1.90

G2D: a tool for mining genes associated with disease. BMC Genet (2005) 1.85

Prediction of human disease genes by human-mouse conserved coexpression analysis. PLoS Comput Biol (2008) 1.81

An integrated approach to inferring gene-disease associations in humans. Proteins (2008) 1.80

FunSimMat: a comprehensive functional similarity database. Nucleic Acids Res (2007) 1.75

TXTGate: profiling gene groups with text-based information. Genome Biol (2004) 1.72

Text-mining solutions for biomedical research: enabling integrative biology. Nat Rev Genet (2012) 1.68

Ontologies in quantitative biology: a basis for comparison, integration, and discovery. PLoS Biol (2010) 1.62

Computational approaches to phenotyping: high-throughput phenomics. Proc Am Thorac Soc (2007) 1.57

A pathway-based view of human diseases and disease relationships. PLoS One (2009) 1.52

Candidate gene identification approach: progress and challenges. Int J Biol Sci (2007) 1.51

Phenotypic information in genomic variant databases enhances clinical care and research: the International Standards for Cytogenomic Arrays Consortium experience. Hum Mutat (2012) 1.51

Update of the G2D tool for prioritization of gene candidates to inherited diseases. Nucleic Acids Res (2007) 1.42

Improving disease gene prioritization using the semantic similarity of Gene Ontology terms. Bioinformatics (2010) 1.37

Prioritization of disease microRNAs through a human phenome-microRNAome network. BMC Syst Biol (2010) 1.37

Evaluation and integration of 49 genome-wide experiments and the prediction of previously unknown obesity-related genes. Bioinformatics (2007) 1.34

Uncover disease genes by maximizing information flow in the phenome-interactome network. Bioinformatics (2011) 1.32

Phenotype ontologies for mouse and man: bridging the semantic gap. Dis Model Mech (2010) 1.28

GLAD4U: deriving and prioritizing gene lists from PubMed literature. BMC Genomics (2012) 1.17

Target SNP selection in complex disease association studies. BMC Bioinformatics (2004) 1.16

Linking genes to diseases: it's all in the data. Genome Med (2009) 1.14

Network-based Identification of novel cancer genes. Mol Cell Proteomics (2009) 1.12

Hypothesis-driven candidate gene association studies: practical design and analytical considerations. Am J Epidemiol (2009) 1.12

New methods for finding disease-susceptibility genes: impact and potential. Genome Biol (2003) 1.11

WBSMDA: Within and Between Score for MiRNA-Disease Association prediction. Sci Rep (2016) 1.11

Complex diseases, complex genes: keeping pathways on the right track. Epidemiology (2009) 1.10

In silico gene prioritization by integrating multiple data sources. PLoS One (2011) 1.10

Genestrace: phenomic knowledge discovery via structured terminology. Pac Symp Biocomput (2005) 1.09

Dragon TF Association Miner: a system for exploring transcription factor associations through text-mining. Nucleic Acids Res (2004) 1.09

Commonality of functional annotation: a method for prioritization of candidate genes from genome-wide linkage studies. Nucleic Acids Res (2008) 1.07

Dragon Plant Biology Explorer. A text-mining tool for integrating associations between genetic and biochemical entities with genome annotation and biochemical terms lists. Plant Physiol (2005) 1.06

Approaches for recognizing disease genes based on network. Biomed Res Int (2014) 1.06

Chapter 15: disease gene prioritization. PLoS Comput Biol (2013) 1.05

Gendoo: functional profiling of gene and disease features using MeSH vocabulary. Nucleic Acids Res (2009) 1.01

ProDiGe: Prioritization Of Disease Genes with multitask machine learning from positive and unlabeled examples. BMC Bioinformatics (2011) 1.00

Evaluation of genome-wide association study results through development of ontology fingerprints. Bioinformatics (2009) 0.99

Exploiting protein-protein interaction networks for genome-wide disease-gene prioritization. PLoS One (2012) 0.98

Integrating multiple protein-protein interaction networks to prioritize disease genes: a Bayesian regression approach. BMC Bioinformatics (2011) 0.96

A machine learning approach for genome-wide prediction of morbid and druggable human genes based on systems-level data. BMC Genomics (2010) 0.95

The Autoimmune Disease Database: a dynamically compiled literature-derived database. BMC Bioinformatics (2006) 0.94

Data-Driven Approach To Determine Popular Proteins for Targeted Proteomics Translation of Six Organ Systems. J Proteome Res (2016) 0.93

Walking the interactome for candidate prioritization in exome sequencing studies of Mendelian diseases. Bioinformatics (2014) 0.93

Inferring higher functional information for RIKEN mouse full-length cDNA clones with FACTS. Genome Res (2003) 0.92

Integration of multiple data sources to prioritize candidate genes using discounted rating system. BMC Bioinformatics (2010) 0.90

Disease gene identification by random walk on multigraphs merging heterogeneous genomic and phenotype data. BMC Genomics (2012) 0.89

Comparison of automated candidate gene prediction systems using genes implicated in type 2 diabetes by genome-wide association studies. BMC Bioinformatics (2009) 0.89

A survey of data mining methods for linkage disequilibrium mapping. Hum Genomics (2006) 0.88

Integrating human omics data to prioritize candidate genes. BMC Med Genomics (2013) 0.88

Retracted: Candidate gene prioritization. Mol Genet Genomics (2012) 0.88

Transactional database transformation and its application in prioritizing human disease genes. IEEE/ACM Trans Comput Biol Bioinform (2011) 0.87

Pathway and network approaches for identification of cancer signature markers from omics data. J Cancer (2015) 0.86

Candidate gene prioritization based on spatially mapped gene expression: an application to XLMR. Bioinformatics (2010) 0.85

The laboratory-clinician team: a professional call to action to improve communication and collaboration for optimal patient care in chromosomal microarray testing. J Genet Couns (2012) 0.85

Identification of highly related references about gene-disease association. BMC Bioinformatics (2014) 0.85

Prioritization of candidate disease genes by topological similarity between disease and protein diffusion profiles. BMC Bioinformatics (2013) 0.85

Text mining in cancer gene and pathway prioritization. Cancer Inform (2014) 0.85

Novel analytical methods applied to type 1 diabetes genome-scan data. Am J Hum Genet (2004) 0.85

Bioinformatics methods for identifying candidate disease genes. Hum Genomics (2006) 0.84

Ensemble positive unlabeled learning for disease gene identification. PLoS One (2014) 0.84

Genetic region characterization (Gene RECQuest) - software to assist in identification and selection of candidate genes from genomic regions. BMC Res Notes (2009) 0.83

Mining breast cancer genes with a network based noise-tolerant approach. BMC Syst Biol (2013) 0.82

BICEPP: an example-based statistical text mining method for predicting the binary characteristics of drugs. BMC Bioinformatics (2011) 0.82

Gene-disease relationship discovery based on model-driven data integration and database view definition. Bioinformatics (2008) 0.82

DomainRBF: a Bayesian regression approach to the prioritization of candidate domains for complex diseases. BMC Syst Biol (2011) 0.82

Gene prioritization of resistant rice gene against Xanthomas oryzae pv. oryzae by using text mining technologies. Biomed Res Int (2013) 0.81

An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods. Artif Intell Med (2014) 0.81

A web tool for finding gene candidates associated with experimentally induced arthritis in the rat. Arthritis Res Ther (2005) 0.80

Identification and analysis of co-occurrence networks with NetCutter. PLoS One (2008) 0.80

Inference of gene-phenotype associations via protein-protein interaction and orthology. PLoS One (2013) 0.79

A literature search tool for intelligent extraction of disease-associated genes. J Am Med Inform Assoc (2013) 0.79

Application of a new probabilistic model for mining implicit associated cancer genes from OMIM and medline. Cancer Inform (2007) 0.79

Prediction of candidate primary immunodeficiency disease genes using a support vector machine learning approach. DNA Res (2009) 0.78

A framework for comparing phenotype annotations of orthologous genes. Stud Health Technol Inform (2010) 0.78

Identification of the causative gene for Simmental arachnomelia syndrome using a network-based disease gene prioritization approach. PLoS One (2013) 0.77

In silico prioritisation of candidate genes for prokaryotic gene function discovery: an application of phylogenetic profiles. BMC Bioinformatics (2009) 0.77

Pinpointing disease genes through phenomic and genomic data fusion. BMC Genomics (2015) 0.77

Systematic Analysis of Integrated Gene Functional Network of Four Chronic Stress-related Lifestyle Disorders. Genome Integr (2015) 0.76

Systematic enrichment analysis of microRNA expression profiling studies in endometriosis. Iran J Basic Med Sci (2015) 0.76

Articles by these authors

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

A method and server for predicting damaging missense mutations. Nat Methods (2010) 78.53

Human non-synonymous SNPs: server and survey. Nucleic Acids Res (2002) 50.45

Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature (2002) 45.19

A human gut microbial gene catalogue established by metagenomic sequencing. Nature (2010) 43.63

Comparative metagenomics of microbial communities. Science (2005) 25.88

InterPro: the integrative protein signature database. Nucleic Acids Res (2008) 25.07

Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res (2002) 25.06

The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res (2003) 24.72

Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature (2004) 24.40

Enterotypes of the human gut microbiome. Nature (2011) 24.36

Comparative assessment of large-scale data sets of protein-protein interactions. Nature (2002) 24.25

Proteome survey reveals modularity of the yeast cell machinery. Nature (2006) 20.77

STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res (2008) 20.62

The genome sequence of the malaria mosquito Anopheles gambiae. Science (2002) 20.36

SMART 4.0: towards genomic data integration. Nucleic Acids Res (2004) 19.37

The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res (2010) 18.73

STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res (2012) 18.26

InterPro, progress and status in 2005. Nucleic Acids Res (2005) 17.53

SMART 5: domains in the context of genomes and networks. Nucleic Acids Res (2006) 17.13

The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data. Nat Biotechnol (2004) 16.08

Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics (2006) 14.96

Toward automatic reconstruction of a highly resolved tree of life. Science (2006) 14.96

InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res (2011) 13.45

New developments in the InterPro database. Nucleic Acids Res (2007) 12.49

STRING 7--recent developments in the integration and prediction of protein interactions. Nucleic Acids Res (2006) 12.16

Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res (2011) 10.82

STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res (2005) 10.44

SMART 6: recent updates and new developments. Nucleic Acids Res (2008) 9.80

STRING: a database of predicted functional associations between proteins. Nucleic Acids Res (2003) 9.45

Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science (2002) 9.43

Drug target identification using side-effect similarity. Science (2008) 9.24

A mitochondria-K+ channel axis is suppressed in cancer and its normalization promotes apoptosis and inhibits cancer growth. Cancer Cell (2007) 9.20

SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res (2011) 9.15

Bioinformatics in the post-sequence era. Nat Genet (2003) 8.83

mRNA degradation by miRNAs and GW182 requires both CCR4:NOT deadenylase and DCP1:DCP2 decapping complexes. Genes Dev (2006) 8.78

PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res (2006) 8.36

Literature mining for the biologist: from information retrieval to biological discovery. Nat Rev Genet (2006) 8.23

Protein disorder prediction: implications for structural proteomics. Structure (2003) 7.93

Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications. Nat Biotechnol (2011) 7.53

Alternative splicing and genome complexity. Nat Genet (2001) 7.30

The genome sequence of Bifidobacterium longum reflects its adaptation to the human gastrointestinal tract. Proc Natl Acad Sci U S A (2002) 7.21

Systematic discovery of in vivo phosphorylation networks. Cell (2007) 6.94

Richness of human gut microbiome correlates with metabolic markers. Nature (2013) 6.93

ELM server: A new resource for investigating short functional sites in modular eukaryotic proteins. Nucleic Acids Res (2003) 6.86

A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol (2010) 6.75

The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature (2008) 6.69

The ecoresponsive genome of Daphnia pulex. Science (2011) 6.55

The genome of the model beetle and pest Tribolium castaneum. Nature (2008) 6.50

Functional and evolutionary insights from the genomes of three parasitoid Nasonia species. Science (2010) 5.56

Immunity-related genes and gene families in Anopheles gambiae. Science (2002) 5.47

Dynamic complex formation during the yeast cell cycle. Science (2005) 5.11

STITCH: interaction networks of chemicals and proteins. Nucleic Acids Res (2007) 4.88

Genomes in flux: the evolution of archaeal and proteobacterial gene content. Genome Res (2002) 4.85

eggNOG: automated construction and annotation of orthologous groups of genes. Nucleic Acids Res (2007) 4.84

Transcriptome complexity in a genome-reduced bacterium. Science (2009) 4.64

Update on XplorMed: A web server for exploring scientific literature. Nucleic Acids Res (2003) 4.42

Genomic variation landscape of the human gut microbiome. Nature (2012) 4.38

Genome-wide experimental determination of barriers to horizontal gene transfer. Science (2007) 4.37

A genome-wide survey of human pseudogenes. Genome Res (2003) 4.34

KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Res (2008) 4.14

iPath: interactive exploration of biochemical pathways and networks. Trends Biochem Sci (2008) 4.03

Proteome organization in a genome-reduced bacterium. Science (2009) 3.97

Molecular eco-systems biology: towards an understanding of community function. Nat Rev Microbiol (2008) 3.95

eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Res (2011) 3.94

Target-specific requirements for enhancers of decapping in miRNA-mediated gene silencing. Genes Dev (2007) 3.92

SuperTarget and Matador: resources for exploring drug-target relationships. Nucleic Acids Res (2007) 3.82

eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Res (2013) 3.77

Linear motif atlas for phosphorylation-dependent signaling. Sci Signal (2008) 3.77

Medusa: a simple tool for interaction graph analysis. Bioinformatics (2005) 3.74

Prediction of effective genome size in metagenomic samples. Genome Biol (2007) 3.73

Nonsense-mediated mRNA decay in Drosophila: at the intersection of the yeast and mammalian pathways. EMBO J (2003) 3.68

Systematic identification of novel protein domain families associated with nuclear functions. Genome Res (2002) 3.50

SmashCommunity: a metagenomic annotation and analysis tool. Bioinformatics (2010) 3.48

Function prediction and protein networks. Curr Opin Cell Biol (2003) 3.46

Impact of genome reduction on bacterial metabolism and its regulation. Science (2009) 3.45

Extraction of regulatory gene/protein networks from Medline. Bioinformatics (2005) 3.43

A temporal map of transcription factor activity: mef2 directly regulates target genes at all stages of muscle development. Dev Cell (2006) 3.29

Co-evolution of transcriptional and post-translational cell-cycle regulation. Nature (2006) 3.28

The DNA sequence of human chromosome 7. Nature (2003) 3.18

Get the most out of your metagenome: computational analysis of environmental sequence data. Curr Opin Microbiol (2007) 3.15

Accurate and universal delineation of prokaryotic species. Nat Methods (2013) 3.14

STITCH 2: an interaction network database for small molecules and proteins. Nucleic Acids Res (2009) 3.06

Environments shape the nucleotide composition of genomes. EMBO Rep (2005) 2.96

Structure-based assembly of protein complexes in yeast. Science (2004) 2.89

Quantifying environmental adaptation of metabolic pathways in metagenomics. Proc Natl Acad Sci U S A (2009) 2.89

Comparison of computational methods for the identification of cell cycle-regulated genes. Bioinformatics (2004) 2.83

Analysis of genomic context: prediction of functional associations from conserved bidirectionally transcribed gene pairs. Nat Biotechnol (2004) 2.83

SHOT: a web server for the construction of genome phylogenies. Trends Genet (2002) 2.79

ASTD: The Alternative Splicing and Transcript Diversity database. Genomics (2008) 2.72

Worldwide scientific publishing activity. Science (2002) 2.72

A holistic approach to marine eco-systems biology. PLoS Biol (2011) 2.71

The identification of a conserved domain in both spartin and spastin, mutated in hereditary spastic paraplegia. Genomics (2003) 2.69

NetworKIN: a resource for exploring cellular phosphorylation networks. Nucleic Acids Res (2007) 2.69

Evaluation of annotation strategies using an entire genome sequence. Bioinformatics (2003) 2.66

InterPro: an integrated documentation resource for protein families, domains and functional sites. Brief Bioinform (2002) 2.66

Information extraction from full text scientific articles: where are the keywords? BMC Bioinformatics (2003) 2.64