A new disease-specific machine learning approach for the prediction of cancer-causing missense variants.

PubWeight™: 1.27 | Rank: Top 10%

Article: PMC 3371640

Published in Genomics on July 07, 2011

Authors

Emidio Capriotti (1), Russ B Altman

Author Affiliations

1: Department of Bioengineering, Stanford University, Stanford, CA 94305, USA. emidio@stanford.edu

Articles citing this

Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models. Hum Mutat (2012) 3.60

Predicting the functional consequences of cancer-associated amino acid substitutions. Bioinformatics (2013) 2.12

Identifying Mendelian disease genes with the variant effect scoring tool. BMC Genomics (2013) 1.64

Computational approaches to identify functional genetic variants in cancer genomes. Nat Methods (2013) 1.64

UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics (2014) 1.56

Bioinformatics for personal genome interpretation. Brief Bioinform (2012) 1.51

A Molecular Evolutionary Reference for the Human Variome. Mol Biol Evol (2015) 1.40

Collective judgment predicts disease-associated single nucleotide variants. BMC Genomics (2013) 1.20

WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation. BMC Genomics (2013) 1.14

Use of long term molecular dynamics simulation in predicting cancer associated SNPs. PLoS Comput Biol (2014) 1.05

Bioinformatics and variability in drug response: a protein structural perspective. J R Soc Interface (2012) 1.01

Identifying Highly Penetrant Disease Causal Mutations Using Next Generation Sequencing: Guide to Whole Process. Biomed Res Int (2015) 0.93

Assessing the Pathogenicity of Insertion and Deletion Variants with the Variant Effect Scoring Tool (VEST-Indel). Hum Mutat (2015) 0.85

Computational methods and resources for the interpretation of genomic variants in cancer. BMC Genomics (2015) 0.85

Predicting cancer-associated germline variations in proteins. BMC Genomics (2012) 0.81

In-silico screening of cancer associated mutation on PLK1 protein and its structural consequences. J Mol Model (2013) 0.80

ContrastRank: a new method for ranking putative cancer driver genes and classification of tumor samples. Bioinformatics (2014) 0.79

Interaction-based discovery of functionally important genes in cancers. Nucleic Acids Res (2013) 0.79

An NGS Workflow Blueprint for DNA Sequencing Data and Its Application in Individualized Molecular Oncology. Cancer Inform (2016) 0.78

Blind Prediction of Deleterious Amino Acid Variations with SNPs&GO. Hum Mutat (2017) 0.77

Status quo of annotation of human disease variants. BMC Bioinformatics (2013) 0.76

Knowledge discovery in variant databases using inductive logic programming. Bioinform Biol Insights (2013) 0.75

Oncodomains: A protein domain-centric framework for analyzing rare variants in tumor samples. PLoS Comput Biol (2017) 0.75

Articles cited by this

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet (2000) 336.52

Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature (2007) 144.95

A map of human genome variation from population-scale sequencing. Nature (2010) 121.13

A second generation human haplotype map of over 3.1 million SNPs. Nature (2007) 85.39

dbSNP: the NCBI database of genetic variation. Nucleic Acids Res (2001) 76.97

The consensus coding sequences of human breast and colorectal cancers. Science (2006) 60.02

The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res (2003) 52.80

SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res (2003) 52.26

An integrated genomic analysis of human glioblastoma multiforme. Science (2008) 51.36

Human non-synonymous SNPs: server and survey. Nucleic Acids Res (2002) 50.45

The genomic landscapes of human breast and colorectal cancers. Science (2007) 38.12

The Pfam protein families database. Nucleic Acids Res (2009) 37.98

COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res (2010) 25.55

Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat Genet (1999) 24.24

GO::TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics (2004) 20.23

Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science (1998) 18.23

Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics (2006) 17.28

SNAP predicts effect of mutations on protein function. Bioinformatics (2008) 15.39

Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics (2000) 11.75

PMUT: a web-based tool for the annotation of pathological mutations on proteins. Bioinformatics (2005) 9.88

Recent advances in neuroblastoma. N Engl J Med (2010) 7.86

A DNA polymorphism discovery resource for research on human genetic variation. Genome Res (1998) 7.44

PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification. Nucleic Acids Res (2003) 7.21

Next generation tools for the annotation of human SNPs. Brief Bioinform (2009) 6.61

Cancer-specific high-throughput annotation of somatic mutations: computational prediction of driver missense mutations. Cancer Res (2009) 4.92

AL2CO: calculation of positional conservation in a protein sequence alignment. Bioinformatics (2001) 4.79

I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res (2005) 4.32

Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics (2009) 4.31

Performance of mutation pathogenicity prediction methods on missense variants. Hum Mutat (2011) 4.01

Functional annotations improve the predictive score of human disease-related mutations in proteins. Hum Mutat (2009) 3.14

LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources. Bioinformatics (2005) 3.05

Coding single-nucleotide polymorphisms associated with complex vs. Mendelian disease: evolutionary evidence for differences in molecular effects. Proc Natl Acad Sci U S A (2004) 3.03

CanPredict: a computational tool for predicting cancer-associated missense mutations. Nucleic Acids Res (2007) 2.86

In silico analysis of missense substitutions using sequence-alignment based methods. Hum Mutat (2008) 2.73

Identification and analysis of deleterious human SNPs. J Mol Biol (2005) 2.67

Bioinformatics approaches and resources for single nucleotide polymorphism functional analysis. Brief Bioinform (2005) 2.60

Distinguishing cancer-associated missense mutations from common polymorphisms. Cancer Res (2007) 2.57

A comparative study of machine-learning methods to predict the effects of single nucleotide polymorphisms on protein function. Bioinformatics (2003) 2.54

Bioinformatics challenges for personalized medicine. Bioinformatics (2011) 2.28

Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information. Bioinformatics (2005) 2.21

A neural-network-based method for predicting protein stability changes upon single point mutations. Bioinformatics (2004) 2.16

Architecture of inherited susceptibility to common cancer. Nat Rev Cancer (2010) 2.14

Over-optimism in bioinformatics research. Bioinformatics (2009) 1.64

Clinical Cancer Advances 2009: major research advances in cancer treatment, prevention, and screening--a report from the American Society of Clinical Oncology. J Clin Oncol (2009) 1.53

Annotating single amino acid polymorphisms in the UniProt/Swiss-Prot knowledgebase. Hum Mutat (2008) 1.50

Predicting deleterious nsSNPs: an analysis of sequence and structural attributes. BMC Bioinformatics (2006) 1.32

Deleterious SNP prediction: be mindful of your training data! Bioinformatics (2007) 1.29

Use of estimated evolutionary strength at the codon level improves the prediction of disease-related protein mutations in humans. Hum Mutat (2008) 1.11

GENETICS. The Human Variome Project. Science (2008) 1.07

MuD: an interactive web server for the prediction of non-neutral substitutions using protein structural data. Nucleic Acids Res (2010) 0.96

Articles by these authors

(truncated to the top 100)

Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell (2012) 12.32

Clinical assessment incorporating a personal genome. Lancet (2010) 10.18

The chemical genomic portrait of yeast: uncovering a phenotype for all genes. Science (2008) 8.52

The incidentalome: a threat to genomic medicine. JAMA (2006) 7.24

Genetics. Genomic research and human subject privacy. Science (2004) 6.56

PharmGKB: the Pharmacogenetics Knowledge Base. Nucleic Acids Res (2002) 6.40

A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae). Proc Natl Acad Sci U S A (2003) 6.19

From pharmacogenomic knowledge acquisition to clinical applications: the PharmGKB as a clinical pharmacogenomic biomarker resource. Biomark Med (2011) 4.05

Knowledge acquisition, consistency checking and concurrency control for Gene Ontology (GO). Bioinformatics (2003) 3.97

SAFA: semi-automated footprinting analysis software for high-throughput quantification of nucleic acid footprinting experiments. RNA (2005) 3.95

Associating genes with gene ontology codes using a maximum entropy analysis of biomedical literature. Genome Res (2002) 3.77

Nonparametric methods for identifying differentially expressed genes in microarray data. Bioinformatics (2002) 3.65

Health-information altruists--a potentially critical resource. N Engl J Med (2005) 3.61

The RNA Ontology Consortium: an open invitation to the RNA community. RNA (2006) 3.60

Creating an online dictionary of abbreviations from MEDLINE. J Am Med Inform Assoc (2002) 3.32

Genetic variants associated with warfarin dose in African-American individuals: a genome-wide association study. Lancet (2013) 3.21

Phased whole-genome genetic risk in a family quartet using a major allele reference sequence. PLoS Genet (2011) 3.20

Data-driven prediction of drug effects and interactions. Sci Transl Med (2012) 2.93

Challenges in the clinical application of whole-genome sequencing. Lancet (2010) 2.92

Eukaryotic regulatory element conservation analysis and identification using comparative genomics. Genome Res (2004) 2.84

PharmGKB: a logical home for knowledge relating genotype to drug response phenotype. Nat Genet (2007) 2.65

Coarse-grained modeling of large RNA molecules with knowledge-based potentials and structural filters. RNA (2009) 2.63

A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports. J Am Med Inform Assoc (2011) 2.53

Clinical implementation of pharmacogenomics: overcoming genetic exceptionalism. Lancet Oncol (2010) 2.52

The pharmacogenetics and pharmacogenomics knowledge base: accentuating the knowledge. Nucleic Acids Res (2007) 2.45

The Pharmacogenomics Research Network Translational Pharmacogenetics Program: overcoming challenges of real-world implementation. Clin Pharmacol Ther (2013) 2.39

Bioinformatics challenges for personalized medicine. Bioinformatics (2011) 2.28

Time to organize the bioinformatics resourceome. PLoS Comput Biol (2005) 2.25

MScanner: a classifier for retrieving Medline citations. BMC Bioinformatics (2008) 2.24

Using binning to maintain confidentiality of medical data. Proc AMIA Symp (2002) 2.23

Web-scale pharmacovigilance: listening to signals from the crowd. J Am Med Inform Assoc (2013) 2.19

Using text to build semantic networks for pharmacogenomics. J Biomed Inform (2010) 2.19

Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text. BMC Bioinformatics (2009) 2.15

PharmGKB: understanding the effects of individual genetic variants. Drug Metab Rev (2008) 2.09

GAPSCORE: finding gene and protein names one word at a time. Bioinformatics (2004) 2.05

Pharmacogenomics: challenges and opportunities. Ann Intern Med (2006) 1.96

Computational analysis of Plasmodium falciparum metabolism: organizing genomic information to facilitate drug discovery. Genome Res (2004) 1.92

Pharmacogenomics and bioinformatics: PharmGKB. Pharmacogenomics (2010) 1.91

Recent progress in automatically extracting information from the pharmacogenomic literature. Pharmacogenomics (2010) 1.88

Automating data acquisition into ontologies from pharmacogenetics relational data sources using declarative object definitions and XML. Pac Symp Biocomput (2002) 1.81

PharmGKB: the Pharmacogenomics Knowledge Base. Methods Mol Biol (2013) 1.81

PharmGKB: the pharmacogenetics and pharmacogenomics knowledge base. Methods Mol Biol (2005) 1.80

High-throughput single-nucleotide structural mapping by capillary automated footprinting analysis. Nucleic Acids Res (2008) 1.78

Using text analysis to identify functionally coherent gene groups. Genome Res (2002) 1.71

The FEATURE framework for protein function annotation: modeling new functions, improving performance, and extending to novel applications. BMC Genomics (2008) 1.64

Finding haplotype tagging SNPs by use of principal components analysis. Am J Hum Genet (2004) 1.61

MutDB: annotating human variation with functionally relevant data. Bioinformatics (2003) 1.54

Structural inference of native and partially folded RNA by high-throughput contact mapping. Proc Natl Acad Sci U S A (2008) 1.54

Doxorubicin pathways: pharmacodynamics and adverse effects. Pharmacogenet Genomics (2011) 1.53

Discovery and explanation of drug-drug interactions via text mining. Pac Symp Biocomput (2012) 1.53

Training the next generation of informaticians: the impact of "BISTI" and bioinformatics--a report from the American College of Medical Informatics. J Am Med Inform Assoc (2004) 1.53

The computational analysis of scientific literature to define and recognize gene expression clusters. Nucleic Acids Res (2003) 1.44

Independent component analysis: mining microarray data for fundamental human gene expression modules. J Biomed Inform (2010) 1.39

Local kinetic measures of macromolecular structure reveal partitioning among multiple parallel pathways from the earliest steps in the folding of a large RNA molecule. J Mol Biol (2006) 1.38

Metformin pathways: pharmacokinetics and pharmacodynamics. Pharmacogenet Genomics (2012) 1.38

A statistical approach to scanning the biomedical literature for pharmacogenetics knowledge. J Am Med Inform Assoc (2004) 1.37

WebFEATURE: An interactive web tool for identifying and visualizing functional sites on macromolecular structures. Nucleic Acids Res (2003) 1.35

A literature-based method for assessing the functional coherence of a gene group. Bioinformatics (2003) 1.33

A call for the creation of personalized medicine databases. Nat Rev Drug Discov (2006) 1.30

Extracting and characterizing gene-drug relationships from the literature. Pharmacogenetics (2004) 1.30

Using ODIN for a PharmGKB revalidation experiment. Database (Oxford) (2012) 1.29

Integration and publication of heterogeneous text-mined relationships on the Semantic Web. J Biomed Semantics (2011) 1.29

Biomedical term mapping databases. Nucleic Acids Res (2005) 1.29

Very important pharmacogene summary: ABCB1 (MDR1, P-glycoprotein). Pharmacogenet Genomics (2011) 1.29

Informatics confronts drug-drug interactions. Trends Pharmacol Sci (2013) 1.28

Turning limited experimental information into 3D models of RNA. RNA (2010) 1.27

Semiautomated and rapid quantification of nucleic acid footprinting and structure mapping experiments. Nat Protoc (2008) 1.27

PharmGKB summary: methotrexate pathway. Pharmacogenet Genomics (2011) 1.27

Distinct contribution of electrostatics, initial conformational ensemble, and macromolecular stability in RNA folding. Proc Natl Acad Sci U S A (2007) 1.26

Pharmacogenomics: The relevance of emerging genotyping technologies. MLO Med Lab Obs (2006) 1.26

Predicting drug side-effects by chemical systems biology. Genome Biol (2009) 1.26

Large scale study of protein domain distribution in the context of alternative splicing. Nucleic Acids Res (2003) 1.26

PharmGKB summary: very important pharmacogene information for cytochrome P450, family 2, subfamily C, polypeptide 19. Pharmacogenet Genomics (2012) 1.25

Robust recognition of zinc binding sites in proteins. Protein Sci (2007) 1.23

Tools for loading MEDLINE into a local relational database. BMC Bioinformatics (2004) 1.23

Using Petri Net tools to study properties and dynamics of biological systems. J Am Med Inform Assoc (2004) 1.23

Extracting subject demographic information from abstracts of randomized clinical trial reports. Stud Health Technol Inform (2007) 1.22

Cytochrome P450 2D6. Pharmacogenet Genomics (2009) 1.22

The Simbios National Center: Systems Biology in Motion. Proc IEEE Inst Electr Electron Eng (2008) 1.20

Collective judgment predicts disease-associated single nucleotide variants. BMC Genomics (2013) 1.20

Improving the prediction of disease-related variants using protein three-dimensional structure. BMC Bioinformatics (2011) 1.20

Cooperative transcription factor associations discovered using regulatory variation. Proc Natl Acad Sci U S A (2011) 1.18

PharmGKB summary: fluoropyrimidine pathways. Pharmacogenet Genomics (2011) 1.16

Cytochrome P450 2C9-CYP2C9. Pharmacogenet Genomics (2010) 1.16

Improving the prediction of pharmacogenes using text-derived drug-gene relationships. Pac Symp Biocomput (2010) 1.15

Modelling biological processes using workflow and Petri Net models. Bioinformatics (2002) 1.14

Extending and evaluating a warfarin dosing algorithm that includes CYP4F2 and pooled rare variants of CYP2C9. Pharmacogenet Genomics (2010) 1.14

WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation. BMC Genomics (2013) 1.14

Predicting RNA structure by multiple template homology modeling. Pac Symp Biocomput (2010) 1.12

Platinum pathway. Pharmacogenet Genomics (2009) 1.12

PharmGKB and the International Warfarin Pharmacogenetics Consortium: the changing role for pharmacogenomic databases and single-drug pharmacogenetics. Hum Mutat (2008) 1.11

Identification of promoter regions in the human genome by using a retroviral plasmid library-based functional reporter gene assay. Genome Res (2003) 1.10

Microenvironment analysis and identification of magnesium binding sites in RNA. Nucleic Acids Res (2003) 1.10

Predicting allosteric communication in myosin via a pathway of conserved residues. J Mol Biol (2007) 1.10

Choosing SNPs using feature selection. Proc IEEE Comput Syst Bioinform Conf (2005) 1.09

Coplanar and coaxial orientations of RNA bases and helices. RNA (2007) 1.07

Content-based microarray search using differential expression profiles. BMC Bioinformatics (2010) 1.07

The utility of general purpose versus specialty clinical databases for research: warfarin dose estimation from extracted clinical variables. J Biomed Inform (2010) 1.07

Confidentiality in genome research. Science (2006) 1.06

Knowledge-based instantiation of full atomic detail into coarse-grain RNA 3D structural models. Bioinformatics (2009) 1.06