A framework for variation discovery and genotyping using next-generation DNA sequencing data.

PubWeight™: 59.36 | Rank: Top 0.01% | All-Time Top 1000

Article: PMC 3083463

Published in Nat Genet on April 10, 2011

Authors

Mark A DePristo (1), Eric Banks, Ryan Poplin, Kiran V Garimella, Jared R Maguire, Christopher Hartl, Anthony A Philippakis, Guillermo del Angel, Manuel A Rivas, Matt Hanna, Aaron McKenna, Tim J Fennell, Andrew M Kernytsky, Andrey Y Sivachenko, Kristian Cibulskis, Stacey B Gabriel, David Altshuler, Mark J Daly

Author Affiliations

1: Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA. depristo@broadinstitute.org

Articles citing this

(truncated to the top 100)

The mutational landscape of head and neck squamous cell carcinoma. Science (2011) 16.88

Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol (2013) 16.13

Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature (2012) 14.76

Exome sequencing as a tool for Mendelian disease gene discovery. Nat Rev Genet (2011) 14.29

Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature (2012) 13.71

A systematic survey of loss-of-function variants in human protein-coding genes. Science (2012) 12.25

Analysis of protein-coding genetic variation in 60,706 humans. Nature (2016) 11.83

TREM2 variants in Alzheimer's disease. N Engl J Med (2012) 11.35

SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N Engl J Med (2011) 11.07

Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell (2013) 9.24

Accurate and comprehensive sequencing of personal genomes. Genome Res (2011) 8.99

From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics (2013) 8.79

Genotype and SNP calling from next-generation sequencing data. Nat Rev Genet (2011) 8.34

A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics (2011) 8.19

Development and validation of a clinical cancer genomic profiling test based on massively parallel DNA sequencing. Nat Biotechnol (2013) 7.97

A high-coverage genome sequence from an archaic Denisovan individual. Science (2012) 7.89

Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA. Nature (2013) 7.42

Discovery and genotyping of genome structural polymorphism by sequencing on a population scale. Nat Genet (2011) 6.67

Genome sequencing of pediatric medulloblastoma links catastrophic DNA rearrangements with TP53 mutations. Cell (2012) 6.07

Efficient de novo assembly of large genomes using compressed data structures. Genome Res (2011) 6.05

A polygenic burden of rare disruptive mutations in schizophrenia. Nature (2014) 5.99

Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet (2011) 5.58

ACMG clinical laboratory standards for next-generation sequencing. Genet Med (2013) 5.30

Rapid whole-genome sequencing for genetic disease diagnosis in neonatal intensive care units. Sci Transl Med (2012) 5.16

Recurrent R-spondin fusions in colon cancer. Nature (2012) 5.10

De novo mutations in the autophagy gene WDR45 cause static encephalopathy of childhood with neurodegeneration in adulthood. Nat Genet (2013) 4.71

Comprehensive genomic analysis identifies SOX2 as a frequently amplified gene in small-cell lung cancer. Nat Genet (2012) 4.70

Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS One (2012) 4.52

De novo germline and postzygotic mutations in AKT3, PIK3R2 and PIK3CA cause a spectrum of related megalencephaly syndromes. Nat Genet (2012) 4.51

The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res (2013) 4.45

The functional spectrum of low-frequency coding variation. Genome Biol (2011) 4.42

Mutational analysis reveals the origin and therapy-driven evolution of recurrent glioma. Science (2013) 4.42

Characterizing and measuring bias in sequence data. Genome Biol (2013) 4.39

Genomic variation landscape of the human gut microbiome. Nature (2012) 4.38

Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics (2011) 4.31

Exome sequencing and the genetic basis of complex traits. Nat Genet (2012) 4.11

Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls. Nat Biotechnol (2014) 4.07

Clonal evolution of preleukemic hematopoietic stem cells precedes human acute myeloid leukemia. Sci Transl Med (2012) 4.05

ESR1 ligand-binding domain mutations in hormone-resistant breast cancer. Nat Genet (2013) 4.03

Whole-genome sequencing in autism identifies hot spots for de novo germline mutation. Cell (2012) 4.03

A form of the metabolic syndrome associated with mutations in DYRK1B. N Engl J Med (2014) 4.01

A framework for the interpretation of de novo mutation in human disease. Nat Genet (2014) 4.00

Association of Arrhythmia-Related Genetic Variants With Phenotypes Documented in Electronic Medical Records. JAMA (2016) 3.97

Noninvasive whole-genome sequencing of a human fetus. Sci Transl Med (2012) 3.94

Low concordance of multiple variant-calling pipelines: practical implications for exome and genome sequencing. Genome Med (2013) 3.90

Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome. Nat Biotechnol (2012) 3.83

Using exome sequencing to reveal mutations in TREM2 presenting as a frontotemporal dementia-like syndrome without bone involvement. JAMA Neurol (2013) 3.70

A survey of tools for variant analysis of next-generation genome sequencing data. Brief Bioinform (2013) 3.60

Sequencing depth and coverage: key considerations in genomic analyses. Nat Rev Genet (2014) 3.52

Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am J Hum Genet (2012) 3.50

TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline. PLoS One (2014) 3.45

Genome mapping on nanochannel arrays for structural variation analysis and sequence assembly. Nat Biotechnol (2012) 3.43

Predicting immunogenic tumour mutations by combining mass spectrometry and exome sequencing. Nature (2014) 3.42

Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet (2014) 3.42

Genomic epidemiology of the Escherichia coli O104:H4 outbreaks in Europe, 2011. Proc Natl Acad Sci U S A (2012) 3.42

An integrative variant analysis suite for whole exome next-generation sequencing data. BMC Bioinformatics (2012) 3.41

Effects of the absence of apolipoprotein E on lipoproteins, neurocognitive function, and retinal function. JAMA Neurol (2014) 3.39

The next-generation sequencing revolution and its impact on genomics. Cell (2013) 3.35

The advantages of SMRT sequencing. Genome Biol (2013) 3.33

Next-generation sequencing data interpretation: enhancing reproducibility and accessibility. Nat Rev Genet (2012) 3.29

Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction. Nature (2014) 3.29

Effectiveness of exome and genome sequencing guided by acuity of illness for diagnosis of neurodevelopmental disorders. Sci Transl Med (2014) 3.28

JointSNVMix: a probabilistic model for accurate detection of somatic mutations in normal/tumour paired next-generation sequencing data. Bioinformatics (2012) 3.21

Genomic sequencing of meningiomas identifies oncogenic SMO and AKT1 mutations. Nat Genet (2013) 3.19

Identification of multipotent luminal progenitor cells in human prostate organoid cultures. Cell (2014) 3.05

Great ape genetic diversity and population history. Nature (2013) 2.95

Emergence of constitutively active estrogen receptor-α mutations in pretreated advanced estrogen receptor-positive breast cancer. Clin Cancer Res (2014) 2.92

Pacific Biosciences sequencing technology for genotyping and variation discovery in human data. BMC Genomics (2012) 2.92

Prions are a common mechanism for phenotypic inheritance in wild yeasts. Nature (2012) 2.89

Assessment of 2q23.1 microdeletion syndrome implicates MBD5 as a single causal locus of intellectual disability, epilepsy, and autism spectrum disorder. Am J Hum Genet (2011) 2.85

Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nat Genet (2014) 2.84

Comment on "Widespread RNA and DNA sequence differences in the human transcriptome". Science (2012) 2.79

Comparison of solution-based exome capture methods for next generation sequencing. Genome Biol (2011) 2.78

Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature (2013) 2.75

Whole-exome sequencing of circulating tumor cells provides a window into metastatic prostate cancer. Nat Biotechnol (2014) 2.75

Exome sequencing resolves apparent incidental findings and reveals further complexity of SH3TC2 variant alleles causing Charcot-Marie-Tooth neuropathy. Genome Med (2013) 2.72

Exome sequencing can improve diagnosis and alter patient management. Sci Transl Med (2012) 2.70

Lanosterol reverses protein aggregation in cataracts. Nature (2015) 2.69

Exome sequencing identifies mutation in CNOT3 and ribosomal genes RPL5 and RPL10 in T-cell acute lymphoblastic leukemia. Nat Genet (2012) 2.64

Comprehensive genomic analysis of rhabdomyosarcoma reveals a landscape of alterations affecting a common genetic axis in fusion-positive and fusion-negative tumors. Cancer Discov (2014) 2.64

Mutations in SWI/SNF chromatin remodeling complex gene ARID1B cause Coffin-Siris syndrome. Nat Genet (2012) 2.63

Landscape of genomic alterations in cervical carcinomas. Nature (2013) 2.61

Dynamic population changes in Mycobacterium tuberculosis during acquisition and fixation of drug resistance in patients. J Infect Dis (2012) 2.53

Evolution of Darwin's finches and their beaks revealed by genome sequencing. Nature (2015) 2.50

TGFB2 mutations cause familial thoracic aortic aneurysms and dissections associated with mild systemic features of Marfan syndrome. Nat Genet (2012) 2.50

Extremely low-coverage sequencing and imputation increases power for genome-wide association studies. Nat Genet (2012) 2.50

Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly. Bioinformatics (2012) 2.46

TET2 mutations predict response to hypomethylating agents in myelodysplastic syndrome patients. Blood (2014) 2.44

Estimating the human mutation rate using autozygosity in a founder population. Nat Genet (2012) 2.44

The malaria parasite Plasmodium vivax exhibits greater genetic diversity than Plasmodium falciparum. Nat Genet (2012) 2.43

Long-term culture of genome-stable bipotent stem cells from adult human liver. Cell (2014) 2.43

Comparing strategies to fine-map the association of common SNPs at chromosome 9p21 with type 2 diabetes and myocardial infarction. Nat Genet (2011) 2.42

A de novo mutation in the β-tubulin gene TUBB4A results in the leukoencephalopathy hypomyelination with atrophy of the basal ganglia and cerebellum. Am J Hum Genet (2013) 2.42

Sequence kernel association tests for the combined effect of rare and common variants. Am J Hum Genet (2013) 2.41

Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Res (2013) 2.40

Relapse-specific mutations in NT5C2 in childhood acute lymphoblastic leukemia. Nat Genet (2013) 2.38

Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics (2014) 2.37

Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline. BMC Bioinformatics (2014) 2.37

cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate. Nucleic Acids Res (2012) 2.36

Natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines. Genome Res (2014) 2.35

Articles cited by this

Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol (2009) 235.12

The Sequence Alignment/Map format and SAMtools. Bioinformatics (2009) 232.39

Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics (2009) 190.94

Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res (2008) 157.44

A map of human genome variation from population-scale sequencing. Nature (2010) 121.13

Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res (1998) 106.16

The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res (2010) 97.51

Accurate whole human genome sequencing using reversible terminator chemistry. Nature (2008) 90.20

The complete genome of an individual by massively parallel DNA sequencing. Nature (2008) 52.81

SSAHA: a fast search method for large DNA databases. Genome Res (2001) 48.64

SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics (2009) 39.47

Targeted capture and massively parallel sequencing of 12 human exomes. Nature (2009) 33.96

Exome sequencing identifies the cause of a mendelian disorder. Nat Genet (2009) 32.06

The landscape of somatic copy-number alteration across human cancers. Nature (2010) 31.88

Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat Biotechnol (2009) 27.17

Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic Acids Res (2008) 26.36

A comprehensive catalogue of somatic mutations from a human cancer genome. Nature (2009) 24.27

Human genome sequencing using unchained base reads on self-assembling DNA nanoarrays. Science (2009) 21.24

A draft sequence of the Neandertal genome. Science (2010) 19.55

Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science (2010) 18.45

VarScan: variant detection in massively parallel sequencing of individual and pooled samples. Bioinformatics (2009) 16.04

SNP detection for massively parallel whole-genome resequencing. Genome Res (2009) 15.96

Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding. Genome Res (2009) 15.15

Sequencing of 50 human exomes reveals adaptation to high altitude. Science (2010) 11.27

Quality scores and SNP detection in sequencing-by-synthesis systems. Genome Res (2008) 10.49

Searching for SNPs with cloud computing. Genome Biol (2009) 10.12

The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature (2010) 10.04

Variation in genome-wide mutation rates within and between human families. Nat Genet (2011) 8.84

Mapping human genetic diversity in Asia. Science (2009) 7.40

Discovery and genotyping of genome structural polymorphism by sequencing on a population scale. Nat Genet (2011) 6.67

Genomewide comparison of DNA sequences between humans and chimpanzees. Am J Hum Genet (2002) 5.80

Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genome-wide association studies. Am J Hum Genet (2009) 5.65

A SNP discovery method to assess variant allele probability from next-generation resequencing data. Genome Res (2009) 4.78

Accurate SNP and mutation detection by targeted custom microarray-based genomic enrichment of short-fragment sequencing libraries. Nucleic Acids Res (2010) 3.23

High quality SNP calling using Illumina data at shallow coverage. Bioinformatics (2010) 2.82

Adjust quality scores from alignment and improve sequencing accuracy. Nucleic Acids Res (2004) 2.76

Single nucleotide variation analysis in 65 candidate genes for CNS disorders in a representative sample of the European population. Genome Res (2003) 2.75

A probabilistic approach for SNP discovery in high-throughput human resequencing data. Genome Res (2009) 2.55

Articles by these authors

PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet (2007) 209.92

A map of human genome variation from population-scale sequencing. Nature (2010) 121.13

The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res (2010) 97.51

A second generation human haplotype map of over 3.1 million SNPs. Nature (2007) 85.39

PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet (2003) 53.59

Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science (2007) 51.70

The structure of haplotype blocks in the human genome. Science (2002) 50.88

Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes. Nat Genet (2008) 35.06

Genome-wide association studies for common diseases and complex traits. Nat Rev Genet (2005) 33.96

Integrating common and rare genetic variation in diverse human populations. Nature (2010) 32.30

The landscape of somatic copy-number alteration across human cancers. Nature (2010) 31.88

The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature (2012) 31.78

Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's disease. Nat Genet (2008) 30.20

Somatic mutations affect key pathways in lung adenocarcinoma. Nature (2008) 30.02

Biological, clinical and population relevance of 95 loci for blood lipids. Nature (2010) 28.21

A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science (2006) 27.49

The variant call format and VCFtools. Bioinformatics (2011) 25.88

Efficiency and power in genetic association studies. Nat Genet (2005) 25.56

Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature (2005) 23.04

Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat Genet (2008) 22.35

Detecting recent positive selection in the human genome from haplotype structure. Nature (2002) 22.00

Common variants at 30 loci contribute to polygenic dyslipidemia. Nat Genet (2008) 20.66

Association between microdeletion and microduplication at 16p11.2 and autism. N Engl J Med (2008) 19.71

Integrated detection and population-genetic analysis of SNPs and copy number variation. Nat Genet (2008) 19.55

Genome-wide association study identifies new susceptibility loci for Crohn disease and implicates autophagy in disease pathogenesis. Nat Genet (2007) 19.08

New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat Genet (2010) 17.89

Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci. Nat Genet (2010) 17.38

Initial genome sequencing and analysis of multiple myeloma. Nature (2011) 17.28

Genome-wide detection and characterization of positive selection in human populations. Nature (2007) 17.27

Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science (2012) 17.12

Risk alleles for multiple sclerosis identified by a genomewide study. N Engl J Med (2007) 17.06

Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat Genet (2010) 16.96

The mutational landscape of head and neck squamous cell carcinoma. Science (2011) 16.88

Characterizing the cancer genome in lung adenocarcinoma. Nature (2007) 16.48

Assessing the impact of population stratification on genetic association studies. Nat Genet (2004) 16.28

Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature (2012) 16.13

Replicating genotype-phenotype associations. Nature (2007) 16.11

Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet (2008) 15.89

Common deletion polymorphisms in the human genome. Nat Genet (2006) 15.66

A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet (2006) 15.63

Genetic mapping in human disease. Science (2008) 15.12

Calibrating a coalescent simulation of human genome sequence variation. Genome Res (2005) 15.04

Evaluating and improving power in whole-genome association studies using fixed marker sets. Nat Genet (2006) 14.76

Multiple regions within 8q24 independently affect risk for prostate cancer. Nat Genet (2007) 14.37

The genomic complexity of primary human prostate cancer. Nature (2011) 14.06

Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature (2012) 13.71

Large-scale association analysis identifies 13 new susceptibility loci for coronary artery disease. Nat Genet (2011) 13.25

Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature (2011) 13.25

Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature (2011) 13.23

A landscape of driver mutations in melanoma. Cell (2012) 12.61

Genome-wide association scan in women with systemic lupus erythematosus identifies susceptibility variants in ITGAM, PXK, KIAA1542 and other loci. Nat Genet (2008) 12.51

Genome-wide association study identifies eight loci associated with blood pressure. Nat Genet (2009) 12.44

Efficient control of population structure in model organism association mapping. Genetics (2008) 12.32

A systematic survey of loss-of-function variants in human protein-coding genes. Science (2012) 12.25

TRAF1-C5 as a risk locus for rheumatoid arthritis--a genomewide study. N Engl J Med (2007) 12.24

Prepublication data sharing. Nature (2009) 12.24