A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase.

PubWeight™: 28.32 | Rank: Top 0.01% | All-Time Top 10000

Article: PMC 1424677

Published in Am J Hum Genet on February 17, 2006

Authors

Paul Scheet (1), Matthew Stephens

Author Affiliations

1: Department of Statistics, University of Washington, Seattle, WA 98195-4322, USA. pscheet@alum.wustl.edu

Articles citing this

(truncated to the top 100)

A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet (2009) 30.09

A map of recent positive selection in the human genome. PLoS Biol (2006) 29.19

MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Epidemiol (2010) 26.41

Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet (2007) 24.68

Genotype imputation. Annu Rev Genomics Hum Genet (2009) 18.64

A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am J Hum Genet (2009) 17.80

Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet (2007) 15.55

Genotype imputation for genome-wide association studies. Nat Rev Genet (2010) 14.59

Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat Genet (2012) 11.29

Family-based association tests for genomewide association scans. Am J Hum Genet (2007) 10.67

Detection of sharing by descent, long-range phasing and haplotype imputation. Nat Genet (2008) 9.69

Convergent adaptation of human lactase persistence in Africa and Europe. Nat Genet (2006) 9.44

Genotype imputation with thousands of genomes. G3 (Bethesda) (2011) 8.77

Genotype and SNP calling from next-generation sequencing data. Nat Rev Genet (2011) 8.34

A linear complexity phasing method for thousands of genomes. Nat Methods (2011) 8.30

Genome-wide association studies: potential next steps on a genetic journey. Hum Mol Genet (2008) 7.54

Genotype-imputation accuracy across worldwide human populations. Am J Hum Genet (2009) 7.28

Practical issues in imputation-based association mapping. PLoS Genet (2008) 6.76

Designing genome-wide association studies: sample size, power, imputation, and the choice of genotyping chip. PLoS Genet (2009) 6.61

Sensitive detection of chromosomal segments of distinct ancestry in admixed populations. PLoS Genet (2009) 6.36

Common variants at 7p21 are associated with frontotemporal lobar degeneration with TDP-43 inclusions. Nat Genet (2010) 5.52

Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods (2013) 5.43

Prioritizing GWAS results: A review of statistical methods and recommendations for their application. Am J Hum Genet (2010) 5.39

Genome-wide association study of leaf architecture in the maize nested association mapping population. Nat Genet (2011) 5.16

Multilocus association mapping using variable-length Markov chains. Am J Hum Genet (2006) 5.13

A powerful and flexible multilocus association test for quantitative traits. Am J Hum Genet (2008) 4.96

Inference of population structure using dense haplotype data. PLoS Genet (2012) 4.87

Haplotype phasing: existing methods and new developments. Nat Rev Genet (2011) 4.66

A statistical method for predicting classical HLA alleles from SNP data. Am J Hum Genet (2008) 4.65

Simple and efficient analysis of disease association with missing genotype data. Am J Hum Genet (2008) 4.52

Common variants at 19p13 are associated with susceptibility to ovarian cancer. Nat Genet (2010) 4.51

Fast and flexible simulation of DNA sequence data. Genome Res (2008) 4.45

A genome-wide association study identifies a new ovarian cancer susceptibility locus on 9p22.2. Nat Genet (2009) 4.38

Genome-wide and fine-resolution association analysis of malaria in West Africa. Nat Genet (2009) 4.30

The impact of genetic relationship information on genomic breeding values in German Holstein cattle. Genet Sel Evol (2010) 3.87

Missing data imputation and haplotype phase inference for genome-wide association studies. Hum Genet (2008) 3.65

The genomic signature of dog domestication reveals adaptation to a starch-rich diet. Nature (2013) 3.61

A variant of mitochondrial protein LOC387715/ARMS2, not HTRA1, is strongly associated with age-related macular degeneration. Proc Natl Acad Sci U S A (2007) 3.50

A simple genetic architecture underlies morphological variation in dogs. PLoS Biol (2010) 3.46

High-resolution detection of identity by descent in unrelated individuals. Am J Hum Genet (2010) 3.41

Analyses and comparison of accuracy of different genotype imputation methods. PLoS One (2008) 3.41

A candidate gene approach identifies the CHRNA5-A3-B4 region as a risk factor for age-dependent nicotine addiction. PLoS Genet (2008) 3.25

Genetic architecture of complex traits and accuracy of genomic prediction: coat colour, milk-fat percentage, and type in Holstein cattle as contrasting model traits. PLoS Genet (2010) 3.21

Accuracy of genomic breeding values in multi-breed dairy cattle populations. Genet Sel Evol (2009) 3.14

Integrated study of copy number states and genotype calls using high-density SNP arrays. Nucleic Acids Res (2009) 3.09

Geographical structure and differential natural selection among North European populations. Genome Res (2009) 3.03

Genetic structure and domestication history of the grape. Proc Natl Acad Sci U S A (2011) 2.89

The history of African gene flow into Southern Europeans, Levantines, and Jews. PLoS Genet (2011) 2.81

Detecting rare variants for complex traits using family and unrelated data. Genet Epidemiol (2010) 2.80

Accurate prediction of genetic values for complex traits by whole-genome resequencing. Genetics (2010) 2.79

Genome-wide interrogation of germline genetic variation associated with treatment response in childhood acute lymphoblastic leukemia. JAMA (2009) 2.72

Identification, replication, and functional fine-mapping of expression quantitative trait loci in primary human liver tissue. PLoS Genet (2011) 2.68

Inferring human colonization history using a copying model. PLoS Genet (2008) 2.64

To identify associations with rare variants, just WHaIT: Weighted haplotype and imputation-based tests. Am J Hum Genet (2010) 2.63

An imputed genotype resource for the laboratory mouse. Mamm Genome (2008) 2.54

An improved genotyping by sequencing (GBS) approach offering increased versatility and efficiency of SNP discovery and genotyping. PLoS One (2013) 2.54

Rare versus common variants in pharmacogenetics: SLCO1B1 variation and methotrexate disposition. Genome Res (2011) 2.52

Dissecting the regulatory architecture of gene expression QTLs. Genome Biol (2012) 2.51

Phasing of many thousands of genotyped samples. Am J Hum Genet (2012) 2.47

Molecular phylogenetics of Candida albicans. Eukaryot Cell (2007) 2.41

STrengthening the REporting of Genetic Association Studies (STREGA): an extension of the STROBE statement. PLoS Med (2009) 2.39

A general approach for haplotype phasing across the full spectrum of relatedness. PLoS Genet (2014) 2.38

Genome-wide nested association mapping of quantitative resistance to northern leaf blight in maize. Proc Natl Acad Sci U S A (2011) 2.38

A hidden Markov model combining linkage and linkage disequilibrium information for haplotype reconstruction and quantitative trait locus fine mapping. Genetics (2009) 2.37

High-resolution haplotype block structure in the cattle genome. BMC Genet (2009) 2.35

A practical genome scan for population-specific strong selective sweeps that have reached fixation. PLoS One (2007) 2.35

Evolutionary history of GS3, a gene conferring grain length in rice. Genetics (2009) 2.33

Analysis of genomic diversity in Mexican Mestizo populations to develop genomic medicine in Mexico. Proc Natl Acad Sci U S A (2009) 2.32

Fast and accurate inference of local ancestry in Latino populations. Bioinformatics (2012) 2.30

Improving the accuracy and efficiency of identity-by-descent detection in population data. Genetics (2013) 2.26

The importance of phase information for human genomics. Nat Rev Genet (2011) 2.25

Identification of genomic regions associated with phenotypic variation between dog breeds using selection mapping. PLoS Genet (2011) 2.24

Germline genetic variation in an organic anion transporter polypeptide associated with methotrexate pharmacokinetics and clinical effects. J Clin Oncol (2009) 2.20

Rapid evolutionary response to a transmissible cancer in Tasmanian devils. Nat Commun (2016) 2.19

A comparison of approaches to account for uncertainty in analysis of imputed genotypes. Genet Epidemiol (2011) 2.19

Identifying positive selection candidate loci for high-altitude adaptation in Andean populations. Hum Genomics (2009) 2.18

Modeling recent human evolution in mice by expression of a selected EDAR variant. Cell (2013) 2.17

Accuracy of genome-wide imputation of untyped markers and impacts on statistical power for association studies. BMC Genet (2009) 2.16

Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics (2011) 2.16

Familial aggregation of common sequence variants on 15q24-25.1 in lung cancer. J Natl Cancer Inst (2008) 2.15

Effect of genetic divergence in identifying ancestral origin using HAPAA. Genome Res (2008) 2.15

Controls of nucleosome positioning in the human genome. PLoS Genet (2012) 2.14

Genetic variations in microRNA-related genes are novel susceptibility loci for esophageal cancer risk. Cancer Prev Res (Phila) (2008) 2.11

HaploRec: efficient and accurate large-scale reconstruction of haplotypes. BMC Bioinformatics (2006) 2.08

Population differentiation as a test for selective sweeps. Genome Res (2010) 2.08

Interpretation of association signals and identification of causal variants from genome-wide association studies. Am J Hum Genet (2010) 2.06

Gene-centric association signals for lipids and apolipoproteins identified via the HumanCVD BeadChip. Am J Hum Genet (2009) 2.04

Methods to impute missing genotypes for population data. Hum Genet (2007) 1.98

A combined long-range phasing and long haplotype imputation method to impute phase for SNP genotypes. Genet Sel Evol (2011) 1.97

Large-scale gene-centric meta-analysis across 32 studies identifies multiple lipid loci. Am J Hum Genet (2012) 1.96

Length distributions of identity by descent reveal fine-scale demographic history. Am J Hum Genet (2012) 1.91

Understanding the accuracy of statistical haplotype inference with sequence data of known phase. Genet Epidemiol (2007) 1.87

The genomics of selection in dogs and the parallel evolution between dogs and humans. Nat Commun (2013) 1.86

A comprehensively molecular haplotype-resolved genome of a European individual. Genome Res (2011) 1.83

STrengthening the REporting of Genetic Association studies (STREGA)--an extension of the STROBE statement. Eur J Clin Invest (2009) 1.82

A phasing and imputation method for pedigreed populations that results in a single-stage genomic evaluation. Genet Sel Evol (2012) 1.80

Historical genomics of North American maize. Proc Natl Acad Sci U S A (2012) 1.79

Resolving the evolution of extant and extinct ruminants with high-throughput phylogenomics. Proc Natl Acad Sci U S A (2009) 1.79

Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation. Genet Sel Evol (2011) 1.78

Genome-wide meta-analysis points to CTC1 and ZNF676 as genes regulating telomere homeostasis in humans. Hum Mol Genet (2012) 1.77

Articles cited by this

Inference of population structure using multilocus genotype data. Genetics (2000) 147.76

A haplotype map of the human genome. Nature (2005) 105.70

A new statistical method for haplotype reconstruction from population data. Am J Hum Genet (2001) 59.30

Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics (2003) 53.11

A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet (2003) 26.59

Detecting immigration by using multilocus genotypes. Proc Natl Acad Sci U S A (1997) 22.45

Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics (2003) 17.73

Detecting disease associations due to linkage disequilibrium using haplotype tags: a class of tests and the determinants of statistical power. Hum Hered (2003) 14.37

Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. Am J Hum Genet (2005) 14.09

A comparison of phasing algorithms for trios and unrelated individuals. Am J Hum Genet (2006) 12.45

Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization algorithm for unphased diploid genotype data. Am J Hum Genet (2000) 11.47

Haplotype inference in random population samples. Am J Hum Genet (2002) 8.07

Haplotype reconstruction from genotype data using Imperfect Phylogeny. Bioinformatics (2004) 6.02

Coalescent-based association mapping and fine mapping of complex trait loci. Genetics (2004) 5.26

GERBIL: Genotype resolution and block identification using likelihood. Proc Natl Acad Sci U S A (2004) 4.36

Model-based inference of haplotype block variation. J Comput Biol (2004) 3.07

A block-free hidden Markov model for genotypes and its application to disease association. J Comput Biol (2005) 2.52

An MDL method for finding haplotype blocks and for estimating the strength of haplotype block boundaries. Pac Symp Biocomput (2003) 2.45

Articles by these authors

A second generation human haplotype map of over 3.1 million SNPs. Nature (2007) 85.39

RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res (2008) 62.07

Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics (2003) 53.11

Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics (2003) 17.73

Genome-wide detection and characterization of positive selection in human populations. Nature (2007) 17.27

Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature (2010) 16.86

Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet (2007) 15.55

Genes mirror geography within Europe. Nature (2008) 14.23

A comparison of phasing algorithms for trios and unrelated individuals. Am J Hum Genet (2006) 12.45

Traces of human migrations in Helicobacter pylori populations. Science (2003) 11.92

Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat Genet (2012) 11.29

Inferring weak population structure with the assistance of sample group information. Mol Ecol Resour (2009) 10.81

Inference of population structure using multilocus genotype data: dominant markers and null alleles. Mol Ecol Notes (2007) 10.11

High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet (2008) 9.68

Genotype imputation with thousands of genomes. G3 (Bethesda) (2011) 8.77

Interpreting principal component analyses of spatial population genetic variation. Nat Genet (2008) 8.49

Evidence for substantial fine-scale variation in recombination rates across the human genome. Nat Genet (2004) 6.99

Practical issues in imputation-based association mapping. PLoS Genet (2008) 6.76

Genome-wide efficient mixed-model analysis for association studies. Nat Genet (2012) 6.62

DNase I sensitivity QTLs are a major determinant of human expression variation. Nature (2012) 6.17

Polymorphisms of the HNF1A gene encoding hepatocyte nuclear factor-1 alpha are associated with C-reactive protein. Am J Hum Genet (2008) 3.61

Sex-specific and lineage-specific alternative splicing in primates. Genome Res (2009) 3.61

Assigning African elephant DNA to geographic region of origin: applications to the ivory trade. Proc Natl Acad Sci U S A (2004) 3.17

Automating resequencing-based detection of insertion-deletion polymorphisms. Nat Genet (2006) 2.61

Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genet (2013) 2.59

Dissecting the regulatory architecture of gene expression QTLs. Genome Biol (2012) 2.51

A statin-dependent QTL for GATM expression is associated with statin-induced myopathy. Nature (2013) 2.46

A statistical framework for joint eQTL analysis in multiple tissues. PLoS Genet (2013) 2.46

msHOT: modifying Hudson's ms simulator to incorporate crossover and gene conversion hotspots. Bioinformatics (2006) 2.45

Genome-wide association of lipid-lowering response to statins in combined study populations. PLoS One (2010) 2.43

Absence of the TAP2 human recombination hotspot in chimpanzees. PLoS Biol (2004) 2.22

Conservation of hotspots for recombination in low-copy repeats associated with the NF1 microdeletion. Nat Genet (2006) 2.11

Analysis of population structure: a unifying framework and novel methods based on sparse factor analysis. PLoS Genet (2010) 2.06

Global effect of PEG-IFN-alpha and ribavirin on gene expression in PBMC in vitro. J Interferon Cytokine Res (2004) 1.98

Using DNA to track the origin of the largest ivory seizure since the 1989 trade ban. Proc Natl Acad Sci U S A (2007) 1.56

Interactions between glucocorticoid treatment and cis-regulatory polymorphisms contribute to cellular response phenotypes. PLoS Genet (2011) 1.49

Next generation analytic tools for large scale genetic epidemiology studies of complex diseases. Genet Epidemiol (2011) 1.47

Variation in human recombination rates and its genetic determinants. PLoS One (2011) 1.39

Comparative RNA sequencing reveals substantial genetic variation in endangered primates. Genome Res (2011) 1.36

Combating the illegal trade in African elephant ivory with DNA forensics. Conserv Biol (2008) 1.34

Fast and accurate estimation of the population-scaled mutation rate, theta, from microsatellite genotype data. Genetics (2007) 1.32

The contribution of RNA decay quantitative trait loci to inter-individual variation in steady-state gene expression levels. PLoS Genet (2012) 1.25

Genome-wide association study of d-amphetamine response in healthy volunteers identifies putative associations, including cadherin 13 (CDH13). PLoS One (2012) 1.14

Integrated enrichment analysis of variants and pathways in genome-wide association studies indicates central role for IL-2 signaling genes in type 1 diabetes, and cytokine signaling genes in Crohn's disease. PLoS Genet (2013) 1.14

Using linear predictors to impute allele frequencies from summary or pooled genotype data. Ann Appl Stat (2010) 1.12

Functional comparison of innate immune signaling pathways in primates. PLoS Genet (2010) 1.09

Linkage disequilibrium-based quality control for large-scale genetic studies. PLoS Genet (2008) 1.03

Insights into recombination from population genetic variation. Curr Opin Genet Dev (2006) 1.01

Exon-specific QTLs skew the inferred distribution of expression QTLs detected using gene expression array data. PLoS One (2012) 0.98

The effects of genotype-dependent recombination, and transmission asymmetry, on linkage disequilibrium. Genetics (2005) 0.91

Statistical inference of transmission fidelity of DNA methylation patterns over somatic cell divisions in mammals. Ann Appl Stat (2010) 0.90

Epigenetic modifications are associated with inter-species gene expression variation in primates. Genome Biol (2014) 0.90

Genetic, functional and molecular features of glucocorticoid receptor binding. PLoS One (2013) 0.85

Probabilistic segmentation and intensity estimation for microarray images. Biostatistics (2005) 0.84

Statistical inference of in vivo properties of human DNA methyltransferases from double-stranded methylation patterns. PLoS One (2012) 0.80

Mapping gene-environment interactions at regulatory polymorphisms: insights into mechanisms of phenotypic variation. Transcription (2012) 0.78

False discovery rates: a new deal. Biostatistics (2016) 0.77

Response to Cavalli-Sforza interview [Human Biology 82(3):245-266 (June 2010)]. Hum Biol (2010) 0.76

Identification of biological relationships from text documents using efficient computational methods. J Bioinform Comput Biol (2003) 0.75

Correction: Visualizing the structure of RNA-seq expression data using grade of membership models. PLoS Genet (2017) 0.75

A multi-level text mining method to extract biological relationships. Proc IEEE Comput Soc Bioinform Conf (2002) 0.75