Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies.

PubWeight™: 1.74 | Rank: Top 3%

Article: PMC 479119

Published in Genome Res on April 12, 2004

Authors

Kui Zhang (1), Zhaohui S Qin, Jun S Liu, Ting Chen, Michael S Waterman, Fengzhu Sun

Author Affiliations

1: Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, California 90089-1113, USA.

Articles citing this

The future of association studies: gene-based analysis and replication. Am J Hum Genet (2004) 5.44

A fast method for computing high-significance disease association in large population-based studies. Am J Hum Genet (2006) 2.84

Finding haplotype tagging SNPs by use of principal components analysis. Am J Hum Genet (2004) 1.61

Detection of genes for ordinal traits in nuclear families and a unified approach for association studies. Genetics (2005) 1.38

Multipoint linkage-disequilibrium mapping with haplotype-block structure. Am J Hum Genet (2006) 1.29

Optimal haplotype block-free selection of tagging SNPs for genome-wide association studies. Genome Res (2004) 1.28

Fingerprinting Soybean Germplasm and Its Utility in Genomic Research. G3 (Bethesda) (2015) 1.22

Intra- and interpopulation genotype reconstruction from tagging SNPs. Genome Res (2006) 1.21

How well do HapMap SNPs capture the untyped SNPs? BMC Genomics (2006) 1.11

A model-based approach to capture genetic variation for future association studies. Genome Res (2006) 1.05

A new method for detecting human recombination hotspots and its applications to the HapMap ENCODE data. Am J Hum Genet (2006) 1.04

An overview of population genetic data simulation. J Comput Biol (2011) 1.01

New genetic evidence for involvement of the dopamine system in migraine with aura. Hum Genet (2009) 0.93

Fine haplotype structure of a chromosome 17 region in the laboratory and wild mouse. Genetics (2008) 0.93

Choosing blindly but wisely: differentially private solicitation of DNA datasets for disease marker discovery. J Am Med Inform Assoc (2014) 0.89

Adaptive tests for detecting gene-gene and gene-environment interactions. Hum Hered (2011) 0.88

FastTagger: an efficient algorithm for genome-wide tag SNP selection using multi-marker linkage disequilibrium. BMC Bioinformatics (2010) 0.87

A variable-sized sliding-window approach for genetic association studies via principal component analysis. Ann Hum Genet (2009) 0.86

Selecting additional tag SNPs for tolerating missing data in genotyping. BMC Bioinformatics (2005) 0.86

Assessing the power of tag SNPs in the mapping of quantitative trait loci (QTL) with extremal and random samples. BMC Genet (2005) 0.86

A survey of current Bayesian gene mapping methods. Hum Genomics (2004) 0.83

Polymorphisms of the IGF1R gene and their genetic effects on chicken early growth and carcass traits. BMC Genet (2008) 0.82

Tagging single nucleotide polymorphisms in the IRF1 and IRF8 genes and tuberculosis susceptibility. PLoS One (2012) 0.80

Genetic variants in the cocaine- and amphetamine-regulated transcript gene (CARTPT) and cocaine dependence. Neurosci Lett (2008) 0.79

Efficient haplotype block partitioning and tag SNP selection algorithms under various constraints. Biomed Res Int (2013) 0.78

Association of CRHR1 variants and posttraumatic stress symptoms in hurricane exposed adults. J Anxiety Disord (2013) 0.78

Genetic variations and haplotype diversity of the UGT1 gene cluster in the Chinese population. PLoS One (2012) 0.77

CGTS: a site-clustering graph based tagSNP selection algorithm in genotype data. BMC Bioinformatics (2009) 0.75

Probability theory-based SNP association study method for identifying susceptibility loci and genetic disease models in human case-control data. IEEE Trans Nanobioscience (2010) 0.75

Discovering Genome-Wide Tag SNPs Based on the Mutual Information of the Variants. PLoS One (2016) 0.75

Identification of rheumatoid arthritis biomarkers based on single nucleotide polymorphisms and haplotype blocks: A systematic review and meta-analysis. J Adv Res (2015) 0.75

The Genetic Diversity and Structure of Linkage Disequilibrium of the MTHFR Gene in Populations of Northern Eurasia. Acta Naturae (2012) 0.75

The Bos taurus-Bos indicus balance in fertility and milk related genes. PLoS One (2017) 0.75

Genetic Polymorphisms of SLCO1B1, CYP2E1 and UGT1A1 and Susceptibility to Anti-Tuberculosis Drug-Induced Hepatotoxicity: A Chinese Population-Based Prospective Case-Control Study. Clin Drug Investig (2017) 0.75

Articles cited by this

The future of genetic studies of complex human diseases. Science (1996) 64.76

A new statistical method for haplotype reconstruction from population data. Am J Hum Genet (2001) 59.30

Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet (1993) 51.42

The structure of haplotype blocks in the human genome. Science (2002) 50.88

The Interaction of Selection and Linkage. I. General Considerations; Heterotic Models. Genetics (1964) 35.22

Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol Biol Evol (1995) 30.55

Properties of a neutral allele model with intragenic recombination. Theor Popul Biol (1983) 28.09

Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet (1999) 25.93

High-resolution haplotype structure in the human genome. Nat Genet (2001) 20.51

Inference of haplotypes from PCR-amplified samples of diploid populations. Mol Biol Evol (1990) 16.18

Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science (2001) 15.54

Intensely punctate meiotic recombination in the class II region of the major histocompatibility complex. Nat Genet (2001) 15.53

An E-M algorithm and testing strategy for multiple-locus haplotypes. Am J Hum Genet (1995) 13.84

HAPLO: a program using the EM algorithm to estimate the frequencies of multi-site haplotypes. J Hered (1995) 13.52

Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization algorithm for unphased diploid genotype data. Am J Hum Genet (2000) 11.47

Haplotype tagging for the identification of common disease genes. Nat Genet (2001) 11.27

Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms. Am J Hum Genet (2001) 10.48

Partition-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide polymorphisms. Am J Hum Genet (2002) 10.12

GOLD--graphical overview of linkage disequilibrium. Bioinformatics (2000) 9.43

Testing association of statistically inferred haplotypes with discrete and continuous traits in samples of unrelated individuals. Hum Hered (2002) 8.64

Haplotype inference in random population samples. Am J Hum Genet (2002) 8.07

A first-generation linkage disequilibrium map of human chromosome 22. Nature (2002) 7.03

DNA Pooling: a tool for large-scale association studies. Nat Rev Genet (2002) 5.66

Juxtaposed regions of extensive and minimal linkage disequilibrium in human Xq25 and Xq28. Nat Genet (2000) 5.65

Haplotype blocks and linkage disequilibrium in the human genome. Nat Rev Genet (2003) 5.16

A dynamic programming algorithm for haplotype block partitioning. Proc Natl Acad Sci U S A (2002) 5.12

Chromosome-wide distribution of haplotype blocks and the role of recombination hot spots. Nat Genet (2003) 5.03

Linkage disequilibrium: what history has to tell us. Trends Genet (2002) 4.33

Distribution of recombination crossovers and the origin of haplotype blocks: the interplay of population history, recombination, and mutation. Am J Hum Genet (2002) 4.29

The accuracy of statistical methods for estimation of haplotype frequencies: an example from the CD4 locus. Am J Hum Genet (2000) 4.07

Molecular haplotyping of genetic markers 10 kb apart by allele-specific long-range PCR. Nucleic Acids Res (1996) 3.92

The use of sample genealogies for studying a selectively neutral m-loci model with recombination. Theor Popul Biol (1985) 3.72

Linkage disequilibrium and the mapping of complex human traits. Trends Genet (2002) 3.38

Conversion of diploidy to haploidy. Nature (2000) 3.28

Human genome sequence variation and the influence of gene history, mutation and recombination. Nat Genet (2002) 3.24

Experimentally-derived haplotypes substantially increase the efficiency of linkage disequilibrium studies. Nat Genet (2001) 3.20

Haplotype block structure and its applications to association studies: power and study designs. Am J Hum Genet (2002) 2.88

The impact of genotyping error on haplotype reconstruction and frequency estimation. Eur J Hum Genet (2002) 2.51

Transmission/disequilibrium tests using multiple tightly linked markers. Am J Hum Genet (2000) 2.45

An MDL method for finding haplotype blocks and for estimating the strength of haplotype block boundaries. Pac Symp Biocomput (2003) 2.45

The extent of linkage disequilibrium in four populations with distinct demographic histories. Am J Hum Genet (2000) 2.03

Finding haplotype block boundaries by using the minimum-description-length principle. Am J Hum Genet (2003) 1.71

Monoallelic mutation analysis (MAMA) for identifying germline mutations. Nat Genet (1995) 1.68

Direct measurement of the male recombination fraction in the human beta-globin hot spot. Hum Mol Genet (2002) 1.37

Inference of haplotypes from samples of diploid populations: complexity and algorithms. J Comput Biol (2001) 1.35

Design and sample-size considerations in the detection of linkage disequilibrium with a disease locus. Am J Hum Genet (1994) 1.33

On the use of DNA pooling to estimate haplotype frequencies. Genet Epidemiol (2003) 1.33

Haplotype tagging single nucleotide polymorphisms and association studies. Hum Hered (2003) 1.29

Long-range sequence composition mirrors linkage disequilibrium pattern in a 1.13 Mb region of human chromosome 22. Hum Mol Genet (2001) 0.98

Haplotype block partition with limited resources and applications to human chromosome 21 haplotype data. Am J Hum Genet (2003) 0.96

Articles by these authors

Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature (2012) 18.23

A comparison of phasing algorithms for trios and unrelated individuals. Am J Hum Genet (2006) 12.45

Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms. Am J Hum Genet (2001) 10.48

An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nat Biotechnol (2002) 10.23

Partition-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide polymorphisms. Am J Hum Genet (2002) 10.12

Methylation of histone H3 Lys 4 in coding regions of active genes. Proc Natl Acad Sci U S A (2002) 7.67

Model-based analysis of two-color arrays (MA2C). Genome Biol (2007) 7.28

p53-mediated activation of miRNA34 candidate tumor-suppressor genes. Curr Biol (2007) 6.87

An integrated network of androgen receptor, polycomb, and TMPRSS2-ERG gene fusions in prostate cancer progression. Cancer Cell (2010) 6.76

Whole-genome shotgun assembly and comparison of human genome assemblies. Proc Natl Acad Sci U S A (2004) 6.08

A two-step mechanism for stem cell activation during hair regeneration. Cell Stem Cell (2009) 5.40

A dynamic programming algorithm for haplotype block partitioning. Proc Natl Acad Sci U S A (2002) 5.12

A critical assessment of Mus musculus gene function prediction using integrated genomic evidence. Genome Biol (2008) 4.78

Integrating regulatory motif discovery and genome-wide expression analysis. Proc Natl Acad Sci U S A (2003) 4.74

Inferring domain-domain interactions from protein-protein interactions. Genome Res (2002) 4.36

Bayesian inference of epistatic interactions in case-control studies. Nat Genet (2007) 4.31

The Spo0A regulon of Bacillus subtilis. Mol Microbiol (2003) 4.19

SAINT: probabilistic scoring of affinity purification-mass spectrometry data. Nat Methods (2010) 3.50

The program of gene transcription for a single differentiating cell type during sporulation in Bacillus subtilis. PLoS Biol (2004) 3.33

Genomic sequence is highly predictive of local nucleosome depletion. PLoS Comput Biol (2007) 3.24

Gene density, transcription, and insulators contribute to the partition of the Drosophila genome into physical domains. Mol Cell (2012) 3.08

Haplotype block structure and its applications to association studies: power and study designs. Am J Hum Genet (2002) 2.88

Mapping Gene Ontology to proteins based on protein-protein interaction data. Bioinformatics (2004) 2.63

The sigmaE regulon and the identification of additional sporulation genes in Bacillus subtilis. J Mol Biol (2003) 2.59

Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data. BMC Bioinformatics (2006) 2.53

High-resolution human genome structure by single-molecule analysis. Proc Natl Acad Sci U S A (2010) 2.47

Marine bacterial, archaeal and protistan association networks reveal ecological linkages. ISME J (2011) 2.40

Diploid genome reconstruction of Ciona intestinalis and comparative analysis with Ciona savignyi. Genome Res (2007) 2.37

A single molecule scaffold for the maize genome. PLoS Genet (2009) 2.33

Modeling within-motif dependence for transcription factor binding site predictions. Bioinformatics (2004) 2.29

Taq DNA polymerase slippage mutation rates measured by PCR and quasi-likelihood analysis: (CA/GT)n and (A/T)n microsatellites. Nucleic Acids Res (2003) 2.28

GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data. Bioinformatics (2012) 2.21

De novo cis-regulatory module elicitation for eukaryotic genomes. Proc Natl Acad Sci U S A (2005) 2.20

PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds. Bioinformatics (2009) 2.20

A data-driven clustering method for time course gene expression data. Nucleic Acids Res (2006) 2.20

Distributional regimes for the number of k-word matches between two random sequences. Proc Natl Acad Sci U S A (2002) 2.20

Genomic androgen receptor-occupied regions with different functions, defined by histone acetylation, coregulators and transcriptional capacity. PLoS One (2008) 2.18

Alignment-free sequence comparison (I): statistics and power. J Comput Biol (2009) 2.18

HAPLORE: a program for haplotype reconstruction in general pedigrees without recombination. Bioinformatics (2004) 2.16

Local similarity analysis reveals unique associations among marine bacterioplankton species and environmental factors. Bioinformatics (2006) 2.15

Prediction of protein function using protein-protein interaction data. J Comput Biol (2003) 2.15

Phylogenomics of nonavian reptiles and the structure of the ancestral amniote genome. Proc Natl Acad Sci U S A (2007) 2.09

Global gene expression analysis reveals evidence for decreased lipid biosynthesis and increased innate immunity in uninvolved psoriatic skin. J Invest Dermatol (2009) 2.06

The relationship between microsatellite slippage mutation rate and the number of repeat units. Mol Biol Evol (2003) 2.01

Assessment of the reliability of protein-protein interactions and protein function prediction. Pac Symp Biocomput (2003) 1.89

Alignment of optical maps. J Comput Biol (2006) 1.88

HapBlock: haplotype block partitioning and tag SNP selection software using a set of dynamic programming algorithms. Bioinformatics (2004) 1.86

Periodically Aligned Si Nanopillar Arrays as Efficient Antireflection Layers for Solar Cell Applications. Nanoscale Res Lett (2010) 1.85

Clustering 16S rRNA for OTU prediction: a method of unsupervised Bayesian clustering. Bioinformatics (2011) 1.82

An integrative approach for causal gene identification and gene regulatory pathway inference. Bioinformatics (2006) 1.82

HiCNorm: removing biases in Hi-C data via Poisson regression. Bioinformatics (2012) 1.77

Pathogenesis of emerging severe fever with thrombocytopenia syndrome virus in C57/BL6 mouse model. Proc Natl Acad Sci U S A (2012) 1.71

Decoding human regulatory circuits. Genome Res (2004) 1.70

Insulator function and topological domain border strength scale with architectural protein occupancy. Genome Biol (2014) 1.70

Further understanding human disease genes by comparing with housekeeping genes and other genes. BMC Genomics (2006) 1.68

Information flow analysis of interactome networks. PLoS Comput Biol (2009) 1.68

Maternal influence on blood pressure suggests involvement of mitochondrial DNA in the pathogenesis of hypertension: the Framingham Heart Study. J Hypertens (2007) 1.67

An animal model of MERS produced by infection of rhesus macaques with MERS coronavirus. J Infect Dis (2013) 1.67

An integrated probabilistic model for functional prediction of proteins. J Comput Biol (2004) 1.67

On the detection and refinement of transcription factor binding sites using ChIP-Seq data. Nucleic Acids Res (2010) 1.67

Bayesian inference of spatial organizations of chromosomes. PLoS Comput Biol (2013) 1.66

A Bayesian partition method for detecting pleiotropic and epistatic eQTL modules. PLoS Comput Biol (2010) 1.66

Incorporating genotyping uncertainty in haplotype inference for single-nucleotide polymorphisms. Am J Hum Genet (2004) 1.66

Statistical resynchronization and Bayesian detection of periodically expressed genes. Nucleic Acids Res (2004) 1.64

An algorithm for assembly of ordered restriction maps from single DNA molecules. Proc Natl Acad Sci U S A (2006) 1.63

Diffusion kernel-based logistic regression models for protein function prediction. OMICS (2006) 1.63

Deep sequencing reveals distinct patterns of DNA methylation in prostate cancer. Genome Res (2011) 1.63

Accurate genome relative abundance estimation based on shotgun metagenomic reads. PLoS One (2011) 1.62

Molecular docking and three-dimensional quantitative structure-activity relationship studies on the binding modes of herbicidal 1-(substituted phenoxyacetoxy)alkylphosphonates to the E1 component of pyruvate dehydrogenase. J Agric Food Chem (2007) 1.61

Systematic discovery of functional modules and context-specific functional annotation of human genome. Bioinformatics (2007) 1.61

A graph-based approach to systematically reconstruct human transcriptional regulatory modules. Bioinformatics (2007) 1.59

Clustering analysis of SAGE data using a Poisson approach. Genome Biol (2004) 1.58

Broadly heterogeneous activation of the master regulator for sporulation in Bacillus subtilis. Proc Natl Acad Sci U S A (2010) 1.56

BioOptimizer: a Bayesian scoring function approach to motif discovery. Bioinformatics (2004) 1.55

Direct electrochemistry and electrocatalysis of heme proteins entrapped in agarose hydrogel films in room-temperature ionic liquids. Langmuir (2005) 1.55

CGI: a new approach for prioritizing genes by combining gene expression and protein-protein interaction data. Bioinformatics (2006) 1.55

A boosting approach for motif modeling using ChIP-chip data. Bioinformatics (2005) 1.53

Alignment-free sequence comparison (II): theoretical power of comparison statistics. J Comput Biol (2010) 1.52

An integrated approach to the prediction of domain-domain interactions. BMC Bioinformatics (2006) 1.52