Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data.

PubWeight™: 5.20 | Rank: Top 1%

PMID: 17189563

Published in Biostatistics on December 22, 2006

Authors

Benilton Carvalho (1), Henrik Bengtsson, Terence P Speed, Rafael A Irizarry

Author Affiliations

1: Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA.

Articles citing this

(truncated to the top 100)

A genotype calling algorithm for the Illumina BeadArray platform. Bioinformatics (2007) 8.03

Insertional mutagenesis combined with acquired somatic mutations causes leukemogenesis following gene therapy of SCID-X1 patients. J Clin Invest (2008) 7.05

Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genome-wide association studies. Am J Hum Genet (2009) 5.65

PICNIC: an algorithm to predict absolute allelic copy number variation with microarray cancer data. Biostatistics (2009) 4.31

A framework for oligonucleotide microarray preprocessing. Bioinformatics (2010) 3.97

Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics (2012) 3.74

A single-array preprocessing method for estimating full-resolution raw copy numbers from all Affymetrix genotyping arrays including GenomeWideSNP 5 & 6. Bioinformatics (2009) 2.90

Normalization of Illumina Infinium whole-genome SNP data improves copy number estimates and allelic intensity ratios. BMC Bioinformatics (2008) 2.66

Assessing batch effects of genotype calling algorithm BRLMM for the Affymetrix GeneChip Human Mapping 500 K array set using 270 HapMap samples. BMC Bioinformatics (2008) 1.98

Large-scale data integration framework provides a comprehensive view on glioblastoma multiforme. Genome Med (2010) 1.94

R/Bioconductor software for Illumina's Infinium whole-genome genotyping BeadChips. Bioinformatics (2009) 1.83

Hidden Markov models for the assessment of chromosomal alterations using high-throughput SNP arrays. Ann Appl Stat (2008) 1.75

Quantifying uncertainty in genotype calls. Bioinformatics (2009) 1.74

TumorBoost: normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays. BMC Bioinformatics (2010) 1.70

An analytical pipeline for genomic representations used for cytosine methylation studies. Bioinformatics (2008) 1.64

Validation and extension of an empirical Bayes method for SNP calling on Affymetrix microarrays. Genome Biol (2008) 1.61

Genome-wide association and meta-analysis in populations from Starr County, Texas, and Mexico City identify type 2 diabetes susceptibility loci and enrichment for expression quantitative trait loci in top signals. Diabetologia (2011) 1.59

The pitfalls of platform comparison: DNA copy number array technologies assessed. BMC Genomics (2009) 1.55

A focal epilepsy and intellectual disability syndrome is due to a mutation in TBC1D24. Am J Hum Genet (2010) 1.48

Comparing genotyping algorithms for Illumina's Infinium whole-genome SNP BeadChips. BMC Bioinformatics (2011) 1.45

Using the R Package crlmm for Genotyping and Copy Number Estimation. J Stat Softw (2011) 1.41

An oligo-based microarray offers novel transcriptomic approaches for the analysis of pathogen resistance and fruit quality traits in melon (Cucumis melo L.). BMC Genomics (2009) 1.39

Discovery of novel variants in genotyping arrays improves genotype retention and reduces ascertainment bias. BMC Genomics (2012) 1.39

Detection of genome-wide polymorphisms in the AT-rich Plasmodium falciparum genome using a high-density microarray. BMC Genomics (2008) 1.38

Spoiling the whole bunch: quality control aimed at preserving the integrity of high-throughput genotyping. Am J Hum Genet (2010) 1.35

Genetic architecture of regulatory variation in Arabidopsis thaliana. Genome Res (2011) 1.32

Estimating genome-wide copy number using allele-specific mixture models. J Comput Biol (2008) 1.23

ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations. Bioinformatics (2010) 1.16

Regulation of xylose metabolism in recombinant Saccharomyces cerevisiae. Microb Cell Fact (2008) 1.15

Genome-wide association study of d-amphetamine response in healthy volunteers identifies putative associations, including cadherin 13 (CDH13). PLoS One (2012) 1.14

Recovering unused information in genome-wide association studies: the benefit of analyzing SNPs out of Hardy-Weinberg equilibrium. Eur J Hum Genet (2009) 1.12

Hypothesis-driven candidate gene association studies: practical design and analytical considerations. Am J Epidemiol (2009) 1.12

Empirical Bayes analysis of single nucleotide polymorphisms. BMC Bioinformatics (2008) 1.11

An integrative multi-dimensional genetic and epigenetic strategy to identify aberrant genes and pathways in cancer. BMC Syst Biol (2010) 1.10

M(3): an improved SNP calling algorithm for Illumina BeadArray data. Bioinformatics (2011) 1.09

Genome-wide association of serum uric acid concentration: replication of sequence variants in an island population of the Adriatic coast of Croatia. Ann Hum Genet (2012) 1.08

COLT-Cancer: functional genetic screening resource for essential genes in human cancer cell lines. Nucleic Acids Res (2011) 1.08

DeMix: deconvolution for mixed cancer transcriptomes using raw measured data. Bioinformatics (2013) 1.08

CopywriteR: DNA copy number detection from off-target sequence data. Genome Biol (2015) 1.05

Identification of rare DNA variants in mitochondrial disorders with improved array-based sequencing. Nucleic Acids Res (2010) 1.03

Detecting copy number status and uncovering subclonal markers in heterogeneous tumor biopsies. BMC Genomics (2011) 1.02

Statistical issues in the analysis of DNA Copy Number Variations. Int J Comput Biol Drug Des (2008) 1.02

The Annotation, Mapping, Expression and Network (AMEN) suite of tools for molecular systems biology. BMC Bioinformatics (2008) 1.01

Statistics and bioinformatics in nutritional sciences: analysis of complex data in the era of systems biology. J Nutr Biochem (2010) 1.00

Biased inheritance of the protein PatN frees vegetative cells to initiate patterned heterocyst differentiation. Proc Natl Acad Sci U S A (2012) 0.99

Improved detection of global copy number variation using high density, non-polymorphic oligonucleotide probes. BMC Genet (2008) 0.96

Assessing the Causal Relationship of Maternal Height on Birth Size and Gestational Age at Birth: A Mendelian Randomization Analysis. PLoS Med (2015) 0.95

Brain transcriptomic response of threespine sticklebacks to cues of a predator. Brain Behav Evol (2011) 0.94

Co-regulatory expression quantitative trait loci mapping: method and application to endometrial cancer. BMC Med Genomics (2011) 0.94

A computational framework for the analysis of peptide microarray antibody binding data with application to HIV vaccine profiling. J Immunol Methods (2013) 0.93

Smarter clustering methods for SNP genotype calling. Bioinformatics (2008) 0.93

Gene profiling and signaling pathways of Candida albicans keratitis. Mol Vis (2008) 0.92

A microarray platform and novel SNP calling algorithm to evaluate Plasmodium falciparum field samples of low DNA quantity. BMC Genomics (2014) 0.90

Genome-wide association study in bipolar patients stratified by co-morbidity. PLoS One (2011) 0.89

Finding missing heritability in less significant Loci and allelic heterogeneity: genetic variation in human height. PLoS One (2012) 0.89

Genotyping and inflated type I error rate in genome-wide association case/control studies. BMC Bioinformatics (2009) 0.89

Redundancy in genotyping arrays. PLoS One (2007) 0.89

Performance assessment of copy number microarray platforms using a spike-in experiment. Bioinformatics (2011) 0.88

Effects of BRCA2 cis-regulation in normal breast and cancer risk amongst BRCA2 mutation carriers. Breast Cancer Res (2012) 0.88

Robust methods for population stratification in genome wide association studies. BMC Bioinformatics (2013) 0.87

A model of higher accuracy for the individual haplotyping problem based on weighted SNP fragments and genotype with errors. Bioinformatics (2008) 0.86

Characterization of the DNA methylome and its interindividual variation in human peripheral blood monocytes. Epigenomics (2013) 0.86

ASPN and GJB2 Are Implicated in the Mechanisms of Invasion of Ductal Breast Carcinomas. J Cancer (2012) 0.86

Sleep is not just for the brain: transcriptional responses to sleep in peripheral tissues. BMC Genomics (2013) 0.86

Hybridization modeling of oligonucleotide SNP arrays for accurate DNA copy number estimation. Nucleic Acids Res (2009) 0.85

Non-random error in genotype calling procedures: implications for family-based and case-control genome-wide association studies. Am J Med Genet B Neuropsychiatr Genet (2008) 0.85

Insight into Genotype-Phenotype Associations through eQTL Mapping in Multiple Cell Types in Health and Immune-Mediated Disease. PLoS Genet (2016) 0.85

Reconstructing DNA copy number by joint segmentation of multiple sequences. BMC Bioinformatics (2012) 0.84

PAIR: paired allelic log-intensity-ratio-based normalization method for SNP-CGH arrays. Bioinformatics (2012) 0.83

Copy number polymorphisms near SLC2A9 are associated with serum uric acid concentrations. BMC Genet (2014) 0.83

A review of software for microarray genotyping. Hum Genomics (2011) 0.83

Hardy-Weinberg analysis of a large set of published association studies reveals genotyping error and a deficit of heterozygotes across multiple loci. Hum Genomics (2008) 0.83

A lincRNA connected to cell mortality and epigenetically-silenced in most common human cancers. Epigenetics (2015) 0.82

Large-scale analysis of differential gene expression in coffee genotypes resistant and susceptible to leaf miner-toward the identification of candidate genes for marker assisted-selection. BMC Genomics (2014) 0.81

Integrative genomics and transcriptomics analysis of human embryonic and induced pluripotent stem cells. BioData Min (2014) 0.81

Evaluating the influence of quality control decisions and software algorithms on SNP calling for the affymetrix 6.0 SNP array platform. Hum Hered (2011) 0.81

The GA and the GWAS: using genetic algorithms to search for multilocus associations. IEEE/ACM Trans Comput Biol Bioinform (2011) 0.80

Systems pharmacology of adiposity reveals inhibition of EP300 as a common therapeutic mechanism of caloric restriction and resveratrol for obesity. Front Pharmacol (2015) 0.80

Extent of height variability explained by known height-associated genetic variants in an isolated population of the Adriatic coast of Croatia. PLoS One (2011) 0.80

Mismatch and G-stack modulated probe signals on SNP microarrays. PLoS One (2009) 0.80

Assessing the utility of whole-genome amplified serum DNA for array-based high throughput genotyping. BMC Genet (2009) 0.80

R classes and methods for SNP array data. Methods Mol Biol (2010) 0.79

An imputation approach for oligonucleotide microarrays. PLoS One (2013) 0.79

Identifying novel hypoxia-associated markers of chemoresistance in ovarian cancer. BMC Cancer (2015) 0.78

EP300 Protects from Light-Induced Retinopathy in Zebrafish. Front Pharmacol (2016) 0.78

Acute Viral Respiratory Infection Rapidly Induces a CD8+ T Cell Exhaustion-like Phenotype. J Immunol (2015) 0.77

Data integration workflow for search of disease driving genes and genetic variants. PLoS One (2011) 0.77

Downregulation of GSTK1 Is a Common Mechanism Underlying Hypertrophic Cardiomyopathy. Front Pharmacol (2016) 0.77

SGDI: system for genomic data integration. Pac Symp Biocomput (2008) 0.77

Combined Analysis of SNP Array Data Identifies Novel CNV Candidates and Pathways in Ependymoma and Mesothelioma. Biomed Res Int (2015) 0.76

Genome-wide association analysis identifies genetic variations in subjects with myalgic encephalomyelitis/chronic fatigue syndrome. Transl Psychiatry (2016) 0.76

Human Lacrimal Gland Gene Expression. PLoS One (2017) 0.75

DNA methylation signature (SAM40) identifies subgroups of the Luminal A breast cancer samples with distinct survival. Oncotarget (2016) 0.75

A new model calling procedure for Illumina BeadArray data. BMC Genet (2016) 0.75

A ketogenic diet rescues hippocampal memory defects in a mouse model of Kabuki syndrome. Proc Natl Acad Sci U S A (2016) 0.75

BCRgt: a Bayesian cluster regression-based genotyping algorithm for the samples with copy number alterations. BMC Bioinformatics (2014) 0.75

M(3)-S: a genotype calling method incorporating information from samples with known genotypes. BMC Bioinformatics (2015) 0.75

Reliable single chip genotyping with semi-parametric log-concave mixtures. PLoS One (2012) 0.75

KRLMM: an adaptive genotype calling method for common and low frequency variants. BMC Bioinformatics (2014) 0.75

Quality Visualization of Microarray Datasets Using Circos. Microarrays (Basel) (2012) 0.75

Articles by these authors

Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics (2003) 100.88

Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res (2003) 52.74

Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res (2002) 40.03

Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell (2010) 39.09

affy--analysis of Affymetrix GeneChip data at the probe level. Bioinformatics (2004) 32.08

GOstat: find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics (2004) 30.14

International network of cancer genome projects. Nature (2010) 20.35

Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma. Cancer Cell (2010) 16.12

Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet (2010) 11.82

A genotype calling algorithm for affymetrix SNP arrays. Bioinformatics (2005) 10.43

Increased methylation variation in epigenetic domains across cancer types. Nat Genet (2011) 8.92

Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature (2007) 7.91

A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB. BMC Bioinformatics (2006) 7.53

The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nat Biotechnol (2010) 7.08

Transcription factors bind thousands of active and inactive regions in the Drosophila blastoderm. PLoS Biol (2008) 6.30

Intra-individual change over time in DNA methylation with familial clustering. JAMA (2008) 6.17

Comprehensive methylome map of lineage commitment from haematopoietic progenitors. Nature (2010) 6.06

A benchmark for Affymetrix GeneChip expression measures. Bioinformatics (2004) 5.99

Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics (2014) 5.93

Stochastic models inspired by hybridization theory for short oligonucleotide arrays. J Comput Biol (2005) 5.63

Frozen robust multiarray analysis (fRMA). Biostatistics (2010) 5.57

Large histone H3 lysine 9 dimethylated chromatin blocks distinguish differentiated from embryonic stem cells. Nat Genet (2009) 5.52

Travelling waves in the occurrence of dengue haemorrhagic fever in Thailand. Nature (2004) 5.05

Summarizing and correcting the GC content bias in high-throughput sequencing. Nucleic Acids Res (2012) 4.94

Preprocessing of oligonucleotide array data. Nat Biotechnol (2004) 4.70

Subtype and pathway specific responses to anticancer compounds in breast cancer. Proc Natl Acad Sci U S A (2011) 4.55

Evolution in health and medicine Sackler colloquium: Stochastic epigenetic variation as a driving force of development, evolutionary adaptation, and disease. Proc Natl Acad Sci U S A (2009) 4.40

Expression profiling in primates reveals a rapid evolution of human transcription factors. Nature (2006) 4.26

The External RNA Controls Consortium: a progress report. Nat Methods (2005) 4.24

A gene expression bar code for microarray data. Nat Methods (2007) 4.20

A framework for oligonucleotide microarray preprocessing. Bioinformatics (2010) 3.97

Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol (2014) 3.89

Personalized epigenomic signatures that are stable over time and covary with body mass index. Sci Transl Med (2010) 3.81

Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics (2012) 3.74

Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies. Int J Epidemiol (2012) 3.62

Lineage-specific expansion of proteins exported to erythrocytes in malaria parasites. Genome Biol (2006) 3.45

Nucleocytosolic acetyl-coenzyme a synthetase is required for histone acetylation and global transcription. Mol Cell (2006) 3.22

Sequencing technology does not eliminate biological variability. Nat Biotechnol (2011) 3.20

BSmooth: from whole genome bisulfite sequencing reads to differentially methylated regions. Genome Biol (2012) 3.13

Spotted long oligonucleotide arrays for human gene expression analysis. Genome Res (2003) 2.95

A single-array preprocessing method for estimating full-resolution raw copy numbers from all Affymetrix genotyping arrays including GenomeWideSNP 5 & 6. Bioinformatics (2009) 2.90

PIK3CA mutations associated with gene signature of low mTORC1 signaling and better outcomes in estrogen receptor-positive breast cancer. Proc Natl Acad Sci U S A (2010) 2.79

Using control genes to correct for unwanted variation in microarray data. Biostatistics (2011) 2.70

Model-based quality assessment and base-calling for second-generation sequencing data. Biometrics (2010) 2.70

Functional genomic analysis of oligodendrocyte differentiation. J Neurosci (2006) 2.64

The Gene Expression Barcode: leveraging public data repositories to begin cataloging the human and murine transcriptomes. Nucleic Acids Res (2011) 2.42

Feature-level exploration of a published Affymetrix GeneChip control dataset. Genome Biol (2006) 2.39

Consolidation of the cancer genome into domains of repressive chromatin by long-range epigenetic silencing (LRES) reduces transcriptional plasticity. Nat Cell Biol (2010) 2.38

SNP-specific array-based allele-specific expression analysis. Genome Res (2008) 2.30

Mining a tandem mass spectrometry database to determine the trends and global factors influencing peptide fragmentation. Anal Chem (2003) 2.28

Statistical modeling of sequencing errors in SAGE libraries. Bioinformatics (2004) 2.25

Recommendations for the design and analysis of epigenome-wide association studies. Nat Methods (2013) 2.19

Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development. Genome Biol (2011) 2.18

Reconstructing an ancestral mammalian immune supercomplex from a marsupial major histocompatibility complex. PLoS Biol (2006) 2.17

Integrative analysis of RUNX1 downstream pathways and target genes. BMC Genomics (2008) 2.14

Accurate genome-scale percentage DNA methylation estimates from microarray data. Biostatistics (2010) 2.11

DNA hypomethylation arises later in prostate cancer progression than CpG island hypermethylation and contributes to metastatic tumor heterogeneity. Cancer Res (2008) 2.08

Consolidated strategy for the analysis of microarray spike-in data. Nucleic Acids Res (2008) 2.04

Redefining CpG islands using hidden Markov models. Biostatistics (2010) 2.03

Transgenic expression of Cre recombinase from the tyrosine hydroxylase locus. Genesis (2004) 1.99

Network inference using informative priors. Proc Natl Acad Sci U S A (2008) 1.96

Identification and stoichiometry of glycosylphosphatidylinositol-anchored membrane proteins of the human malaria parasite Plasmodium falciparum. Mol Cell Proteomics (2006) 1.86

A summarization approach for Affymetrix GeneChip data using a reference training set from a large, biologically diverse database. BMC Bioinformatics (2006) 1.85

R/Bioconductor software for Illumina's Infinium whole-genome genotyping BeadChips. Bioinformatics (2009) 1.83

Global synthetic-lethality analysis and yeast functional profiling. Trends Genet (2005) 1.82

Gene set enrichment analysis made simple. Stat Methods Med Res (2009) 1.79

Regulation of apicomplexan actin-based motility. Nat Rev Microbiol (2006) 1.78

Orthologous gene-expression profiling in multi-species models: search for candidate genes. Genome Biol (2004) 1.77

Evaluation of affinity-based genome-wide DNA methylation data: effects of CpG density, amplification bias, and copy number variation. Genome Res (2010) 1.77

A multilevel model to address batch effects in copy number estimation using SNP arrays. Biostatistics (2010) 1.76

Overcoming bias and systematic errors in next generation sequencing data. Genome Med (2010) 1.75

Quantifying uncertainty in genotype calls. Bioinformatics (2009) 1.74

Significance analysis and statistical dissection of variably methylated regions. Biostatistics (2011) 1.74

A quartet of PIF bHLH factors provides a transcriptionally centered signaling hub that regulates seedling morphogenesis through differential expression-patterning of shared target genes in Arabidopsis. PLoS Genet (2013) 1.73

Sir2 paralogues cooperate to regulate virulence genes and antigenic variation in Plasmodium falciparum. PLoS Biol (2009) 1.72

A subset of Plasmodium falciparum SERA genes are expressed and appear to play an important role in the erythrocytic cycle. J Biol Chem (2002) 1.71

TumorBoost: normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays. BMC Bioinformatics (2010) 1.70

A Tetrahymena Piwi bound to mature tRNA 3' fragments activates the exonuclease Xrn2 for RNA processing in the nucleus. Mol Cell (2012) 1.65

DNA methylation alterations exhibit intraindividual stability and interindividual heterogeneity in prostate cancer metastases. Sci Transl Med (2013) 1.63

A new rodent model to assess blood stage immunity to the Plasmodium falciparum antigen merozoite surface protein 1(19) reveals a protective role for invasion inhibitory antibodies. J Exp Med (2003) 1.61

Validation and extension of an empirical Bayes method for SNP calling on Affymetrix microarrays. Genome Biol (2008) 1.61

Identification of candidate growth promoting genes in ovarian cancer through integrated copy number and expression analysis. PLoS One (2010) 1.60

Effects of everyday life events on glucose, insulin, and glucagon dynamics in continuous subcutaneous insulin infusion-treated type 1 diabetes: collection of clinical data for glucose modeling. Diabetes Technol Ther (2011) 1.55

Gene expression analysis of ischemic and nonischemic cardiomyopathy: shared and distinct genes in the development of heart failure. Physiol Genomics (2005) 1.54

A single-sample method for normalizing and combining full-resolution copy numbers from multiple platforms, labs and analysis methods. Bioinformatics (2009) 1.53

A comparison of Affymetrix gene expression arrays. BMC Bioinformatics (2007) 1.53

Reticulocyte and erythrocyte binding-like proteins function cooperatively in invasion of human erythrocytes by malaria parasites. Infect Immun (2010) 1.52

Identification of SOX3 as an XX male sex reversal gene in mice and humans. J Clin Invest (2010) 1.50

Identification of a gene expression profile that differentiates between ischemic and nonischemic cardiomyopathy. Circulation (2004) 1.47

Response to Shields: 'MIAME, we have a problem'. Trends Genet (2006) 1.46

Parent-specific copy number in paired tumor-normal studies using circular binary segmentation. Bioinformatics (2011) 1.45

Comparing genotyping algorithms for Illumina's Infinium whole-genome SNP BeadChips. BMC Bioinformatics (2011) 1.45

The comparative roles of suppressor of cytokine signaling-1 and -3 in the inhibition and desensitization of cytokine signaling. J Biol Chem (2006) 1.45

High-quality DNA sequence capture of 524 disease candidate genes. Proc Natl Acad Sci U S A (2011) 1.41

Using the R Package crlmm for Genotyping and Copy Number Estimation. J Stat Softw (2011) 1.41

Processing of Agilent microRNA array data. BMC Res Notes (2010) 1.39

In vitro identification and in silico utilization of interspecies sequence similarities using GeneChip technology. BMC Genomics (2005) 1.39