Gapped alignment of protein sequence motifs through Monte Carlo optimization of a hidden Markov model.

PubWeight™: 1.12 | Rank: Top 10%

Article: PMC 538276

Published in BMC Bioinformatics on October 25, 2004

Authors

Andrew F Neuwald (1), Jun S Liu

Author Affiliations

1: Cold Spring Harbor Laboratory, 1 Bungtown Road, P.O. Box 100, Cold Spring Harbor, NY 11724, USA. neuwald@cshl.org

Articles citing this

Discovering sequence motifs with arbitrary insertions and deletions. PLoS Comput Biol (2008) 4.10

Structural and functional diversity of the microbial kinome. PLoS Biol (2007) 3.74

The construction and use of log-odds substitution scores for multiple sequence alignment. PLoS Comput Biol (2010) 1.54

Evolution of allostery in the cyclic nucleotide binding module. Genome Biol (2007) 1.44

Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures. BMC Bioinformatics (2012) 1.29

Sequence and structure signatures of cancer mutation hotspots in protein kinases. PLoS One (2009) 1.14

Rapid detection, classification and accurate alignment of up to a million or more related protein sequences. Bioinformatics (2009) 1.08

Surveying the manifold divergence of an entire protein class for statistical clues to underlying biochemical mechanisms. Stat Appl Genet Mol Biol (2011) 0.93

A typhus group-specific protease defies reductive evolution in rickettsiae. J Bacteriol (2009) 0.89

Evolutionary clues to eukaryotic DNA clamp-loading mechanisms: analysis of the functional constraints imposed on replication factor C AAA+ ATPases. Nucleic Acids Res (2005) 0.89

Galpha Gbetagamma dissociation may be due to retraction of a buried lysine and disruption of an aromatic cluster by a GTP-sensing Arg Trp pair. Protein Sci (2007) 0.87

The glycine brace: a component of Rab, Rho, and Ran GTPases associated with hinge regions of guanine- and phosphate-binding loops. BMC Struct Biol (2009) 0.86

A Bayesian sampler for optimization of protein domain hierarchies. J Comput Biol (2014) 0.84

The charge-dipole pocket: a defining feature of signaling pathway GTPase on/off switches. J Mol Biol (2009) 0.84

Detailed protein sequence alignment based on Spectral Similarity Score (SSS). BMC Bioinformatics (2005) 0.83

Bayesian Top-Down Protein Sequence Alignment with Inferred Position-Specific Gap Penalties. PLoS Comput Biol (2016) 0.77

Hypothesis: bacterial clamp loader ATPase activation through DNA-dependent repositioning of the catalytic base and of a trans-acting catalytic threonine. Nucleic Acids Res (2006) 0.76

Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations. PLoS Comput Biol (2016) 0.75

Articles cited by this

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res (1994) 392.47

MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res (2004) 168.89

Optimization by simulated annealing. Science (1983) 71.02

T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol (2000) 57.88

Profile hidden Markov models. Bioinformatics (1998) 56.04

MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics (2004) 50.89

MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res (2002) 47.62

Position-based sequence weights. J Mol Biol (1994) 24.41

Hidden Markov models for detecting remote protein homologies. Bioinformatics (1998) 21.29

AAA+: A class of chaperone-like ATPases associated with the assembly, operation, and disassembly of protein complexes. Genome Res (1999) 11.30

The alpha/beta hydrolase fold. Protein Eng (1992) 9.94

BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations. Nucleic Acids Res (2001) 6.67

Supercooled liquids and the glass transition. Nature (2001) 6.00

Gibbs motif sampling: detection of bacterial outer membrane protein repeats. Protein Sci (1995) 5.96

AAA+ superfamily ATPases: common structure--diverse function. Genes Cells (2001) 5.83

Evolutionary history and higher order classification of AAA+ ATPases. J Struct Biol (2004) 4.68

Hidden Markov models for sequence analysis: extension and analysis of the basic method. Comput Appl Biosci (1996) 4.58

Alpha/beta hydrolase fold enzymes: the family keeps growing. Curr Opin Struct Biol (1999) 4.12

The AAA team: related ATPases with diverse functions. Trends Cell Biol (1998) 3.48

Generalized affine gap costs for protein sequence alignment. Proteins (1998) 3.42

Structure of the AAA ATPase p97. Mol Cell (2000) 3.37

Extracting protein alignment models from the sequence database. Nucleic Acids Res (1997) 3.17

HEAT repeats associated with condensins, cohesins, and other complexes involved in chromosome-related functions. Genome Res (2000) 2.66

A 200-amino acid ATPase module in search of a basic function. Bioessays (1995) 2.51

Molecular perspectives on p97-VCP: progress in understanding its structure and diverse biological functions. J Struct Biol (2004) 2.35

A highly conserved ATPase protein as a mediator between acidic activation domains and the TATA-binding protein. Nature (1995) 1.83

Evolutionary constraints associated with functional specificity of the CMGC protein kinases MAPK, CDK, GSK, SRPK, DYRK, and CK2alpha. Protein Sci (2004) 1.33

PSI-BLAST searches using hidden Markov models of structural repeats: prediction of an unusual sliding DNA clamp and of beta-propellers in UV-damaged DNA-binding protein. Nucleic Acids Res (2000) 1.29

Ran's C-terminal, basic patch, and nucleotide exchange mechanisms in light of a canonical structure for Rab, Rho, Ras, and Ran GTPases. Genome Res (2003) 1.20

Evolutionary clues to DNA polymerase III beta clamp structural mechanisms. Nucleic Acids Res (2003) 0.91

Articles by these authors

Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature (2012) 18.23

Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms. Am J Hum Genet (2001) 10.48

An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments. Nat Biotechnol (2002) 10.23

Partition-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide polymorphisms. Am J Hum Genet (2002) 10.12

Methylation of histone H3 Lys 4 in coding regions of active genes. Proc Natl Acad Sci U S A (2002) 7.67

Model-based analysis of two-color arrays (MA2C). Genome Biol (2007) 7.28

Integrating regulatory motif discovery and genome-wide expression analysis. Proc Natl Acad Sci U S A (2003) 4.74

Bayesian inference of epistatic interactions in case-control studies. Nat Genet (2007) 4.31

The Spo0A regulon of Bacillus subtilis. Mol Microbiol (2003) 4.19

The program of gene transcription for a single differentiating cell type during sporulation in Bacillus subtilis. PLoS Biol (2004) 3.33

Genomic sequence is highly predictive of local nucleosome depletion. PLoS Comput Biol (2007) 3.24

The sigmaE regulon and the identification of additional sporulation genes in Bacillus subtilis. J Mol Biol (2003) 2.59

Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data. BMC Bioinformatics (2006) 2.53

Modeling within-motif dependence for transcription factor binding site predictions. Bioinformatics (2004) 2.29

GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data. Bioinformatics (2012) 2.21

De novo cis-regulatory module elicitation for eukaryotic genomes. Proc Natl Acad Sci U S A (2005) 2.20

A data-driven clustering method for time course gene expression data. Nucleic Acids Res (2006) 2.20

Phylogenomics of nonavian reptiles and the structure of the ancestral amniote genome. Proc Natl Acad Sci U S A (2007) 2.09

HapBlock: haplotype block partitioning and tag SNP selection software using a set of dynamic programming algorithms. Bioinformatics (2004) 1.86

HiCNorm: removing biases in Hi-C data via Poisson regression. Bioinformatics (2012) 1.77

Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies. Genome Res (2004) 1.74

Decoding human regulatory circuits. Genome Res (2004) 1.70

Information flow analysis of interactome networks. PLoS Comput Biol (2009) 1.68

Bayesian inference of spatial organizations of chromosomes. PLoS Comput Biol (2013) 1.66

A Bayesian partition method for detecting pleiotropic and epistatic eQTL modules. PLoS Comput Biol (2010) 1.66

Incorporating genotyping uncertainty in haplotype inference for single-nucleotide polymorphisms. Am J Hum Genet (2004) 1.66

Statistical resynchronization and Bayesian detection of periodically expressed genes. Nucleic Acids Res (2004) 1.64

Clustering analysis of SAGE data using a Poisson approach. Genome Biol (2004) 1.58

Broadly heterogeneous activation of the master regulator for sporulation in Bacillus subtilis. Proc Natl Acad Sci U S A (2010) 1.56

BioOptimizer: a Bayesian scoring function approach to motif discovery. Bioinformatics (2004) 1.55

A boosting approach for motif modeling using ChIP-chip data. Bioinformatics (2005) 1.53

Cooperation between Polycomb and androgen receptor during oncogenic transformation. Genome Res (2011) 1.47

Gene expression profiling of human breast tissue samples using SAGE-Seq. Genome Res (2010) 1.41

Determination of local statistical significance of patterns in Markov sequences with application to promoter element identification. J Comput Biol (2004) 1.40

Identification of co-regulated genes through Bayesian clustering of predicted regulatory binding sites. Nat Biotechnol (2003) 1.38

Bayesian inference of protein-protein interactions from biological literature. Bioinformatics (2009) 1.35

A suite of web-based programs to search for transcriptional regulatory motifs. Nucleic Acids Res (2004) 1.29

Defining a centromere-like element in Bacillus subtilis by identifying the binding sites for the chromosome-anchoring protein RacA. Mol Cell (2005) 1.29

BALSA: Bayesian algorithm for local sequence alignment. Nucleic Acids Res (2002) 1.27

Genetic variation in XPD, sun exposure, and risk of skin cancer. Cancer Epidemiol Biomarkers Prev (2005) 1.24

RSIR: regularized sliced inverse regression for motif discovery. Bioinformatics (2005) 1.23

MAGeCK enables robust identification of essential genes from genome-scale CRISPR/Cas9 knockout screens. Genome Biol (2014) 1.22

Haplotype information and linkage disequilibrium mapping for single nucleotide polymorphisms. Genome Res (2003) 1.21

Ran's C-terminal, basic patch, and nucleotide exchange mechanisms in light of a canonical structure for Rab, Rho, Ras, and Ran GTPases. Genome Res (2003) 1.20

Bayesian models for pooling microarray studies with multiple sources of replications. BMC Bioinformatics (2006) 1.14

Bayesian biclustering of gene expression data. BMC Genomics (2008) 1.14

Predicting gene expression from sequence: a reexamination. PLoS Comput Biol (2007) 1.08

MM-ChIP enables integrative analysis of cross-platform and between-laboratory ChIP-chip or ChIP-seq data. Genome Biol (2011) 1.07

BEST: binding-site estimation suite of tools. Bioinformatics (2005) 1.06

On side-chain conformational entropy of proteins. PLoS Comput Biol (2006) 1.05

Detecting and understanding combinatorial mutation patterns responsible for HIV drug resistance. Proc Natl Acad Sci U S A (2010) 1.05

Identification of genes and pathways involved in kidney renal clear cell carcinoma. BMC Bioinformatics (2014) 1.05

Combining phylogenetic motif discovery and motif clustering to predict co-regulated genes. Bioinformatics (2005) 1.04

Extracting sequence features to predict protein-DNA interactions: a comparative study. Nucleic Acids Res (2008) 1.04

IMID: integrated molecular interaction database. Bioinformatics (2012) 1.02

Monte Carlo sampling of near-native structures of proteins with applications. Proteins (2007) 1.00

Statistical assessment of the global regulatory role of histone acetylation in Saccharomyces cerevisiae. Genome Biol (2006) 0.99

A coalescence-guided hierarchical Bayesian method for haplotype inference. Am J Hum Genet (2006) 0.98

Correlation pursuit: forward stepwise variable selection for index models. J R Stat Soc Series B Stat Methodol (2012) 0.96

Biopolymer structure simulation and optimization via fragment regrowth Monte Carlo. J Chem Phys (2007) 0.94

Using Poisson mixed-effects model to quantify transcript-level gene expression in RNA-Seq. Bioinformatics (2011) 0.94

Ventral midbrain correlation between genetic variation and expression of the dopamine transporter gene in cocaine-abusing versus non-abusing subjects. Addict Biol (2011) 0.94

A Bayesian method for classification of images from electron micrographs. J Struct Biol (2002) 0.92

Tmod: toolbox of motif discovery. Bioinformatics (2009) 0.91

Selective promoter recognition by chlamydial sigma28 holoenzyme. J Bacteriol (2006) 0.90

Block-based Bayesian epistasis association mapping with application to WTCCC type 1 diabetes data. Ann Appl Stat (2011) 0.89

Molecular classification of liver cirrhosis in a rat model by proteomics and bioinformatics. Proteomics (2004) 0.89

Bayesian models for detecting epistatic interactions from genetic data. Ann Hum Genet (2010) 0.88

Bayesian hierarchical model of protein-binding microarray k-mer data reduces noise and identifies transcription factor subclasses and preferred k-mers. Bioinformatics (2013) 0.84

Context-specific protein network miner--an online system for exploring context-specific protein interaction networks from the literature. PLoS One (2012) 0.83

Systems-level analysis and evolution of the phototransduction network in Drosophila. Proc Natl Acad Sci U S A (2007) 0.83

A database of annotated promoters of genes associated with common respiratory and related diseases. Am J Respir Cell Mol Biol (2012) 0.81

Statistical power of phylo-HMM for evolutionarily conserved element detection. BMC Bioinformatics (2007) 0.81

Rasch models of aphasic performance on syntactic comprehension tests. Cogn Neuropsychol (2010) 0.80

Bayesian functional data clustering for temporal microarray data. Int J Plant Genomics (2008) 0.80

Integrated bio-entity network: a system for biological knowledge discovery. PLoS One (2011) 0.80

Genome-wide analysis of regions similar to promoters of histone genes. BMC Syst Biol (2010) 0.79

Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies. J Am Stat Assoc (2011) 0.78

PIMiner: a web tool for extraction of protein interactions from biomedical literature. Int J Data Min Bioinform (2013) 0.77

Advances in translational bioinformatics facilitate revealing the landscape of complex disease mechanisms. BMC Bioinformatics (2014) 0.76

Transposon identification using profile HMMs. BMC Genomics (2010) 0.75

The emerging genomics and systems biology research lead to systems genomics studies. BMC Genomics (2014) 0.75

Erratum to: A data-adaptive Bayesian regression approach for polygenic risk prediction. Bioinformatics (2022) 0.75

Discovering herbal functional groups of traditional Chinese medicine. Stat Med (2011) 0.75