Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures.

PubWeight™: 1.29 | Rank: Top 10%

Article: PMC 3599474

Published in BMC Bioinformatics on June 22, 2012

Authors

Andrew F Neuwald (1), Christopher J Lanczycki, Aron Marchler-Bauer

Author Affiliations

1: Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, BioPark II, Room 617, 801 West Baltimore St, Baltimore, MD 21201, USA. aneuwald@som.umaryland.edu

Articles cited by this article

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

Profile hidden Markov models. Bioinformatics (1998) 56.04

Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics (2006) 43.68

OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res (2003) 33.03

The Pfam protein families database. Nucleic Acids Res (2007) 30.53

CDD: a Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res (2010) 19.07

Cn3D: sequence and structure views for Entrez. Trends Biochem Sci (2000) 16.98

Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J Mol Biol (2001) 16.47

CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Res (2003) 14.38

The TIGRFAMs database of protein families. Nucleic Acids Res (2003) 13.59

SMART 6: recent updates and new developments. Nucleic Acids Res (2008) 9.80

An evolutionary trace method defines binding surfaces common to protein families. J Mol Biol (1996) 9.31

Evolutionarily conserved pathways of energetic connectivity in protein families. Science (1999) 9.12

Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics (2002) 5.46

RIO: analyzing proteomes by automated phylogenomics using resampled inference of orthologs. BMC Bioinformatics (2002) 4.49

Cn3D: a new generation of three-dimensional molecular structure viewer. Trends Biochem Sci (1997) 4.39

Automated ortholog inference from phylogenetic trees and calculation of orthology reliability. Bioinformatics (2002) 3.79

Protein sectors: evolutionary units of three-dimensional structure. Cell (2009) 3.76

Analysis and prediction of functional sub-types from protein sequence alignments. J Mol Biol (2000) 3.44

A method to predict functional residues in proteins. Nat Struct Biol (1995) 3.33

Extracting protein alignment models from the sequence database. Nucleic Acids Res (1997) 3.17

A family of evolution-entropy hybrid methods for ranking protein residues by importance. J Mol Biol (2004) 3.00

The hallmark of AGC kinase functional divergence is its C-terminal tail, a cis-acting regulatory module. Proc Natl Acad Sci U S A (2007) 2.65

Did protein kinase regulatory mechanisms evolve through elaboration of a simple structural component? J Mol Biol (2005) 2.28

DIVERGE: phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics (2002) 2.18

Detecting patterns in protein sequences. J Mol Biol (1994) 2.16

Using orthologous and paralogous proteins to identify specificity-determining residues in bacterial transcription factors. J Mol Biol (2002) 2.00

Automated selection of positions determining functional specificity of proteins by comparative analysis of orthologous groups in protein families. Protein Sci (2004) 1.91

Automated protein subfamily identification and classification. PLoS Comput Biol (2007) 1.89

Clustering of proximal sequence space for the identification of protein families. Bioinformatics (2002) 1.87

Characterization and prediction of residues determining protein functional specificity. Bioinformatics (2008) 1.80

Prediction of protein functional residues from sequence by probability density estimation. Bioinformatics (2008) 1.77

Secator: a program for inferring protein subfamilies from phylogenetic trees. Mol Biol Evol (2001) 1.57

Multi-RELIEF: a method to recognize specificity determining residues from multiple sequence alignments using a Machine-Learning approach for feature weighting. Bioinformatics (2007) 1.51

Bayesian search of functionally divergent protein subgroups and their function specific residues. Bioinformatics (2006) 1.51

INTREPID--INformation-theoretic TREe traversal for Protein functional site IDentification. Bioinformatics (2008) 1.42

Sequence comparison by sequence harmony identifies subtype-specific functional sites. Nucleic Acids Res (2006) 1.40

Functional specificity lies within the properties and evolutionary changes of amino acids. J Mol Biol (2007) 1.23

Ran's C-terminal, basic patch, and nucleotide exchange mechanisms in light of a canonical structure for Rab, Rho, Ras, and Ran GTPases. Genome Res (2003) 1.20

Identification of functional residues and secondary structure from protein multiple sequence alignment. Methods Enzymol (1996) 1.18

Gapped alignment of protein sequence motifs through Monte Carlo optimization of a hidden Markov model. BMC Bioinformatics (2004) 1.12

Sequence harmony: detecting functional specificity from alignments. Nucleic Acids Res (2007) 1.10

Genome-scale phylogenetic function annotation of large and diverse protein families. Genome Res (2011) 1.10

Rapid detection, classification and accurate alignment of up to a million or more related protein sequences. Bioinformatics (2009) 1.08

TreeDet: a web server to explore sequence space. Nucleic Acids Res (2006) 1.05

The CHAIN program: forging evolutionary links to underlying mechanisms. Trends Biochem Sci (2007) 1.02

Ensemble approach to predict specificity determinants: benchmarking and validation. BMC Bioinformatics (2009) 0.96

Bayesian shadows of molecular mechanisms cast in the light of evolution. Trends Biochem Sci (2006) 0.93

Surveying the manifold divergence of an entire protein class for statistical clues to underlying biochemical mechanisms. Stat Appl Genet Mol Biol (2011) 0.93

The glycine brace: a component of Rab, Rho, and Ran GTPases associated with hinge regions of guanine- and phosphate-binding loops. BMC Struct Biol (2009) 0.86

The charge-dipole pocket: a defining feature of signaling pathway GTPase on/off switches. J Mol Biol (2009) 0.84

Bayesian classification of residues associated with protein functional divergence: Arf and Arf-like GTPases. Biol Direct (2010) 0.81

Bayesian mixture modeling using a hybrid sampler with application to protein subfamily identification. Biostatistics (2009) 0.79

Articles by these authors

CDD: a Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res (2010) 19.07

CDD: a Conserved Domain Database for protein classification. Nucleic Acids Res (2005) 18.85

CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Res (2003) 14.38

Database resources of the National Center for Biotechnology Information. Nucleic Acids Res (2009) 12.51

CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res (2006) 11.41

Database resources of the National Center for Biotechnology Information. Nucleic Acids Res (2010) 10.97

CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res (2008) 10.73

Database resources of the National Center for Biotechnology Information. Nucleic Acids Res (2011) 8.62

MMDB: annotating protein sequences with Entrez's 3D-structure database. Nucleic Acids Res (2006) 7.57

CDD: conserved domains and protein three-dimensional structure. Nucleic Acids Res (2012) 6.39

MMDB: Entrez's 3D-structure database. Nucleic Acids Res (2003) 6.14

The NCBI BioSystems database. Nucleic Acids Res (2009) 4.80

MMDB: Entrez's 3D-structure database. Nucleic Acids Res (2002) 4.42

Protein subfamily assignment using the Conserved Domain Database. BMC Res Notes (2008) 3.62

MMDB: 3D structures and macromolecular interactions. Nucleic Acids Res (2011) 2.06

Inferred Biomolecular Interaction Server--a web server to analyze and predict protein interacting partners and binding sites. Nucleic Acids Res (2009) 1.85

Annotation of functional sites with the Conserved Domain Database. Database (Oxford) (2012) 1.81

MMDB and VAST+: tracking structural similarities between macromolecular complexes. Nucleic Acids Res (2013) 1.79

Identification of a subunit of a novel Kleisin-beta/SMC complex as a potential substrate of protein phosphatase 2A. Curr Biol (2003) 1.41

IBIS (Inferred Biomolecular Interaction Server) reports, predicts and integrates multiple types of conserved interactions for proteins. Nucleic Acids Res (2011) 1.16

Refining multiple sequence alignments with conserved core regions. Nucleic Acids Res (2006) 1.15

Analysis and prediction of functionally important sites in proteins. Protein Sci (2007) 1.14

SPEER-SERVER: a web server for prediction of protein specificity determining sites. Nucleic Acids Res (2012) 0.88

Knowledge-based expert systems and a proof-of-concept case study for multiple sequence alignment construction and analysis. Brief Bioinform (2008) 0.80

CORAL: aligning conserved core regions across domain families. Bioinformatics (2009) 0.79

AlexSys: a knowledge-based expert system for multiple sequence alignment construction and analysis. Nucleic Acids Res (2010) 0.79

Automatic annotation of experimentally derived, evolutionarily conserved post-translational modifications onto multiple genomes. Database (Oxford) (2011) 0.78

State of the art: refinement of multiple sequence alignments. BMC Bioinformatics (2006) 0.77

Prediction of functionally important sites from protein sequences using sparse kernel least squares classifiers. Biochem Biophys Res Commun (2009) 0.76