The rapid generation of mutation data matrices from protein sequences.

PubWeight™: 44.38 | Rank: Top 0.01% | All-Time Top 1000

PMID: 1633570

Published in Comput Appl Biosci on June 01, 1992

Authors

D T Jones (1), W R Taylor, J M Thornton

Author Affiliations

1: Department of Biochemistry and Molecular Biology, University College, London, UK.

Articles citing this

(truncated to the top 100)

MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol (2011) 220.97

Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A (1992) 61.33

MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics (2004) 50.89

MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res (2002) 47.62

MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol (2013) 34.34

MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res (2005) 31.64

PyCogent: a toolkit for making sense from sequence. Genome Biol (2007) 20.64

Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks. Proc Natl Acad Sci U S A (1994) 18.46

Applications and statistics for multiple high-scoring segments in molecular sequences. Proc Natl Acad Sci U S A (1993) 12.10

Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics (1998) 11.25

ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures. Nucleic Acids Res (2005) 10.60

A new method of inference of ancestral nucleotide and amino acid sequences. Genetics (1995) 10.23

ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res (2010) 9.68

Protein database searches using compositionally adjusted substitution matrices. FEBS J (2005) 8.14

Site-specific evolutionary rate inference: taking phylogenetic uncertainty into account. J Mol Evol (2005) 7.00

Local homology recognition and distance measures in linear time using compressed amino acid alphabets. Nucleic Acids Res (2004) 6.42

Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol Biol (2006) 5.74

Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci U S A (1998) 5.54

Transmembrane helices predicted at 95% accuracy. Protein Sci (1995) 5.05

Genome trees constructed using five different approaches suggest new major bacterial clades. BMC Evol Biol (2001) 4.59

From gene trees to organismal phylogeny in prokaryotes: the case of the gamma-Proteobacteria. PLoS Biol (2003) 4.47

Comparison of methods for searching protein sequence databases. Protein Sci (1995) 4.29

Gene loss, protein sequence divergence, gene dispensability, expression level, and interactivity are correlated in eukaryotic evolution. Genome Res (2003) 3.90

The genetic core of the universal ancestor. Genome Res (2003) 3.82

The human phylome. Genome Biol (2007) 3.81

The genome sequence of Blochmannia floridanus: comparative analysis of reduced genomes. Proc Natl Acad Sci U S A (2003) 3.81

Comparative genomics of plant-associated Pseudomonas spp.: insights into diversity and inheritance of traits involved in multitrophic interactions. PLoS Genet (2012) 3.46

Evolutionary origins of genomic repertoires in bacteria. PLoS Biol (2005) 3.43

Discovery of the principal specific transcription factors of Apicomplexa and their implication for the evolution of the AP2-integrase DNA binding domains. Nucleic Acids Res (2005) 3.36

INDELible: a flexible simulator of biological sequence evolution. Mol Biol Evol (2009) 3.29

Cloud computing for comparative genomics. BMC Bioinformatics (2010) 3.25

Variation in the strength of selected codon usage bias among bacteria. Nucleic Acids Res (2005) 3.18

Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC Evol Biol (2007) 3.09

Phylogenetic and structural analysis of centromeric DNA and kinetochore proteins. Genome Biol (2006) 2.99

Scoredist: a simple and robust protein sequence distance estimator. BMC Bioinformatics (2005) 2.99

The root of the universal tree and the origin of eukaryotes based on elongation factor phylogeny. Proc Natl Acad Sci U S A (1996) 2.93

PRINTS--a database of protein motif fingerprints. Nucleic Acids Res (1994) 2.86

Estimating metazoan divergence times with a molecular clock. Proc Natl Acad Sci U S A (2004) 2.80

Plasmodium falciparum erythrocyte membrane protein 1 diversity in seven genomes--divide and conquer. PLoS Comput Biol (2010) 2.80

Solving the protein sequence metric problem. Proc Natl Acad Sci U S A (2005) 2.78

Bio++: a set of C++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics. BMC Bioinformatics (2006) 2.77

Evolution of moth sex pheromones via ancestral genes. Proc Natl Acad Sci U S A (2002) 2.64

Gene expression intensity shapes evolutionary rates of the proteins encoded by the vertebrate genome. Genetics (2004) 2.63

Neuronal transcriptome of Aplysia: neuronal compartments and circuitry. Cell (2006) 2.61

Zeta, a novel class of glutathione transferases in a range of species from plants to humans. Biochem J (1997) 2.60

MBGD: microbial genome database for comparative analysis. Nucleic Acids Res (2003) 2.60

BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol Biol (2010) 2.58

Resistance to alpha/beta interferon is a determinant of West Nile virus replication fitness and virulence. J Virol (2006) 2.57

Molecular genetics and evolution of pheromone biosynthesis in Lepidoptera. Proc Natl Acad Sci U S A (2003) 2.56

Inference of functional regions in proteins by quantification of evolutionary constraints. Proc Natl Acad Sci U S A (2002) 2.54

Diversity and evolution of the green fluorescent protein family. Proc Natl Acad Sci U S A (2002) 2.51

The universal distribution of evolutionary rates of genes and distinct characteristics of eukaryotic genes of different apparent ages. Proc Natl Acad Sci U S A (2009) 2.50

Phylotranscriptomic analysis of the origin and early diversification of land plants. Proc Natl Acad Sci U S A (2014) 2.48

The basic helix-loop-helix protein family: comparative genomics and phylogenetic analysis. Genome Res (2001) 2.47

Phylogenomic analysis reveals bees and wasps (Hymenoptera) at the base of the radiation of Holometabolous insects. Genome Res (2006) 2.42

The compositional adjustment of amino acid substitution matrices. Proc Natl Acad Sci U S A (2003) 2.42

Phylogenetic analysis of 277 human G-protein-coupled receptors as a tool for the prediction of orphan receptor ligands. Genome Biol (2002) 2.40

An alignment confidence score capturing robustness to guide tree uncertainty. Mol Biol Evol (2010) 2.39

Functional characterization of nine Norway Spruce TPS genes and evolution of gymnosperm terpene synthases of the TPS-d subfamily. Plant Physiol (2004) 2.36

Diversity and evolution of coral fluorescent proteins. PLoS One (2008) 2.36

Molecular evolution of herpesviruses: genomic and protein sequence comparisons. J Virol (1994) 2.34

MBGD: a platform for microbial comparative genomics based on the automated construction of orthologous groups. Nucleic Acids Res (2006) 2.30

Origin and evolution of the slime molds (Mycetozoa). Proc Natl Acad Sci U S A (1997) 2.30

Genome sequence of Blochmannia pennsylvanicus indicates parallel evolutionary trends among bacterial mutualists of insects. Genome Res (2005) 2.28

PhyloFacts: an online structural phylogenomic encyclopedia for protein functional and structural classification. Genome Biol (2006) 2.27

Simultaneous Bayesian gene tree reconstruction and reconciliation analysis. Proc Natl Acad Sci U S A (2009) 2.25

Prolonged replication of a type 1 vaccine-derived poliovirus in an immunodeficient patient. J Clin Microbiol (1998) 2.24

Accuracies of ancestral amino acid sequences inferred by the parsimony, likelihood, and distance methods. J Mol Evol (1997) 2.23

Selecton 2007: advanced models for detecting positive and purifying selection using a Bayesian inference approach. Nucleic Acids Res (2007) 2.21

A phylogeny of Caenorhabditis reveals frequent loss of introns during nematode evolution. Genome Res (2004) 2.21

Evidence for a new avian paramyxovirus serotype 10 detected in rockhopper penguins from the Falkland Islands. J Virol (2010) 2.21

Pervasive cryptic epistasis in molecular evolution. PLoS Genet (2010) 2.19

Demonstration of cross-protective vaccine immunity against an emerging pathogenic Ebolavirus Species. PLoS Pathog (2010) 2.18

Toward a comprehensive phylogeny for mammalian and avian herpesviruses. J Virol (2000) 2.17

The diversity of dolichol-linked precursors to Asn-linked glycans likely results from secondary loss of sets of glycosyltransferases. Proc Natl Acad Sci U S A (2005) 2.14

Pancrustacean phylogeny: hexapods are terrestrial crustaceans and maxillopods are not monophyletic. Proc Biol Sci (2005) 2.14

The Chlamydophila abortus genome sequence reveals an array of variable proteins that contribute to interspecies variation. Genome Res (2005) 2.12

Prediction of functional sites by analysis of sequence and structure conservation. Protein Sci (2004) 2.11

Thermodynamic system drift in protein evolution. PLoS Biol (2014) 2.10

Stability-mediated epistasis constrains the evolution of an influenza protein. Elife (2013) 2.07

Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics (1998) 2.05

The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures. Nucleic Acids Res (2008) 2.04

Genome sequence of Oceanobacillus iheyensis isolated from the Iheya Ridge and its unexpected adaptive capabilities to extreme environments. Nucleic Acids Res (2002) 2.03

Databases of homologous gene families for comparative genomics. BMC Bioinformatics (2009) 2.01

Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics. BMC Evol Biol (2007) 1.98

New complete genome sequences of human rhinoviruses shed light on their phylogeny and genomic features. BMC Genomics (2007) 1.97

Phylogenetic analysis of condensation domains in NRPS sheds light on their functional evolution. BMC Evol Biol (2007) 1.97

T-REX: a web server for inferring, validating and visualizing phylogenetic trees and networks. Nucleic Acids Res (2012) 1.97

Wild Mandrillus sphinx are carriers of two types of lentivirus. J Virol (2001) 1.96

Phylogenetic analysis of the human basic helix-loop-helix proteins. Genome Biol (2002) 1.96

A maximum likelihood method for detecting directional evolution in protein sequences and its application to influenza A virus. Mol Biol Evol (2008) 1.96

libcov: a C++ bioinformatic library to manipulate protein structures, sequence alignments and phylogeny. BMC Bioinformatics (2005) 1.95

The conserved plant sterility gene HAP2 functions after attachment of fusogenic membranes in Chlamydomonas and Plasmodium gametes. Genes Dev (2008) 1.95

Two C or not two C: recurrent disruption of Zn-ribbons, gene duplication, lineage-specific gene loss, and horizontal gene transfer in evolution of bacterial ribosomal proteins. Genome Biol (2001) 1.94

Early history of mammals is elucidated with the ENCODE multiple species sequencing data. PLoS Genet (2007) 1.92

An elaborate classification of SNARE proteins sheds light on the conservation of the eukaryotic endomembrane system. Mol Biol Cell (2007) 1.89

Evolutionary anatomies of positions and types of disease-associated and neutral amino acid mutations in the human genome. BMC Genomics (2006) 1.87

Direct evidence for secondary loss of mitochondria in Entamoeba histolytica. Proc Natl Acad Sci U S A (1995) 1.87

ALF--a simulation framework for genome evolution. Mol Biol Evol (2011) 1.86

Genetic complementation in apicomplexan parasites. Proc Natl Acad Sci U S A (2002) 1.84

Articles by these authors

CATH--a hierarchic classification of protein domain structures. Structure (1997) 29.95

AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J Biomol NMR (1996) 29.55

LIGPLOT: a program to generate schematic diagrams of protein-ligand interactions. Protein Eng (1995) 19.22

Satisfying hydrogen bonding potential in proteins. J Mol Biol (1994) 14.79

Principles of protein-protein interactions. Proc Natl Acad Sci U S A (1996) 14.51

Identification of protein sequence homology by consensus template alignment. J Mol Biol (1986) 13.73

A mutation data matrix for transmembrane proteins. FEBS Lett (1994) 13.49

PDBsum: a Web-based database of summaries and analyses of all PDB structures. Trends Biochem Sci (1997) 12.04

Main-chain bond lengths and bond angles in protein structures. J Mol Biol (1993) 9.80

Prediction of protein secondary structure and active sites using the alignment of homologous sequences. J Mol Biol (1987) 9.74

PQS: a protein quaternary structure file server. Trends Biochem Sci (1998) 9.24

Stereochemical quality of protein structure coordinates. Proteins (1992) 9.18

Protein structural topology: Automated analysis and diagrammatic representation. Protein Sci (1999) 8.24

A new approach to protein fold recognition. Nature (1992) 8.17

Comparison of glyceraldehyde-3-phosphate dehydrogenase and 28S-ribosomal RNA gene expression as RNA loading controls for northern blot analysis of cell lines of varying malignant potential. Anal Biochem (1994) 7.81

Amino acid-base interactions: a three-dimensional analysis of protein-DNA interactions at an atomic level. Nucleic Acids Res (2001) 7.27

PROMOTIF--a program to identify and analyze structural motifs in proteins. Protein Sci (1996) 6.76

A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry (1994) 6.64

Evolution of function in protein superfamilies, from a structural perspective. J Mol Biol (2001) 6.63

Structure comparison and structure patterns. J Comput Biol (2000) 5.73

Comparison of functional annotation schemes for genomes. Funct Integr Genomics (2000) 5.17

Protein superfamilies and domain superfolds. Nature (1994) 5.11

A revised set of potentials for beta-turn formation in proteins. Protein Sci (1994) 4.97

Analysis and prediction of the different types of beta-turn in proteins. J Mol Biol (1988) 4.88

Influence of proline residues on protein conformation. J Mol Biol (1991) 4.87

A structural model for the retroviral proteases. Nature (1987) 4.78

Helix geometry in proteins. J Mol Biol (1988) 4.66

Analysis of protein-protein interaction sites using surface patches. J Mol Biol (1997) 4.51

p53 controls both the G2/M and the G1 cell cycle checkpoints and mediates reversible growth arrest in human fibroblasts. Proc Natl Acad Sci U S A (1995) 4.22

Assigning genomic sequences to CATH. Nucleic Acids Res (2000) 4.22

Protein clefts in molecular recognition and function. Protein Sci (1996) 4.20

Disulphide bridges in globular proteins. J Mol Biol (1981) 4.06

A sequence pattern common to T cell epitopes. EMBO J (1988) 4.02

Protein-protein interactions: a review of protein dimer structures. Prog Biophys Mol Biol (1995) 3.98

An overview of the structures of protein-DNA complexes. Genome Biol (2000) 3.90

Identification and classification of protein fold families. Protein Eng (1993) 3.58

TESS: a geometric hashing algorithm for deriving 3D coordinate templates for searching structural databases. Application to enzyme active sites. Protein Sci (1997) 3.50

Protein-DNA interactions: A structural analysis. J Mol Biol (1999) 3.38

Ion-pairs in proteins. J Mol Biol (1983) 3.37

Continuous and discontinuous protein antigenic determinants. Nature (1986) 3.26

SSAP: sequential structure alignment program for protein structure comparison. Methods Enzymol (1996) 3.25

The p53 network. J Biol Chem (1998) 3.19

The classification of amino acid conservation. J Theor Biol (1986) 3.12

Molecular recognition. Conformational analysis of limited proteolytic sites and serine proteinase protein inhibitors. J Mol Biol (1991) 3.04

Protein-protein interfaces: analysis of amino acid conservation in homodimers. Proteins (2001) 3.00

Knowledge-based prediction of protein structures and the design of novel molecules. Nature (1987) 2.99

Discriminating between homodimeric and monomeric proteins in the crystalline state. Proteins (2000) 2.96

Beta-hairpin families in globular proteins. Nature (1985) 2.95

Protein-RNA interactions: a structural analysis. Nucleic Acids Res (2001) 2.92

Prediction of protein-protein interaction sites using patch analysis. J Mol Biol (1997) 2.91

Conformation of beta-hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. J Mol Biol (1989) 2.86

The CATH domain structure database. Methods Biochem Anal (2003) 2.83

NUCPLOT: a program to generate schematic diagrams of protein-nucleic acid interactions. Nucleic Acids Res (1997) 2.81

Childhood chronic illness: prevalence, severity, and impact. Am J Public Health (1992) 2.77

Role of NADH/NADPH oxidase-derived H2O2 in angiotensin II-induced vascular hypertrophy. Hypertension (1998) 2.70

Beta-turns and their distortions: a proposed new nomenclature. Protein Eng (1990) 2.67

The CATH Database provides insights into protein structure/function relationships. Nucleic Acids Res (1999) 2.57

Analysis of main chain torsion angles in proteins: prediction of NMR coupling constants for native and random coil conformations. J Mol Biol (1996) 2.50

Coevolving protein residues: maximum likelihood identification and relationship to structure. J Mol Biol (1999) 2.49

Angiotensin II-induced hypertension accelerates the development of atherosclerosis in apoE-deficient mice. Circulation (2001) 2.48

Structural model of HLA-DR1 restricted T cell antigen recognition. Cell (1988) 2.41

Domain assignment for protein structures using a consensus approach: characterization and analysis. Protein Sci (1998) 2.41

Identification, classification, and analysis of beta-bulges in proteins. Protein Sci (1993) 2.36

Intrinsic phi, psi propensities of amino acids, derived from the coil regions of known structures. Nat Struct Biol (1995) 2.32

Buried waters and internal cavities in monomeric proteins. Protein Sci (1994) 2.30

Three-dimensional structure analysis of PROSITE patterns. J Mol Biol (1999) 2.25

Location of 'continuous' antigenic determinants in the protruding regions of proteins. EMBO J (1986) 2.22

Antibody-antigen interactions: contact analysis and binding site topography. J Mol Biol (1996) 2.22

Recognition of super-secondary structure in proteins. J Mol Biol (1984) 2.21

The evolution and structural anatomy of the small molecule metabolic pathways in Escherichia coli. J Mol Biol (2001) 2.15

Sequences annotated by structure: a tool to facilitate the use of structural information in sequence analysis. Protein Eng (1998) 2.12

p22phox mRNA expression and NADPH oxidase activity are increased in aortas from hypertensive rats. Circ Res (1997) 2.04

A rapid method of protein structure alignment. J Theor Biol (1990) 2.03

HERA--a program to draw schematic diagrams of protein secondary structures. Proteins (1990) 2.00

Conservation helps to identify biologically relevant crystal contacts. J Mol Biol (2001) 1.98

Derivation of 3D coordinate templates for searching structural databases: application to Ser-His-Asp catalytic triads in the serine proteinases and lipases. Protein Sci (1996) 1.97

Distributions of water around amino acid residues in proteins. J Mol Biol (1988) 1.90

Modelling protein unfolding: hen egg-white lysozyme. Protein Eng (1997) 1.88

Prediction of super-secondary structure in proteins. Nature (1983) 1.84

Comparison of conformational characteristics in structurally similar protein pairs. Protein Sci (1993) 1.84

Amino and carboxy-terminal regions in globular proteins. J Mol Biol (1983) 1.82

Pi-pi interactions: the geometry and energetics of phenylalanine-phenylalanine interactions in proteins. J Mol Biol (1991) 1.81

SIRIUS. An automated method for the analysis of the preferred packing arrangements between protein groups. J Mol Biol (1990) 1.80

Amino/aromatic interactions in proteins: is the evidence stacked against hydrogen bonding? J Mol Biol (1994) 1.77

Structural families in loops of homologous proteins: automatic classification, modelling and application to antibodies. J Mol Biol (1996) 1.71

Validation of protein models derived from experiment. Curr Opin Struct Biol (1998) 1.70

Protein folds and functions. Structure (1998) 1.69

The CATH Dictionary of Homologous Superfamilies (DHS): a consensus approach for identifying distant structural homologues. Protein Eng (2000) 1.67

On the conformation of proteins: the handedness of the connection between parallel beta-strands. J Mol Biol (1977) 1.66

Sudden death among Southeast Asian refugees. An unexplained nocturnal phenomenon. JAMA (1983) 1.65

Global fold determination from a small number of distance restraints. J Mol Biol (1995) 1.64