Homologous over-extension: a challenge for iterative similarity searches.

PubWeight™: 1.70 | Rank: Top 3%

Article: PMC 2853128

Published in Nucleic Acids Res on January 11, 2010

Authors

Mileidy W Gonzalez (1), William R Pearson

Author Affiliations

1: Department of Biological Sciences, University of Maryland Baltimore County, Baltimore, MD 21250, USA.

Articles citing this

HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods (2011) 6.06

Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res (2013) 1.83

Threshold Average Precision (TAP-k): a measure of retrieval designed for bioinformatics. Bioinformatics (2010) 1.73

FFAS server: novel features and applications. Nucleic Acids Res (2011) 1.52

PSI-Search: iterative HOE-reduced profile SSEARCH searching. Bioinformatics (2012) 1.14

RefProtDom: a protein database with improved domain boundaries and homology relationships. Bioinformatics (2010) 0.96

An introduction to sequence similarity ("homology") searching. Curr Protoc Bioinformatics (2013) 0.93

Comprehensive analysis of DNA polymerase III α subunits and their homologs in bacterial genomes. Nucleic Acids Res (2013) 0.91

The Dfam database of repetitive DNA families. Nucleic Acids Res (2015) 0.91

Adjusting scoring matrices to correct overextended alignments. Bioinformatics (2013) 0.91

Identification of new homologs of PD-(D/E)XK nucleases by support vector machines trained on data derived from profile-profile alignments. Nucleic Acids Res (2010) 0.90

Selecting the Right Similarity-Scoring Matrix. Curr Protoc Bioinformatics (2013) 0.83

HangOut: generating clean PSI-BLAST profiles for domains with long insertions. Bioinformatics (2010) 0.83

Parameterizing sequence alignment with an explicit evolutionary model. BMC Bioinformatics (2015) 0.81

Using EMBL-EBI Services via Web Interface and Programmatically via Web Services. Curr Protoc Bioinformatics (2014) 0.80

Increased taxon sampling reveals thousands of hidden orthologs in flatworms. Genome Res (2017) 0.75

Domain analysis of symbionts and hosts (DASH) in a genome-wide survey of pathogenic human viruses. BMC Res Notes (2013) 0.75

Improving Retrieval Efficacy of Homology Searches Using the False Discovery Rate. IEEE/ACM Trans Comput Biol Bioinform (2015) 0.75

Query-seeded iterative sequence similarity searching improves selectivity 5-20-fold. Nucleic Acids Res (2016) 0.75

Articles cited by this

Basic local alignment search tool. J Mol Biol (1990) 659.07

The Protein Data Bank. Nucleic Acids Res (2000) 187.10

Identification of common molecular subsequences. J Mol Biol (1981) 130.53

SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol (1995) 74.88

The Pfam protein families database. Nucleic Acids Res (2004) 56.46

The Pfam protein families database. Nucleic Acids Res (2007) 30.53

CATH--a hierarchic classification of protein domain structures. Structure (1997) 29.95

Amino acid substitution matrices from an information theoretic perspective. J Mol Biol (1991) 23.38

Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res (2001) 22.33

The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res (2009) 19.70

The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res (1997) 12.30

Flexible sequence similarity searching with the FASTA3 program package. Methods Mol Biol (2000) 12.05

The ASTRAL Compendium in 2004. Nucleic Acids Res (2004) 10.03

Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics (1991) 9.65

QuickTree: building huge Neighbour-Joining trees of protein sequences. Bioinformatics (2002) 9.36

Protein database searches using compositionally adjusted substitution matrices. FEBS J (2005) 8.14

Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships. Proc Natl Acad Sci U S A (1998) 7.18

Comparison of methods for searching protein sequence databases. Protein Sci (1995) 4.29

Sensitivity and selectivity in protein structure comparison. Protein Sci (2004) 3.69

The estimation of statistical parameters for local alignment score distributions. Nucleic Acids Res (2001) 3.61

Post-processing long pairwise alignments. Bioinformatics (1999) 3.00

PSI-BLAST pseudocounts and the minimum description length principle. Nucleic Acids Res (2008) 1.68

SATCHMO: sequence alignment and tree construction using hidden Markov models. Bioinformatics (2003) 1.66

The limits of protein sequence comparison? Curr Opin Struct Biol (2005) 1.57

The effectiveness of position- and composition-specific gap costs for protein similarity searches. Bioinformatics (2008) 0.90

Simple is beautiful: a straightforward approach to improve the delineation of true and false positives in PSI-BLAST searches. Bioinformatics (2008) 0.90

SIB-BLAST: a web server for improved delineation of true and false positives in PSI-BLAST searches. Nucleic Acids Res (2009) 0.84

Articles by these authors

The genome of Cryptosporidium hominis. Nature (2004) 3.93

Sensitivity and selectivity in protein structure comparison. Protein Sci (2004) 3.69

Getting more from less: algorithms for rapid protein identification with multiple short peptide sequences. Mol Cell Proteomics (2002) 1.70

MACiE: exploring the diversity of biochemical reactions. Nucleic Acids Res (2011) 1.67

The Catalytic Site Atlas 2.0: cataloging catalytic sites and residues identified in enzymes. Nucleic Acids Res (2013) 1.42

Nomenclature for mammalian soluble glutathione transferases. Methods Enzymol (2005) 1.36

Identification of residues in glutathione transferase capable of driving functional diversification in evolution. A novel approach to protein redesign. J Biol Chem (2002) 1.15

PSI-Search: iterative HOE-reduced profile SSEARCH searching. Bioinformatics (2012) 1.14

A strategy for the rapid identification of phosphorylation sites in the phosphoproteome. Mol Cell Proteomics (2002) 0.97

RefProtDom: a protein database with improved domain boundaries and homology relationships. Bioinformatics (2010) 0.96

Entamoeba histolytica: sequence conservation of the Gal/GalNAc lectin from clinical isolates. Exp Parasitol (2002) 0.93

Adjusting scoring matrices to correct overextended alignments. Bioinformatics (2013) 0.91

Improving pairwise sequence alignment accuracy using near-optimal protein sequence alignments. BMC Bioinformatics (2010) 0.88

Globally, unrelated protein sequences appear random. Bioinformatics (2009) 0.82

Identification and characterization of GSTT3, a third murine Theta class glutathione transferase. Biochem J (2002) 0.80

CRP: Cleavage of Radiolabeled Phosphoproteins. Nucleic Acids Res (2003) 0.80

Visualization of near-optimal sequence alignments. Bioinformatics (2004) 0.79

Using relational databases for improved sequence similarity searching and large-scale genomic analyses. Curr Protoc Bioinformatics (2004) 0.75