Efficacy of different protein descriptors in predicting protein functional families.

PubWeight™: 0.95 | Rank: Top 15%

Article: PMC 1997217

Published in BMC Bioinformatics on August 17, 2007

Authors

Serene A K Ong (1), Hong Huang Lin, Yu Zong Chen, Ze Rong Li, Zhiwei Cao

Author Affiliations

1: Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 08-14, 3 Science Drive 2, Singapore 117543, Singapore. renese7@gmail.com

Articles cited by this

The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res (2003) 52.80

The Pfam protein families database. Nucleic Acids Res (2002) 51.34

Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics (2006) 43.68

Prediction of protein antigenic determinants from amino acid sequences. Proc Natl Acad Sci U S A (1981) 32.98

Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci U S A (2000) 19.39

Notes on continuous stochastic phenomena. Biometrika (1950) 17.81

Amino acid difference formula to help explain protein evolution. Science (1974) 15.19

Conserved structures and diversity of functions of RNA-binding proteins. Science (1994) 14.75

Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics (2000) 13.33

Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics (2000) 11.75

Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics (2001) 9.11

Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J Am Chem Soc (1988) 6.41

The nature of the accessible and buried surfaces in proteins. J Mol Biol (1976) 5.92

Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics (2002) 5.46

TCDB: the Transporter Classification Database for membrane transport protein analyses and information. Nucleic Acids Res (2006) 5.31

Exploring expression data: identification and analysis of coexpressed genes. Genome Res (1999) 5.09

Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins (2001) 4.64

Glycosyltransferases. Structure, localization, and control of cell type-specific glycosylation. J Biol Chem (1989) 3.99

Prediction of protein structural classes. Crit Rev Biochem Mol Biol (1995) 3.90

AAindex: amino acid index database. Nucleic Acids Res (2000) 3.83

Themes in RNA-protein recognition. J Mol Biol (1999) 3.69

SVM-Prot: Web-based support vector machine software for functional classification of a protein from its primary sequence. Nucleic Acids Res (2003) 3.34

Predicting protein--protein interactions from primary structure. Bioinformatics (2001) 2.82

ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST. Nucleic Acids Res (2004) 2.72

The transporter classification (TC) system, 2002. Crit Rev Biochem Mol Biol (2002) 2.56

Classifying G-protein coupled receptors with support vector machines. Bioinformatics (2002) 2.54

Proteins binding to duplexed RNA: one motif, multiple functions. Trends Biochem Sci (2000) 2.47

On the average hydrophobicity of proteins and the relation between it and protein structure. J Theor Biol (1967) 2.25

Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics (2004) 2.19

Using functional domain composition and support vector machines for prediction of protein subcellular location. J Biol Chem (2002) 2.18

Prediction of protein folding class using global description of amino acid sequence. Proc Natl Acad Sci U S A (1995) 2.00

Support vector machine-based method for subcellular localization of human proteins using amino acid compositions, their order, and similarity search. J Biol Chem (2005) 1.85

Neighborhood behavior: a useful concept for validation of "molecular diversity" descriptors. J Med Chem (1996) 1.85

PSLpred: prediction of subcellular localization of bacterial proteins. Bioinformatics (2005) 1.82

Recent advances in RNA-protein recognition. Curr Opin Struct Biol (2001) 1.74

Secondary structure prediction with support vector machines. Bioinformatics (2003) 1.70

Genomic sciences and the medicine of tomorrow. Nat Biotechnol (1996) 1.67

Receptors and G proteins as primary components of transmembrane signal transduction. Part 1. G-protein-coupled receptors: structure and function. J Mol Med (Berl) (1995) 1.65

Genetic analysis of chlorophyll biosynthesis. Annu Rev Genet (1997) 1.60

Drug design by machine learning: support vector machines for pharmaceutical data analysis. Comput Chem (2001) 1.59

Amino acid properties and side-chain orientation in proteins: a cross correlation approach. J Theor Biol (1975) 1.58

The structural dependence of amino acid hydrophobicity parameters. J Theor Biol (1982) 1.53

The rational design of amino acid sequences by artificial neural networks and simulated molecular evolution: de novo design of an idealized leader peptidase cleavage site. Biophys J (1994) 1.50

GPCRpred: an SVM-based method for prediction of families and subfamilies of G-protein coupled receptors. Nucleic Acids Res (2004) 1.50

PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence. Nucleic Acids Res (2006) 1.47

Conserved domains of glycosyltransferases. Glycobiology (1999) 1.45

Enzyme family classification by support vector machines. Proteins (2004) 1.37

Prediction of RNA-binding proteins from primary sequence by a support vector machine approach. RNA (2004) 1.36

Classification of nuclear receptors based on amino acid composition and dipeptide composition. J Biol Chem (2004) 1.36

Recognition of a protein fold in the context of the Structural Classification of Proteins (SCOP) classification. Proteins (1999) 1.28

Whole-proteome interaction mining. Bioinformatics (2003) 1.27

Prediction of protein subcellular locations by incorporating quasi-sequence-order effect. Biochem Biophys Res Commun (2000) 1.25

Hum-PLoc: a novel ensemble classifier for predicting human protein subcellular localization. Biochem Biophys Res Commun (2006) 1.21

Selecting optimally diverse compounds from structure databases: a validation study of two-dimensional and three-dimensional molecular descriptors. J Med Chem (1997) 1.17

Prediction of protein subcellular locations by GO-FunD-PseAA predictor. Biochem Biophys Res Commun (2004) 1.16

Towards 3D structures of G protein-coupled receptors: a multidisciplinary approach. Curr Med Chem (2000) 1.16

Prediction of membrane protein types by incorporating amphipathic effects. J Chem Inf Model (2005) 1.15

Predicting protein-protein interactions from sequences in a hybridization space. J Proteome Res (2006) 1.15

Effect of molecular descriptor feature selection in support vector machine classification of pharmacokinetic and toxicological properties of chemical agents. J Chem Inf Comput Sci (2004) 1.15

An overview on predicting the subcellular location of a protein. In Silico Biol (2002) 1.14

Large-scale plant protein subcellular location prediction. J Cell Biochem (2007) 1.12

Prediction of transporter family from protein sequence by support vector machine approach. Proteins (2006) 1.11

Prediction of MHC-binding peptides of flexible lengths from sequence-derived structural and physicochemical properties. Mol Immunol (2006) 1.11

Effect of training datasets on support vector machine prediction of protein-protein interactions. Proteomics (2005) 1.11

Hydrophobicity and structural classes in proteins. Protein Eng (1992) 1.08

Support vector machines for prediction of protein subcellular location by incorporating quasi-sequence-order effect. J Cell Biochem (2002) 1.05

Prediction of the functional class of metal-binding proteins from sequence derived physicochemical properties by support vector machine approach. BMC Bioinformatics (2006) 1.01

Predicting functional family of novel enzymes irrespective of sequence similarity: a statistical learning approach. Nucleic Acids Res (2004) 1.00

Prediction of membrane protein types based on the hydrophobic index of amino acids. J Protein Chem (2000) 1.00

Protein-specific glycosyltransferases: how and why they do it! FASEB J (1994) 0.98

Predicting protein structural class with pseudo-amino acid composition and support vector machine fusion network. Anal Biochem (2006) 0.98

Prediction of enzyme family classes. J Proteome Res (2003) 0.97

Molecular descriptors in chemoinformatics, computational combinatorial chemistry, and virtual screening. Comb Chem High Throughput Screen (2000) 0.94

TSSub: eukaryotic protein subcellular localization by extracting features from profiles. Bioinformatics (2006) 0.94

Prediction of G-protein-coupled receptor classes. J Proteome Res (2005) 0.93

Prediction of the functional class of lipid binding proteins from sequence-derived properties irrespective of sequence similarity. J Lipid Res (2006) 0.92

GNBSL: a new integrative system to predict the subcellular location for Gram-negative bacteria proteins. Proteomics (2006) 0.91

Cellular lipid binding proteins as facilitators and regulators of lipid metabolism. Mol Cell Biochem (2002) 0.91

Predicting enzyme family class in a hybridization space. Protein Sci (2004) 0.91

Prediction of protein helix content from an autocorrelation analysis of sequence hydrophobicities. Biopolymers (1988) 0.90

Bioinformatical analysis of G-protein-coupled receptors. J Proteome Res (2003) 0.88

Accurate prediction of protein secondary structural content. J Protein Chem (2001) 0.86

RNA-binding proteins: if it looks like a sn(o)RNA... Curr Biol (2001) 0.84

Prediction of protein subcellular location using a combined feature of sequence. FEBS Lett (2005) 0.83

Population structure inferred by local spatial autocorrelation: an example from an Amerindian tribal population. Am J Phys Anthropol (2006) 0.83

Screening with tumor markers: critical issues. Mol Biotechnol (2002) 0.82

Protein folding and the genetic code: an alternative quantitative model. J Theor Biol (1981) 0.80

Evaluation of descriptors and mini-fingerprints for the identification of molecules with similar activity. J Chem Inf Comput Sci (2000) 0.79

Discovering compact and highly discriminative features or feature combinations of drug activities using support vector machines. Proc IEEE Comput Soc Bioinform Conf (2003) 0.76

Articles by these authors

Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789. Proc Natl Acad Sci U S A (2007) 3.52

Mechanisms of drug combinations: interaction and network perspectives. Nat Rev Drug Discov (2009) 3.20

HIT: linking herbal active ingredients to targets. Nucleic Acids Res (2010) 1.50

Prediction of RNA-binding proteins from primary sequence by a support vector machine approach. RNA (2004) 1.36

Therapeutic target database update 2014: a resource for targeted therapeutics. Nucleic Acids Res (2013) 1.31

Effect of training datasets on support vector machine prediction of protein-protein interactions. Proteomics (2005) 1.11

Therapeutic effects of astragaloside IV on myocardial injuries: multi-target identification and network analysis. PLoS One (2012) 1.07

Drug Adverse Reaction Target Database (DART): proteins related to adverse drug reactions. Drug Saf (2003) 1.06

Effect of selection of molecular descriptors on the prediction of blood-brain barrier penetrating and nonpenetrating agents by statistical learning methods. J Chem Inf Model (2005) 1.05

In-silico approaches to multi-target drug discovery: computer aided multi-target drug design, multi-target virtual screening. Pharm Res (2010) 1.05

Computer prediction of drug resistance mutations in proteins. Drug Discov Today (2005) 1.02

Are herb-pairs of traditional Chinese medicine distinguishable from others? Pattern analysis and artificial intelligence classification study of traditionally defined herbal properties. J Ethnopharmacol (2006) 1.02

MHC-BPS: MHC-binder prediction server for identifying peptides of flexible lengths from sequence-derived physicochemical properties. Immunogenetics (2006) 1.02

Computer prediction of allergen proteins from sequence-derived protein structural and physicochemical properties. Mol Immunol (2006) 0.98

Support vector machines approach for predicting druggable proteins: recent progress in its exploration and investigation of its usefulness. Drug Discov Today (2007) 0.94

Multi-task learning for cross-platform siRNA efficacy prediction: an in-silico study. BMC Bioinformatics (2010) 0.94

Simulation of the regulation of EGFR endocytosis and EGFR-ERK signaling by endophilin-mediated RhoA-EGFR crosstalk. FEBS Lett (2008) 0.93

Dana-Farber repository for machine learning in immunology. J Immunol Methods (2011) 0.93

Silicone intubation and endoscopic dacryocystorhinostomy: a meta-analysis. J Otolaryngol Head Neck Surg (2010) 0.93

Derivation of stable microarray cancer-differentiating signatures using consensus scoring of multiple random sampling and gene-ranking consistency evaluation. Cancer Res (2007) 0.92

In silico search of putative adverse drug reaction related proteins as a potential tool for facilitating drug adverse effect prediction. Toxicol Lett (2006) 0.91

Computational vaccinology and the ICoVax 2012 workshop. BMC Bioinformatics (2013) 0.91

iPEAP: integrating multiple omics and genetic data for pathway enrichment analysis. Bioinformatics (2013) 0.91

Ab-origin: an enhanced tool to identify the sourcing gene segments in germline for rearranged antibodies. BMC Bioinformatics (2008) 0.90

Simulation of crosstalk between small GTPase RhoA and EGFR-ERK signaling pathway via MEKK1. Bioinformatics (2008) 0.89

Stacking and energetic contribution of aromatic islands at the binding interface of antibody proteins. Immunome Res (2010) 0.89

A new protein-ligand binding sites prediction method based on the integration of protein sequence conservation information. BMC Bioinformatics (2011) 0.88

Towards a bioinformatics analysis of anti-Alzheimer's herbal medicines from a target network perspective. Brief Bioinform (2012) 0.88

Epitope predictions indicate the presence of two distinct types of epitope-antibody-reactivities determined by epitope profiling of intravenous immunoglobulins. PLoS One (2013) 0.87

Potential metabolic mechanism of girls' central precocious puberty: a network analysis on urine metabonomics data. BMC Syst Biol (2012) 0.86

Multi-target QSAR modelling in the analysis and design of HIV-HCV co-inhibitors: an in-silico study. BMC Bioinformatics (2011) 0.86

S100A7 enhances invasion of human breast cancer MDA-MB-468 cells through activation of nuclear factor-κB signaling. World J Surg Oncol (2013) 0.86

Recent progresses in the application of machine learning approach for predicting protein functional class independent of sequence similarity. Proteomics (2006) 0.85

Bioinformatics analysis of the epitope regions for norovirus capsid protein. BMC Bioinformatics (2013) 0.85

Target network differences between western drugs and Chinese herbal ingredients in treating cardiovascular disease. BMC Bioinformatics (2014) 0.85

Screening of selective histone deacetylase inhibitors by proteochemometric modeling. BMC Bioinformatics (2012) 0.85

Reconstruction and analysis of human liver-specific metabolic network based on CNHLPP data. J Proteome Res (2010) 0.84

Continuum transport model of Ogston sieving in patterned nanofilter arrays for separation of rod-like biomolecules. Electrophoresis (2008) 0.84

Analytical description of Ogston-regime biomolecule separation using nanofilters and nanopores. Phys Rev E Stat Nonlin Soft Matter Phys (2009) 0.84

Virtual drug screen schema based on multiview similarity integration and ranking aggregation. J Chem Inf Model (2012) 0.83

Study on human GPCR-inhibitor interactions by proteochemometric modeling. Gene (2012) 0.82

Drug discovery prospect from untapped species: indications from approved natural product drugs. PLoS One (2012) 0.82

Investigations on inhibitors of hedgehog signal pathway: a quantitative structure-activity relationship study. Int J Mol Sci (2011) 0.82

Calmodulin as a potential target by which berberine induces cell cycle arrest in human hepatoma Bel7402 cells. Chem Biol Drug Des (2013) 0.82

Tannic acid, a potent inhibitor of epidermal growth factor receptor tyrosine kinase. J Biochem (2006) 0.82

Novel natural inhibitors of CYP1A2 identified by in silico and in vitro screening. Int J Mol Sci (2011) 0.81

Super paramagnetic clustering of DNA sequences. J Biol Phys (2006) 0.81

Reconsideration of in-silico siRNA design based on feature selection: a cross-platform data integration perspective. PLoS One (2012) 0.81

Pathway sensitivity analysis for detecting pro-proliferation activities of oncogenes and tumor suppressors of epidermal growth factor receptor-extracellular signal-regulated protein kinase pathway at altered protein levels. Cancer (2009) 0.81

The Therapeutic Target Database: an internet resource for the primary targets of approved, clinical trial and experimental drugs. Expert Opin Ther Targets (2011) 0.81

Reconsideration of in silico siRNA design from a perspective of heterogeneous data integration: problems and solutions. Brief Bioinform (2012) 0.81

Multitarget inhibitors derived from crosstalk mechanism involving VEGFR2. Future Med Chem (2014) 0.80

Insight into potential toxicity mechanisms of melamine: an in silico study. Toxicology (2011) 0.80

Developing a mouse model of acute bacterial rhinosinusitis. Eur Arch Otorhinolaryngol (2011) 0.79

Proteochemometric modeling of the bioactivity spectra of HIV-1 protease inhibitors by introducing protein-ligand interaction fingerprint. PLoS One (2012) 0.79

Quantitatively integrating molecular structure and bioactivity profile evidence into drug-target relationship analysis. BMC Bioinformatics (2012) 0.79

What does it take to synergistically combine sub-potent natural products into drug-level potent combinations? PLoS One (2012) 0.79

Potassium channels: structures, diseases, and modulators. Chem Biol Drug Des (2014) 0.78

Realistic simulations of combined DNA electrophoretic flow and EOF in nano-fluidic devices. Electrophoresis (2008) 0.78

Transport of biomolecules in asymmetric nanofilter arrays. Anal Bioanal Chem (2009) 0.78

Metabolic network analysis revealed distinct routes of deletion effects between essential and non-essential genes. Mol Biosyst (2012) 0.77

The stereoselectivity of CYP2C19 on R- and S-isomers of proton pump inhibitors. Chem Biol Drug Des (2014) 0.77

HIM-herbal ingredients in-vivo metabolism database. J Cheminform (2013) 0.77

Cancer informatics for the clinician: an interaction database for chemotherapy regimens and antiepileptic drugs. Seizure (2009) 0.77

Discrimination of approved drugs from experimental drugs by learning methods. BMC Bioinformatics (2011) 0.77

Study of drug function based on similarity of pathway fingerprint. Protein Cell (2012) 0.77

Prediction of functional class of proteins and peptides irrespective of sequence homology by support vector machines. Bioinform Biol Insights (2009) 0.76

The recent progress in proteochemometric modelling: focusing on target descriptors, cross-term descriptors and application scope. Brief Bioinform (2016) 0.76

Comparison of different ranking methods in protein-ligand binding site prediction. Int J Mol Sci (2012) 0.76

[The effects on nasal mucociliary clearance system of excising partial inferior turbinectomy with HUMMER or microwave]. Lin Chung Er Bi Yan Hou Tou Jing Wai Ke Za Zhi (2007) 0.76

Toxicogenomic analysis suggests chemical-induced sexual dimorphism in the expression of metabolic genes in zebrafish liver. PLoS One (2012) 0.76

Similar connotation in chronic hepatitis B and nonalcoholic fatty liver patients with dampness-heat syndrome. Evid Based Complement Alternat Med (2013) 0.75

In silico target-specific siRNA design based on domain transfer in heterogeneous data. PLoS One (2012) 0.75

[Early characters of recurred vocal cord cancers after laser cordectomy and treatment]. Lin Chuang Er Bi Yan Hou Ke Za Zhi (2005) 0.75

[The observation of normal uncinate process mucosa compared with inferior turbinate in epithelium ultrastructure]. Lin Chung Er Bi Yan Hou Tou Jing Wai Ke Za Zhi (2016) 0.75

Absorption, distribution, metabolism, and excretion-associated protein database. Clin Pharmacol Ther (2002) 0.75

Comparative analysis of two-component signal transduction system in two streptomycete genomes. Acta Biochim Biophys Sin (Shanghai) (2007) 0.75

Exploration of 1-(3-chloro-4-(4-oxo-4H-chromen-2-yl)phenyl)-3-phenylurea derivatives as selective dual inhibitors of Raf1 and JNK1 kinases for anti-tumor treatment. Bioorg Med Chem (2012) 0.75

The database and bioinformatics studies of probiotics. J Agric Food Chem (2017) 0.75

[The expression of neuN in the development of olfactory bulb and epithelium of mice]. Lin Chung Er Bi Yan Hou Tou Jing Wai Ke Za Zhi (2007) 0.75

Conservative management of transnasal intracranial injury. Am J Otolaryngol (2010) 0.75

Similarity between segments in protein conformational epitopes and MHC II peptides. Int J Comput Biol Drug Des (2013) 0.75

[Expression and role of IL-33 and its receptor ST2 in eosinophilic and non-eosinophilic chronic rhinosinusitis with nasal polyps]. Lin Chung Er Bi Yan Hou Tou Jing Wai Ke Za Zhi (2015) 0.75

Dispersive transport of biomolecules in periodic energy landscapes with application to nanofilter sieving arrays. Electrophoresis (2011) 0.75

AAIR: antibody antigen information resource. J Immunol (2007) 0.75

Advances in computational approaches in identifying synergistic drug combinations. Brief Bioinform (2017) 0.75

[Evaluation to the quality of life of patients with chronic sinusitis and polyps and analysis of influential factors]. Lin Chung Er Bi Yan Hou Tou Jing Wai Ke Za Zhi (2007) 0.75

[A study on the expression of IL-8 and IL-3 in human nasal polyp and polyposis]. Lin Chuang Er Bi Yan Hou Ke Za Zhi (2004) 0.75

[Expression and clinical significance of Eotaxin-3 in chronic rhinosinusitis with and without nasal polyps]. Lin Chung Er Bi Yan Hou Tou Jing Wai Ke Za Zhi (2016) 0.75

Effect of training data size and noise level on support vector machines virtual screening of genotoxic compounds from large compound libraries. J Comput Aided Mol Des (2011) 0.75

[The clinical significance of 1,3-beta-D glucanase detection in plasma to the diagnosis of fungal rhinosinusitis]. Lin Chung Er Bi Yan Hou Tou Jing Wai Ke Za Zhi (2013) 0.75

Advances in machine learning prediction of toxicological properties and adverse drug reactions of pharmaceutical agents. Curr Drug Saf (2008) 0.75

In silico prediction of adverse drug reactions and toxicities based on structural, biological and clinical data. Curr Drug Saf (2012) 0.75