Predicting Protein Subcellular Locations Based on Feature Selection Method

The proper subcellular location of a protein is essential to its function: proteins with particular functions must reside in particular compartments of the cell. Knowing the subcellular location of a new protein therefore helps to elucidate its function. In this paper, we propose a strategy for predicting the subcellular locations of proteins by combining several methods from the literature. First, proteins are encoded by amino-acid composition and physicochemical properties. These features are then ranked by the Minimum Redundancy Maximum Relevance method and further filtered by a feature selection procedure. The Nearest Neighbor Algorithm is used to construct the predictor of protein subcellular location, which achieves a correct prediction rate of 85.66% under jackknife cross-validation. With the novel feature selection and analysis strategy, we are also able to identify the most important protein properties: hydrophobicity, polarity, protein secondary structure, amino-acid composition and solvent accessibility.
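The encode, classify and evaluate steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only amino-acid composition as features, omits the physicochemical properties and both feature selection stages, and assumes Euclidean distance for the nearest-neighbor rule, since the abstract does not specify the metric. The function names are hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino-acid composition (fractions) of seq."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def nearest_label(query, examples):
    """1-nearest-neighbor rule: label of the example closest to query.

    Euclidean distance is an assumption made for this sketch.
    """
    features, label = min(examples, key=lambda ex: math.dist(query, ex[0]))
    return label


def jackknife_accuracy(dataset):
    """Jackknife (leave-one-out) test: predict each protein from all others."""
    encoded = [(composition(seq), label) for seq, label in dataset]
    hits = 0
    for i, (features, label) in enumerate(encoded):
        rest = encoded[:i] + encoded[i + 1:]
        if nearest_label(features, rest) == label:
            hits += 1
    return hits / len(encoded)
```

Called on a list of `(sequence, location)` pairs, `jackknife_accuracy` returns the fraction of proteins whose location is recovered when each one is held out in turn, which is the kind of figure the reported prediction rate refers to.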
Please input the sequence data. (The sequence must be in FASTA format, and its length must be more than 50 and less than 2700.)
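The input requirements above can be checked before submission with a short sketch like the following. The function names are hypothetical, and reading "more than 50 and less than 2700" as strict bounds is an assumption based on the wording, not on the server's actual behavior.

```python
# Length limits taken from the input prompt; treated here as exclusive bounds.
MIN_LEN = 50
MAX_LEN = 2700


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("FASTA input must start with a '>' header line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def is_valid_length(seq):
    """True if the sequence length is more than 50 and less than 2700."""
    return MIN_LEN < len(seq) < MAX_LEN
```

Sequences split across several lines are joined into one string, so the length check applies to the whole protein and not to individual lines.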

E-mail: caiyudong@staff.shu.edu.cn

Institute of system biology, Shanghai University, Shanghai, 200444, China