Original Research Article
A Python analytical pipeline to identify prohormone precursors and predict prohormone cleavage sites
1 Department of Chemistry, University of Illinois, USA
2 Department of Animal Sciences, University of Illinois, USA
Neuropeptides and hormones are signaling molecules that support cell-cell communication in the central nervous system. Experimentally characterizing neuropeptides requires significant effort because of the complex and variable processing of prohormone precursor proteins into neuropeptides and hormones. We demonstrate the power and flexibility of the Python language for developing components of a bioinformatic analytical pipeline that identifies precursors from genomic data and predicts the cleavage these precursors undergo en route to the final bioactive peptides. We identified 75 precursors in the rhesus genome, predicted cleavage sites using support vector machines, and compared the rhesus predictions to putative assignments based on homology to human sequences. The correct classification rate of cleavage using the support vector machines was over 97% for both the human and rhesus data sets. The functionality of Python has been important in developing and maintaining NeuroPred (http://neuroproteomics.scs.uiuc.edu/neuropred.html), a user-centered web application for the neuroscience community that provides cleavage site prediction from a wide range of models, precision and accuracy statistics, post-translational modifications, and the molecular mass of potential peptides. The combined results illustrate the suitability of the Python language for implementing an all-inclusive bioinformatics approach to neuropeptide prediction that encompasses a large number of interdependent steps, from scanning genomes for precursor genes to identifying potential bioactive neuropeptides.
Keywords: Python, bioinformatics, neuropeptides, machine learning, support vector machine, precursor cleavage, rhesus monkey
Copyright: © 2008 Southey, Sweedler and Rodriguez-Zas. This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.
*Correspondence: Bruce Southey, Department of Chemistry, University of Illinois, 1207 W. Gregory Dr., Urbana, IL 61801, USA. e-mail: southey@illinois.edu
Citation: Southey BR, Sweedler JV and Rodriguez-Zas SL (2008) A Python analytical pipeline to identify prohormone precursors and predict prohormone cleavage sites. Front. Neuroinform. 2:7. doi:10.3389/neuro.11.007.2008
Received: 04 September 2008; paper pending published: 26 September 2008; accepted: 11 November 2008; published online: 16 December 2008.
Edited by:
Rolf Kötter, Radboud University Nijmegen, The Netherlands
Reviewed by:
Yoonseong Park, Kansas State University, USA
Niovi Santama, University of Cyprus and Cyprus Institute of Neurology and Genetics, Cyprus


