Research Thesis
Title
Development of computational approaches to understand plant-pathogen interactions
Objectives
I) To develop an empirical model to predict protein-protein interactions between plants and pathogens.
II) To compare the performance of the empirical model with existing methods used for predicting protein-protein interactions.
III) To predict and study protein-protein interactions between wheat and pathogens in commonly prevalent diseases.
IV) To develop a prediction server to predict protein-protein interactions between plants and pathogens.
Abstract
Identifying protein-protein interactions (PPIs) in plant-pathogen systems is a demanding field of research that is necessary to understand the molecular mechanisms of plant defense and pathogen virulence. Because identifying plant-pathogen PPIs experimentally requires considerable time and effort, computational techniques are emerging as a useful complement to experimental methods. In the present study, the accuracy of well-established computational techniques for predicting plant-pathogen PPIs, such as the interolog-based approach, which relies on sequence similarity searches, was investigated. Due to the low sensitivity of the interolog technique, a machine learning (ML)-based ensemble model was employed to construct a multi-species plant-pathogen PPI predictor using diverse sequence encodings and multiple learning algorithms. Several amino acid sequence encoding schemes were evaluated. The auto-covariance (AC), conjoint triad (CT), and local descriptor (LD) schemes were selected based on their performance in terms of evaluation metrics such as accuracy, sensitivity (recall), precision, Matthews correlation coefficient, and F1-score. The selected features were combined with multiple learning algorithms, namely random forest (RF), support vector machine (SVM), and artificial neural network (ANN). AC and CT attained their highest accuracy with SVM (~96% and ~94%, respectively), whereas LD performed better with RF (~95%). The predictions of these three individual models were then combined to yield an ensemble model with improved accuracy (~97%). The ensemble model was compared with existing tools for predicting PPIs between plants and pathogens using an independent test dataset. The comparative assessment indicated that the classifier is promising in this domain.
Hence, the developed model is proposed as an efficient tool for the prediction of multi-species plant-pathogen PPIs. Furthermore, to demonstrate the utility of the proposed classifier, it was employed to predict PPIs involved in wheat blast, which is caused by Magnaporthe oryzae pathotype Triticum (MoT). Wheat blast is a comparatively recent fungal disease but has become a serious threat to global wheat production. Most of the wheat proteins predicted to take part in the cross-talk between wheat and MoT were associated with energy production mechanisms in response to the fungal attack. The fungal effector proteins were involved in biological processes that support the growth of the pathogen. Finally, a web-based prediction server, named PlantPathoPPI, was developed using the proposed model to make it accessible to end-users with different levels of expertise. The prediction server is freely accessible at http://login1.cabgrid.res.in:5080/. Taken together, PlantPathoPPI can serve as a valuable tool to accelerate the investigation of plant-pathogen interactions.
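To make the encoding and ensemble steps described in the abstract more concrete, the following is a minimal Python sketch, not the thesis implementation. It assumes the commonly used seven-class residue grouping for the conjoint triad (CT) encoding, represents a protein pair by concatenating the two CT vectors, and combines three model outputs by averaging their predicted probabilities (soft voting). The abstract does not state how the individual predictions were combined, so the averaging rule, the 0.5 threshold, and all function names here are illustrative assumptions.

```python
from itertools import product

# Seven residue classes commonly used for conjoint triad encoding
# (assumed grouping; the thesis abstract does not list the classes).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {
    aa: str(i) for i, group in enumerate(CLASSES, start=1) for aa in group
}
# All 7 * 7 * 7 = 343 possible class triads, in a fixed order.
TRIADS = ["".join(t) for t in product("1234567", repeat=3)]


def conjoint_triad(sequence):
    """Return a 343-dimensional, min-max-scaled triad frequency vector."""
    # Map residues to class labels; non-standard residues (e.g. X) are
    # dropped before the sliding window is applied.
    reduced = [
        RESIDUE_TO_CLASS[aa] for aa in sequence.upper() if aa in RESIDUE_TO_CLASS
    ]
    counts = dict.fromkeys(TRIADS, 0)
    for i in range(len(reduced) - 2):
        counts["".join(reduced[i:i + 3])] += 1
    values = [counts[t] for t in TRIADS]
    lo, hi = min(values), max(values)
    if hi == 0:  # sequence too short to contain any triad
        return [0.0] * len(TRIADS)
    return [(v - lo) / hi for v in values]


def encode_pair(plant_seq, pathogen_seq):
    """Concatenate the CT vectors of both proteins (686 features)."""
    return conjoint_triad(plant_seq) + conjoint_triad(pathogen_seq)


def ensemble_probability(probabilities, weights=None):
    """Weighted average of per-model interaction probabilities (soft voting)."""
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


def ensemble_label(probabilities, threshold=0.5):
    """Return 1 (interacting) if the averaged probability reaches the threshold."""
    return int(ensemble_probability(probabilities) >= threshold)
```

In use, each of the three trained classifiers (for example, SVM on AC features, SVM on CT features, and RF on LD features) would supply one probability for a plant-pathogen protein pair, and `ensemble_label` would turn those three values into a single interaction call.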
Publications (3)
Prediction of protein–protein interactions between anti-CRISPR and CRISPR-Cas using machine learning technique.
Murmu, S., Chaurasia, H., Guha Majumdar, S., Rao, A. R., Rai, A., & Archak, S.
In-silico study of protein-protein interactions in wheat blast using docking and molecular dynamics simulation approach.
Murmu, S., & Archak, S.
PlantPathoPPI: An Ensemble-based Machine Learning Architecture for Prediction of Protein-Protein Interactions between Plants and Pathogens.
Murmu, S., Chaurasia, H., Rao, A. R., Rai, A., Jaiswal, S., Bharadwaj, A., Yadav, R., & Archak, S.