Research Thesis
Title
Development of Computational Approach for Natural Products in Plants
Objectives
1. To develop a database of natural products related to plants.
2. To develop a machine learning-based model and web application for predicting protein-ligand binding affinity.
3. To identify potent natural products against selected pathogens/proteins using the proposed model.
Abstract
Scientific research on plant-derived natural products has grown rapidly in recent years, and new natural compounds with important therapeutic uses are reported regularly in the scientific literature. These natural products have been used to treat a wide range of diseases, from infections to cancer, as well as to combat pests in various crops. In this doctoral thesis, a database of plant-derived natural products for crop protection, “NatProCP”, has been developed to serve as a comprehensive resource for researchers by providing detailed information on natural products. The database contains information on 5,281 unique natural compounds from 262 plant species. It includes data on the medicinal and drug-likeness properties of the natural compounds, along with their 2D and 3D structures. The natural compounds were collected using text-mining approaches, and their 3D structures were optimized with the MMFF94 force field via an in-house Python script. The database also provides information on the antifungal and antiviral potency of these compounds, which can inform virtual screening studies. The primary purpose of the database is to provide a library of natural compounds for virtual screening studies against various therapeutic proteins. Additionally, a machine learning-based virtual screening web server (ML-VSPred) was developed for predicting the binding affinity scores of protein-ligand complexes. In this study, eight machine learning (ML) regression models, namely linear regression (LR), random forest (RF), decision tree (DT), support vector regression (SVR), polynomial SVR (PSVR), XGBoost (XGB), gradient boosting regression (GBR) and deep neural network (DNN), were trained on various protein-ligand structural features derived from the PDBbind dataset to predict protein-ligand binding affinity scores.
The results show that the XGBoost model (R² = 0.84 ± 0.012) performs best, followed by GBR, RF, DNN, DT, LR, PSVR and SVR. The ML-VSPred prediction server accurately screened natural compounds against various target proteins, and it offers a valuable resource for advancing research in natural product-based compound discovery and crop protection. Thereafter, the SDH1 protein of M. oryzae, the fungal pathogen that causes rice blast, was selected as a target and screened against the NatProCP database using the ML-VSPred web server, followed by docking and MD simulation, to find natural compounds that inhibit the SDH1 protein. The MM-PBSA method was used to perform the binding free energy analysis. Two compounds, quercetin and cinchonine, were identified as showing strong binding affinity, with binding free energy (ΔG_bind) values of -89.27 kJ/mol and -82.03 kJ/mol, respectively, compared with -76.82 kJ/mol for the reference compound azoxystrobin. These in silico findings can be further validated through biochemical and structural investigation to explore the potential of these natural compounds for targeting the M. oryzae SDH1 protein.
Keywords: NatProCP, MMFF94, ML-VSPred, linear regression (LR), random forest (RF), decision tree (DT), support vector regression (SVR), polynomial SVR (PSVR), XGBoost (XGB), gradient boosting regression (GBR), deep neural network (DNN), PDBbind, SDH1, docking, MD simulation, MM-PBSA.
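The abstract compares the regression models by their coefficient of determination and reports the best score as a mean with a spread. The sketch below shows one way such a comparison could be computed. It is not the thesis code: the function names (`r2_score`, `summarize`, `rank_models`) are hypothetical, the reading of "±" as a standard deviation over repeated evaluations is an assumption, and the numbers are made-up toy values, not PDBbind data or thesis results.

```python
from statistics import mean, stdev


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize(scores):
    """Return (mean, sample standard deviation) of repeated R² scores.

    Assumption: the reported "±" is a standard deviation over folds or runs.
    """
    return mean(scores), stdev(scores)


def rank_models(scores_by_model):
    """Return model names sorted by mean R², best first."""
    return sorted(
        scores_by_model,
        key=lambda name: mean(scores_by_model[name]),
        reverse=True,
    )


# Toy binding-affinity values, invented purely to exercise the functions.
y_true = [4.0, 5.0, 6.0, 7.0]
y_pred = [4.5, 5.0, 5.5, 7.0]
toy_r2 = r2_score(y_true, y_pred)

# Hypothetical per-fold scores for two placeholder models.
toy_scores = {"model_a": [0.5, 0.6], "model_b": [0.8, 0.9]}
toy_ranking = rank_models(toy_scores)

print(f"R2 on toy data: {toy_r2:.3f}")
for name in toy_ranking:
    m, s = summarize(toy_scores[name])
    print(f"{name}: R2 = {m:.2f} ± {s:.3f}")
```

In practice, a library routine such as scikit-learn's `r2_score` would usually replace the hand-written function. The explicit version is shown only to make the definition visible.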
Publications (1)
Elevating the rice blast disease immunity through CPKA protein targeting in Magnaporthe oryzae (M. oryzae) with natural compounds
Nimai Charan Mahanandia, Satyaranjan Biswal, Dwijesh Chandra Mishra, Sudhir Srivastava, Krishna Kumar Chaturvedi, Sneha Murmu, Anu Sharma, Girish Kumar Jha & Mohammad Samir Farooqi