Ritwika Das
PhD
Alumni • Class of 2017-18

Ph.D. Bioinformatics

Scientist (Bioinformatics), ICAR - Indian Agricultural Statistics Research Institute, New Delhi - 110012
11005
Dr. Anil Rai

Publications: 1
Experiences: 1
Duration: 5 yr 11 mo

Research Thesis

Title

Development of Advanced Learning Based Classification Approach for Fungal Metagenomic Data

Objectives

i) To develop an efficient advanced learning based approach for molecular marker based classification of fungi data from metagenomics data sets
ii) To empirically evaluate and compare the performance of the developed approach with existing software
iii) To develop software for the proposed approach

Abstract

Microorganisms are an inevitable part of the ecosystem, playing beneficial roles such as nutrient mineralization, bioremediation, and organic matter decomposition, as well as causing harm as pathogens. Rapid advancement in NGS technologies has given rise to a new field of study, “metagenomics”, for understanding microbial community composition and function directly from any environmental sample, such as the human gut, skin, soil, ocean, or crop rhizosphere. Accurate binning and taxonomic annotation of raw metagenomic reads is an essential step before subsequent functional analysis. Computational approaches, especially machine learning and deep learning algorithms, have been found to classify prokaryotic microorganisms, viz., bacteria and archaea, from metagenomic datasets more efficiently than the reference-based method using BLAST. However, identification of fungal species from metagenomic data is a highly challenging task due to the complexity of eukaryotic genomes. The Internal Transcribed Spacer (ITS) region is the most widely used DNA marker for the taxonomic annotation of a majority of fungal species. In the present study, a convolutional neural network based approach, CNN_FunBar, has been developed using UNITE+INSDC reference ITS datasets for classifying fungal ITS sequences at all six taxonomic levels, viz., species, genus, family, order, class and phylum, while varying convolution kernel size, filter numbers, k-mer size, unique category numbers and category-wise ITS sequence frequencies. The proposed CNN_FunBar models have produced >93% average accuracy for classifying ITS sequences from balanced datasets with 500 sequences per category and 6-mer frequency features at all the taxonomic levels. The species and genus level CNN_FunBar models, viz., Species_Model.h5 and Genus_Model.h5, could identify 62 species and 41 genera from the simulated fungal metagenomic dataset with classification accuracies of 91.93% and 95.16%, respectively.
The comparative study has suggested that CNN_FunBar could outperform existing fungal taxonomy prediction tools (funbarRF, Mothur, RDP Classifier, and SINTAX) as well as competing machine learning-based algorithms (SVM, KNN, Naive-Bayes, and Random Forest). A web application, CNN_FunBar, has been developed for extracting oligonucleotide frequency features from the input ITS sequences, followed by their classification using the proposed CNN_FunBar models at various taxonomic levels. The developed tool is freely available at https://github.com/ritwika1993/CNN_FunBar_ITS.
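
The oligonucleotide (k-mer) frequency features mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the CNN_FunBar implementation: the function name `kmer_frequencies` is hypothetical, and the choices to normalize counts to relative frequencies and to skip windows containing ambiguous bases (such as N) are assumptions made here for illustration only.

```python
from itertools import product


def kmer_frequencies(sequence, k=6):
    """Return relative k-mer frequencies over A/C/G/T in lexicographic k-mer order.

    Assumptions (not taken from the thesis): counts are normalized to sum to 1,
    and any window containing a base outside A/C/G/T is skipped.
    """
    # Enumerate all 4**k possible k-mers and map each one to a vector position.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}

    counts = [0] * len(kmers)
    total = 0
    seq = sequence.upper()

    # Slide a window of width k across the sequence, one base at a time.
    for start in range(len(seq) - k + 1):
        position = index.get(seq[start:start + k])
        if position is not None:  # window contains only A/C/G/T
            counts[position] += 1
            total += 1

    if total == 0:
        # Sequence shorter than k, or no unambiguous window found.
        return [0.0] * len(kmers)
    return [count / total for count in counts]


if __name__ == "__main__":
    features = kmer_frequencies("ACGTACGTTAGCATGCATGC", k=6)
    print(len(features))  # one feature per possible 6-mer
```

With k = 6 the feature vector has 4^6 = 4096 entries per sequence, which is the size of the 6-mer feature space referred to in the abstract.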

Publications (1)

CNN_FunBar: Advanced Learning Technique for Fungi ITS Region Classification

Ritwika Das, Anil Rai, Dwijesh Chandra Mishra

Genes (2023), NAAS: 8.80, IF: 2.8

Academic Details

Program: PhD
Roll Number: 11005
Batch Year: 2017-18
Fellowship: IARI Institute Fellowship
Admission: Jul 2017
Completion: Jul 2023

Experience

M.Sc. in Bioinformatics from PG School, ICAR - Indian Agricultural Research Institute, New Delhi - 110012

Jul 2015 — Jul 2017