Soutrik Mukherjee
M.Sc. Bioinformatics
1 yr 10 mo
Duration
Research Thesis
Title
Identification of bacteriophage from the metagenomic data of Ganga and Yamuna Rivers
Objectives
Identification of bacteriophages
Annotation of the phages
Abstract
Microbes are important to every aspect of life on Earth, human and otherwise. Every system in the biosphere is shaped by the almost limitless ability of microbes to transform the world around them. Identifying bacteriophages from various regions of the Ganga and Yamuna rivers is an important step toward knowing the abundance of the different bacteriophage species. Because bacteriophages play an important role in riverine systems by keeping bacterial growth in check, understanding their abundance matters. Further, very few studies have annotated bacteriophages identified from the Ganga and Yamuna rivers. Sediment samples from various regions of the Ganga and Yamuna rivers, namely Balkeshwar-Shivpuri-Agra, Koteswar-Ganga, Rasulabad-Ganga, Sahi-Dabad-Ganga, Taj-Gung-Yamuna, Triveni-Sangam-Ganga, Yamuna-Expressway-Agra and Bagwan-Ganga, were collected by the ICAR-Central Inland Fisheries Research Institute under the CABIN project. Two approaches were followed to identify bacteriophages. The first binned the metagenomic contig data with the MetaBAT2 tool and then distinguished bacteriophage sequences with MARVEL, a machine-learning-based tool. The second was an alignment-based approach using BLASTN, with the contigs of the metagenomic samples as the query and a database built from bacteriophage sequences downloaded from NCBI. Across the 9 datasets, MARVEL reported bacteriophage sequences in two bins of the Balkeshwar-Ganga contig data. The unique sequence data (genes, proteins) were functionally annotated automatically with the bioinformatics software Blast2GO. A BLAST table quantifying the bacteriophage species in the samples was generated. Aeribacillus phage AP45 (complete genome) was the most abundant phage at all 9 sites of the Ganga and Yamuna rivers, and gene ontology pie charts describe the biological process, cellular component and molecular function categories.
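As an illustration of how a BLAST table can be turned into per-species counts, the following is a minimal Python sketch. It is not the procedure used in the thesis. It assumes the BLASTN results were written in the standard 12-column tabular format (-outfmt 6), keeps the highest-bitscore hit per contig, and counts contigs per subject sequence. The contig and phage identifiers and the 1e-5 E-value cutoff are hypothetical choices made only for the example.

```python
from collections import Counter

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(lines, max_evalue=1e-5):
    """Map each contig to its top-scoring (subject, bitscore) hit.

    max_evalue is an assumed cutoff for this sketch, not a value from the thesis.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < len(COLUMNS):
            continue  # skip malformed rows
        row = dict(zip(COLUMNS, fields))
        if float(row["evalue"]) > max_evalue:
            continue  # discard weak alignments
        bitscore = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or bitscore > current[1]:
            best[row["qseqid"]] = (row["sseqid"], bitscore)
    return best


def phage_abundance(lines, max_evalue=1e-5):
    """Count how many contigs have each subject sequence as their best hit."""
    hits = best_hits(lines, max_evalue)
    return Counter(subject for subject, _ in hits.values())


# Hypothetical rows; real input would be read from the BLASTN output file.
example = [
    "contig_1\tphage_A\t98.5\t1200\t18\t0\t1\t1200\t50\t1249\t0.0\t2100",
    "contig_1\tphage_B\t90.0\t800\t80\t0\t1\t800\t10\t809\t1e-150\t950",
    "contig_2\tphage_A\t95.0\t600\t30\t0\t1\t600\t5\t604\t1e-120\t800",
    "contig_3\tphage_B\t85.0\t300\t45\t0\t1\t300\t7\t306\t1e-3\t90",
    "contig_4\tphage_B\t97.0\t500\t15\t0\t1\t500\t3\t502\t1e-100\t700",
]
counts = phage_abundance(example)
print(counts.most_common())
```

Counting only the best hit per contig avoids counting one contig toward several phages. Other choices, such as weighting by alignment length or read coverage, would give a different abundance estimate.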