Lal Dhari Patel

M.Sc. Bioinformatics

21256

Sh. Sanjeev Kumar

2 yr 3 mo

Duration

Research Thesis

Title

Deep Learning for Predicting Breeding Value using High Throughput Genotyping and Phenotyping

Objectives

Objective 1: To develop a deep learning model for predicting breeding value for a drought-responsive trait in wheat.
Objective 2: To evaluate the performance of the developed model in predicting breeding value for a drought-responsive trait in wheat.

Abstract

Accurate estimation of breeding value is of key importance in a crop breeding program. Traditionally, statistical methods have been widely used to predict breeding values from genotypic effects. These methods usually assume that genotypic effects are independently distributed and follow a prior distribution such as the Gaussian. Such assumptions may limit the prediction of breeding values from high-throughput genotyping data, which carry very precise information about genotypes. At the same time, harnessing this precise genotypic information requires equally precise phenotyping. With conventional phenotyping, precise measurement is laborious, expensive and sometimes impossible. To overcome these limitations, the present work proposes the use of deep learning for predicting breeding value by exploiting the full potential of high-throughput genotyping in conjunction with high-throughput phenotyping. A deep learning-based CNN model was trained to predict breeding value using high-throughput genotyping and phenotyping data from a wheat dataset consisting of 184 recombinant inbred lines (RILs), each with 3121 filtered SNPs. Altogether, data on six traits, recorded under two environments (controlled and drought conditions), were used for the prediction of breeding value. First, the whole dataset was randomly divided into a training dataset and a testing dataset. The CNN models were trained on the training dataset, which contained 80% of the total data, and the remaining 20% was used for testing. Two parameters were used to test and evaluate the trained deep learning model.
The trained and tested deep learning model was compared with existing statistical models, namely GBLUP (genomic best linear unbiased prediction), rrBLUP (ridge regression best linear unbiased prediction) and Bayesian LASSO (Bayesian least absolute shrinkage and selection operator). The results show that the deep learning model performed better than the statistical methods considered.
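
The evaluation protocol described above (a random 80/20 split of 184 lines with 3121 SNPs, followed by comparison against a ridge-type baseline) can be illustrated with a minimal sketch. This is not the thesis code and not the CNN itself: the genotypes and phenotypes are synthetic, the marker coding (-1/1), the fixed shrinkage value `lam`, and the helper names `split_indices` and `ridge_marker_effects` are all assumptions made for illustration. The abstract does not name its two evaluation parameters, so the Pearson correlation and mean squared error used here are also assumptions. Unlike rrBLUP, which estimates the shrinkage parameter from variance components, this sketch simply fixes it.

```python
import numpy as np


def split_indices(n_lines, test_fraction=0.2, seed=0):
    """Randomly split line indices into training and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_lines)
    n_test = int(round(n_lines * test_fraction))
    return order[n_test:], order[:n_test]


def ridge_marker_effects(X, y, lam):
    """Ridge estimates of marker effects.

    Solved in the dual form X^T (X X^T + lam*I)^{-1} y, which is cheaper
    than the primal form when there are fewer lines than markers.
    """
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha


# Synthetic stand-in for the wheat data (assumption: homozygous -1/1 coding).
rng = np.random.default_rng(42)
n_lines, n_snps = 184, 3121
X = rng.choice([-1.0, 1.0], size=(n_lines, n_snps))

# Assumption: 50 causal markers with Gaussian effects plus residual noise.
true_effects = np.zeros(n_snps)
causal = rng.choice(n_snps, size=50, replace=False)
true_effects[causal] = rng.normal(0.0, 1.0, size=50)
y = X @ true_effects + rng.normal(0.0, 2.0, size=n_lines)

# 80% of the lines for training, 20% for testing.
train_idx, test_idx = split_indices(n_lines, test_fraction=0.2, seed=0)

# Centre markers and phenotype using training-set statistics only.
marker_mean = X[train_idx].mean(axis=0)
y_mean = y[train_idx].mean()

beta = ridge_marker_effects(
    X[train_idx] - marker_mean, y[train_idx] - y_mean, lam=100.0
)

# Predicted values for the held-out lines.
pred = (X[test_idx] - marker_mean) @ beta + y_mean

# Assumed evaluation metrics: Pearson correlation and mean squared error.
r = np.corrcoef(pred, y[test_idx])[0, 1]
mse = np.mean((pred - y[test_idx]) ** 2)
print(f"test lines: {len(test_idx)}, Pearson r: {r:.3f}, MSE: {mse:.3f}")
```

A deep learning model would be scored on the same held-out lines with the same metrics, so the baseline and the network are compared on identical data. With only 37 test lines a single split gives a noisy estimate, so repeating the split with different seeds is a common precaution.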

Academic Details

Program

MSc

Roll Number

21256

Batch Year

2019-20

Fellowship

Institute

Admission

Aug 2019

Completion

Nov 2021