Surapuram Aswini
1 yr 7 mo
Duration
Research Thesis
Title
Integrating GWAS Module with HtP-DAP for SNP-trait Associations Mining
Objectives
1. To develop data pre-processing and data handling modules for high-throughput genomics and phenomics data to facilitate GWAS analysis.
2. To integrate GWAS analysis tools and develop a result visualization module within the HtP-DAP software.
Abstract
Genome-wide association studies (GWAS) are a key method for identifying genetic variants associated with traits. They are important for understanding the genetic basis of complex traits, which can aid crop improvement, human health research, and livestock breeding. This thesis integrates a GWAS analysis tool into the existing phenomics data analysis platform, HtP-DAP, to streamline GWAS analysis workflows. The tool addresses key challenges in GWAS through preprocessing capabilities that include filtering markers by allele frequency thresholds, imputing missing genotypic data, and converting file formats for compatibility with various analysis pipelines. It also provides a set of relatedness analysis functions: kinship estimation, principal component analysis (PCA), and multi-dimensional scaling (MDS). These analyses characterize population structure and relatedness among individuals, which supports more accurate GWAS results. The GWAS analysis supports both single-locus models, which test individual markers for trait associations one at a time, and multi-locus models, which consider multiple markers jointly. Result visualization is a key component of the tool: users can generate Manhattan plots to highlight significant associations, circular Manhattan plots for a more compact genome-wide view, and Q-Q plots to assess the quality of the GWAS results, all in a form suitable for publication or further research.
The tool’s backend uses the GAPIT R package, which is known for handling large genomic datasets efficiently. GAPIT performs the GWAS computations, so the tool remains practical for large-scale datasets. By incorporating this GWAS tool within the HtP-DAP platform, this study bridges the gap between phenotypic data from high-throughput phenotyping and genotypic data from modern genomic studies. The integration allows users to move from data collection to biological insight within a single platform.
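The preprocessing steps named in the abstract (allele frequency filtering and imputation of missing genotypes) can be illustrated with a minimal sketch. This is not the HtP-DAP or GAPIT implementation. It assumes genotypes are coded as 0/1/2 counts of one allele with NaN for missing calls, uses simple per-marker mean imputation, and takes 0.05 as an example threshold. The function name `filter_and_impute` is hypothetical.

```python
import numpy as np

def filter_and_impute(genotypes, maf_threshold=0.05):
    """Drop low-frequency markers, then mean-impute missing calls.

    genotypes: samples x markers array, assumed coded 0/1/2 with NaN
    for missing calls (an assumption of this sketch).
    Returns the filtered, imputed matrix and the boolean keep-mask.
    """
    g = np.asarray(genotypes, dtype=float)

    # Allele frequency per marker, computed from non-missing calls only.
    allele_freq = np.nanmean(g, axis=0) / 2.0
    maf = np.minimum(allele_freq, 1.0 - allele_freq)

    # Keep markers whose minor allele frequency meets the threshold.
    keep = maf >= maf_threshold
    kept = g[:, keep]  # boolean indexing returns a copy

    # Replace each missing call with the mean of its marker column.
    col_means = np.nanmean(kept, axis=0)
    missing_rows, missing_cols = np.where(np.isnan(kept))
    kept[missing_rows, missing_cols] = col_means[missing_cols]
    return kept, keep

# Toy data: 4 samples x 3 markers; the third marker is monomorphic.
toy = [
    [0, 1, 0],
    [2, np.nan, 0],
    [1, 1, 0],
    [0, 2, 0],
]
filtered, keep = filter_and_impute(toy)
```

In this toy input the third marker has a minor allele frequency of zero and is dropped, and the single missing call is replaced by its marker's mean. Model-based imputation methods would take the place of the mean step in a production pipeline.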