Mohan Babu H S
M.Sc. Bioinformatics
2 yr 4 mo
Duration
Research Thesis
Title
Development of Non-B DNA Database for Rice and Maize
Abstract
Apart from the canonical B-form, DNA can adopt several other conformations that are biologically functional. With this in mind, a database of non-B DNA motifs was created for Rice and Maize. Chromosome sequences and gene information for Rice (Oryza sativa Japonica Group) and Maize (Zea mays) were collected from the NCBI database. Seven major non-B DNA forms, namely A-DNA, Z-DNA, G-quadruplex motifs, inverted repeats, direct repeats, mirror repeats and short tandem repeats, were predicted in the Rice and Maize chromosomes using the freely available non-B DNA motif search tool. The results were used to build the database on the WAMP (Windows, Apache, MySQL, PHP) stack. The database architecture has three tiers: clients at the top, the web server in the middle and the MySQL database at the bottom. Because the Maize chromosomes exceed the analysis limit of the motif search algorithm, a BioPerl script using the SeqIO module was used to divide them into subsequences before motif prediction. A window analysis was then carried out on the flanking regions at the subsequence boundaries to recover motifs that might otherwise have been missed. The interface was built with the client-side languages HTML (HyperText Markup Language), CSS (Cascading Style Sheets) and JavaScript, with PHP (Hypertext Preprocessor) as the server-side scripting language. SQL queries against the MySQL database drive the general search option, which retrieves motif data, and the advanced search option, in which the user can search by gene ID, accession number or description of the protein product encoded by the gene. In addition to these search options, the interface provides a menu for crop-wise and chromosome-wise statistics of non-B DNA motifs. The glossary menu gives definitions of the technical terms used in the project.
A link to the NCBI Genome Data Viewer is provided to visualize motif locations on the genome. Links are also provided to the user manual, the index files for Rice and Maize, and the tools and resources used in the research project. Because these motifs are involved in critical cellular functions, their study may be important for understanding economically important physiological phenomena in other crops and in animals of agricultural importance; the database can therefore be extended further to meet this objective.
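The subsequence splitting and window analysis described in the abstract can be illustrated with a minimal sketch. The thesis used a BioPerl script with the SeqIO module; the Python below is not that script, only an illustration of the general idea under stated assumptions. The function names `split_with_overlap` and `to_chromosome_coords`, and the chunk size and overlap values, are hypothetical. Overlapping adjacent subsequences is one possible way to cover the boundary regions where a motif could be cut in two.

```python
from typing import Iterator, Tuple


def split_with_overlap(
    sequence: str, chunk_size: int, overlap: int
) -> Iterator[Tuple[int, str]]:
    """Yield (offset, subsequence) pairs that cover `sequence`.

    Consecutive subsequences share `overlap` bases, so a motif that
    straddles a cut point is fully contained in at least one chunk,
    provided the motif is no longer than overlap + 1 bases.
    All names and parameters here are hypothetical, for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    for start in range(0, len(sequence), step):
        yield start, sequence[start:start + chunk_size]
        # Stop once a chunk has reached the end of the sequence.
        if start + chunk_size >= len(sequence):
            break


def to_chromosome_coords(
    offset: int, local_start: int, local_end: int
) -> Tuple[int, int]:
    """Map motif coordinates reported within a chunk back to the chromosome."""
    return offset + local_start, offset + local_end


if __name__ == "__main__":
    # Toy sequence; real chromosomes would be read from FASTA files.
    for offset, chunk in split_with_overlap("ACGTACGTAC", chunk_size=4, overlap=2):
        print(offset, chunk)
```

Motifs found in an overlapping region may be reported by two adjacent chunks, so after mapping coordinates back to the chromosome, duplicate hits would need to be removed before loading the results into the database.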