HIDDEN MARKOV MODEL FOR ANTICIPATING THE OCCURRENCE OF CANCEROUS TISSUES IN EUKARYOTIC ORGANISMS.
Oluwafemi A. Sarumi and Adebayo O. Adetunmbi.
The Federal University of Technology, Akure, Ondo State, Nigeria.
This research aims at the early identification of genes with inherent traits and the potential to develop into cancerous tissues in eukaryotic organisms.
The widespread occurrence of deoxyribonucleic acid (DNA) palindromes accentuates gene amplification in eukaryotic organisms. Investigations in model systems have demonstrated that palindrome formation can be an early rate-limiting step in DNA amplification. Bioinformatics research has also found that early detection of palindromes in an organism's genome can aid the prediction of cancerous tissue growth in cells and of infertility in males. A palindrome is a character sequence that reads the same forwards and backwards. DNA palindromes are words over the nucleotide alphabet A, T, G, and C that are symmetrical in the sense that they read exactly the same as their complementary sequences in the reverse direction (inverted repeats).
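The definition of a DNA palindrome given above can be made concrete with a short sketch. The Python code below is only an illustration of the reverse-complement test, not the algorithm developed in this research; the function names and the brute-force window scan are assumptions introduced for the example.

```python
# Watson-Crick complement of each nucleotide base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read in the reverse direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_dna_palindrome(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


def find_palindromes(seq: str, min_len: int = 4, max_len: int = 12):
    """Brute-force scan (hypothetical helper): list (start, substring) hits.

    Every window of every length between min_len and max_len is tested.
    Odd-length windows never match, because the middle base would have
    to be its own complement.
    """
    seq = seq.upper()
    hits = []
    for length in range(min_len, max_len + 1):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if is_dna_palindrome(window):
                hits.append((start, window))
    return hits
```

For instance, `is_dna_palindrome("GAATTC")` holds because the reverse of GAATTC is CTTAAG, whose complement is again GAATTC. A scan of this kind is quadratic in the window range and is meant only to show the definition at work on short inputs.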
In this research, we developed an algorithm based on Hidden Markov Models (HMMs) to discover implicit and previously unknown palindrome sequences in large volumes of DNA sequences from eukaryotic organisms. HMMs are statistical models that describe the evolution of observable events that depend on internal factors which are not directly observable. Our algorithm was implemented on the Spark parallel framework so that it scales easily as the volume of the datasets increases.
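To illustrate how an HMM relates observable symbols to unobservable internal states, the following Python sketch decodes the most likely hidden-state path for a nucleotide string with the Viterbi algorithm. It is a generic toy example and not the palindrome model of this research: the two states (`AT_rich`, `GC_rich`) and all probabilities are hypothetical values chosen for illustration.

```python
import math

# Hypothetical two-state model: every number below is an assumption.
STATES = ("AT_rich", "GC_rich")
START = {"AT_rich": 0.5, "GC_rich": 0.5}
TRANS = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.9},
}
EMIT = {
    "AT_rich": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
    "GC_rich": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4},
}


def viterbi(observations: str) -> list:
    """Return the most probable hidden-state path for the observed bases."""
    if not observations:
        return []
    # Work in log space to avoid underflow on long sequences.
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
               for s in STATES}]
    backptr = [{}]
    for obs in observations[1:]:
        prev = scores[-1]
        col, ptr = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s.
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            col[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][obs])
            ptr[s] = best
        scores.append(col)
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(backptr[1:]):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

Calling `viterbi("ATATATGCGCGC")` labels each base with one of the two hidden states. In a distributed setting, a decoder of this kind could in principle be applied independently to each chunk of a partitioned sequence, although how the algorithm of this research is mapped onto Spark is not specified here.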
Experimental results show the effectiveness and high performance of our algorithm in mining hidden palindrome sequences from large volumes of genomic data.
Identifying palindrome sequences in the genes of eukaryotic organisms could prompt a treatment procedure that prevents such genes from developing into cancerous tissues.