Assignment 3: CS696 Programming Problems In Bioinformatics

Describe and discuss different approaches for finding a motif in a genetic string.
DNA, RNA and proteins are three types of very large molecules that are essential to the biological functioning of every living organism; no life could survive without them. Their roles are as follows:
• DNA contains the encoded instructions an organism needs to assemble, maintain and reproduce itself.
• Proteins play an important role in functions such as movement, photosynthesis and vision.
• RNA is used to make proteins from the instructions encoded in DNA.
In short, DNA makes RNA, and RNA makes proteins.
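The flow from DNA to RNA to protein can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions: the input is treated as the coding strand of DNA, so transcription reduces to replacing T with U, and translation reads the RNA three letters at a time using the standard RNA codon table until a stop codon is reached. The names transcribe, translate and CODON_TABLE are chosen here for illustration and do not come from any particular library.

```python
from itertools import product

# Standard RNA codon table, built compactly: codons are enumerated in the
# order UUU, UUC, UUA, UUG, UCU, ... and paired with their amino acid letter.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the RNA string for a DNA coding strand (assumption: T -> U only)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Translate an RNA string into a protein string, stopping at a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


rna = transcribe("ATGGCCTGA")
print(rna)             # AUGGCCUGA
print(translate(rna))  # MA
```

Real transcription and translation involve more machinery (template strands, reading frames, start codons); the sketch only mirrors the string-level view used in this assignment.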
A genetic string is usually one of the following three types of string.

1. DNA strings: DNA stands for deoxyribonucleic acid. DNA carries the instructions for the functioning and development of all living organisms. A DNA molecule usually looks like a double helix because it consists of two biopolymer strands coiled around each other. Each strand is a chain of nucleotides, each containing one of four bases: guanine (G), adenine (A), thymine (T) or cytosine (C). DNA strings are therefore written over the alphabet {A, C, G, T}.

2. RNA strings: RNA stands for ribonucleic acid. Like DNA, RNA is a nucleic acid, but it is usually found as a single strand folded onto itself. RNA is also a chain of nucleotides, each containing one of four bases: guanine (G), adenine (A), uracil (U) or cytosine (C). RNA strings are therefore written over the alphabet {A, C, G, U}.

3. Protein strings: Proteins are large molecules made up of one or more chains of amino acids. They are essential for performing cellular functions. Protein strings are written over an alphabet of 20 letters, namely the letters of the English alphabet except B, O, U, X, J and
Repeats usually recur with small modifications rather than as exact copies. In humans, a common family of repeats is the Alu repeat, which may occur more than a million times in every human genome. We describe an algorithm for this problem in the Algorithm section.
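To make the idea of repeats with small modifications concrete, here is a minimal Python sketch of approximate matching. It is not the algorithm from the Algorithm section. It assumes that a "small modification" means a limited number of substituted letters (Hamming distance), so insertions and deletions are not handled. The function names and the max_mismatches parameter are hypothetical, chosen for illustration.

```python
def hamming_distance(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def approximate_occurrences(text, pattern, max_mismatches):
    """Return 0-based start positions where pattern occurs in text
    with at most max_mismatches substituted letters."""
    k = len(pattern)
    return [
        i
        for i in range(len(text) - k + 1)
        if hamming_distance(text[i:i + k], pattern) <= max_mismatches
    ]


# The copy starting at position 4 (ACGA) differs from ACGT in one letter.
print(approximate_occurrences("ACGTACGAACGT", "ACGT", 1))  # [0, 4, 8]
```

This brute-force scan compares the pattern against every window of the text, which is simple but slow for genome-sized inputs.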
• Finding Motif in Proteins5:
Proteins perform nearly every important function of a cell in living organisms. A structural and functional unit of a protein is called a domain. Each domain is made up of amino acids and is responsible for performing a particular function.
During evolution, proteins form homologous groups, that is, groups of proteins that share a common ancestor. Homologous groups are also called protein families. The proteins in a family usually share the same domain. A component of a domain that is essential for its function is called a motif. Protein motifs usually remain conserved during evolution, which makes them very useful for study. The following describes how proteins are formed from RNA:
As noted earlier, proteins are built from amino acids, each denoted by one of 20 letters of the English alphabet (all letters except O, U, X, Z, J and B). An RNA codon table maps each RNA codon, a triplet of nucleotides, to the letter of an amino acid. An RNA string can be converted to a protein string using the RNA codon

