Consider a random i.i.d. model of a nucleotide sequence in which the GC content is 35%. Assume that the frequencies of G and C are equal, and that the frequencies of A and T are equal. What is the expected probability of each nucleotide?
Assuming these nucleotide frequencies, what is the probability of encoding a translational STOP?
The probabilities of all amino acids should sum to 1, and the probabilities of all codons sum to 1. However, STOP is not an amino acid. We should still be able to calculate the probability of any amino acid from the nucleotide frequencies by proper normalization. In this case, we normalize by the total probability of the codons that encode amino acids. So: P(amino acid) = P(all codons for the amino acid) / P(all codons that encode valid amino acids).
Using proper normalization, what is the probability of each of the following amino acids? Leu (L), Ile (I), Trp (W)
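One way to write this normalization out, as a sketch: the symbols p_b (probability of base b), C(a) (the set of codons for amino acid a) and S (the set of STOP codons) are notation introduced here, not part of the problem statement. Under the i.i.d. model a codon's probability is the product of its three base probabilities, and the denominator is one minus the total STOP probability.

```latex
P(c_1 c_2 c_3) = p_{c_1}\, p_{c_2}\, p_{c_3},
\qquad
P(a) = \frac{\sum_{c \in C(a)} P(c)}{1 - \sum_{s \in S} P(s)}
```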
Solution
The GC content of the DNA is 35%.
By Chargaff's rule, the number of adenine residues equals the number of thymine residues, so A = T, and the number of guanine residues equals the number of cytosine residues, so G = C. This matches the assumption stated in the problem.
So, G + C = 35%
A + T = 100% - 35% = 65%
So, A = 65%/2 = 32.5%, i.e. probability of A = 13/40
and T = 32.5% = 13/40
Similarly, G = 35%/2 = 17.5% = 7/40
and C = 17.5% = 7/40
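The arithmetic above can be checked with a short Python sketch, which also carries the same frequencies forward to the STOP and amino-acid questions. It assumes the standard genetic code (TAA, TAG and TGA as the only STOP codons, and the usual codon sets for Leu, Ile and Trp), which the problem does not state explicitly. The names `codon_prob` and `amino_acid_prob` are illustrative.

```python
from fractions import Fraction

GC_CONTENT = Fraction(35, 100)

# From the problem: P(G) = P(C) and P(A) = P(T).
p = {
    "G": GC_CONTENT / 2,
    "C": GC_CONTENT / 2,
    "A": (1 - GC_CONTENT) / 2,
    "T": (1 - GC_CONTENT) / 2,
}


def codon_prob(codon):
    """Probability of a codon under the i.i.d. model: product of its base probabilities."""
    result = Fraction(1)
    for base in codon:
        result *= p[base]
    return result


# Assumption: standard genetic code.
STOP_CODONS = ["TAA", "TAG", "TGA"]
CODONS = {
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Ile": ["ATT", "ATC", "ATA"],
    "Trp": ["TGG"],
}

p_stop = sum(codon_prob(c) for c in STOP_CODONS)
p_sense = 1 - p_stop  # total probability of codons that encode an amino acid


def amino_acid_prob(name):
    """P(amino acid) = P(its codons) / P(all codons that encode amino acids)."""
    return sum(codon_prob(c) for c in CODONS[name]) / p_sense


if __name__ == "__main__":
    for base in "ACGT":
        print(base, p[base], float(p[base]))
    print("STOP", p_stop, float(p_stop))
    for name in CODONS:
        print(name, amino_acid_prob(name), float(amino_acid_prob(name)))
```

Using `Fraction` keeps every value exact, so the nucleotide probabilities come out as 13/40 and 7/40, matching the hand calculation. Under these assumptions the STOP probability is 4563/64000, about 0.0713.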
