BIOINFORMATICS

I have already done the necessary calculations, but I need help explaining the equation used during sequence alignment (shown in bold).

The following is a sequence alignment of two genes from Drosophila melanogaster (fruit fly) that are homologs created by gene duplication.
A. Compute the alignment score using the equation D = y + 2z, where "y" is the total number of mismatches and "z" is the total number of gaps.
B. Compute the alignment score again, this time using the equation D = y + 4z.

Show all work toward computing the above values. Also, describe the role of the coefficient in the two equations (the 2 in 2z and the 4 in 4z) and how it influences the alignment score; further explain its role in deciding which sequence alignment is optimal.

Solution

Role of coefficient in the two equations

The alignment score D is computed from the mismatches and gaps in the alignment. Gaps can be inserted into either sequence to increase the number of matches. If gaps carried no cost, any number of them could be added, and the alignment score would no longer identify the best alignment. To restrict the number of gaps, a gap penalty is therefore introduced. There are three main types of gap penalty: (i) constant (each gap has a fixed cost, irrespective of its length); (ii) linear (the cost is proportional to the length of the gap); and (iii) affine (the most commonly used; it combines a fixed cost for opening a gap with a linear cost for extending it).
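The three penalty types can be sketched as small Python functions. This is only an illustration: the function names and the default costs are hypothetical, and the affine form shown here (an opening cost for the first gap position plus an extension cost for each further position) is one common convention among several.

```python
def constant_gap_penalty(gap_length, penalty=2):
    """Fixed cost for a gap, regardless of its length (assumed cost: 2)."""
    return penalty if gap_length > 0 else 0


def linear_gap_penalty(gap_length, per_position=2):
    """Cost proportional to the gap length (assumed cost per position: 2)."""
    return per_position * gap_length


def affine_gap_penalty(gap_length, gap_open=4, gap_extend=1):
    """Opening cost for the first position, extension cost for each further one.

    Assumed convention: gap_open + gap_extend * (gap_length - 1).
    """
    if gap_length <= 0:
        return 0
    return gap_open + gap_extend * (gap_length - 1)


# Cost of a single gap spanning three positions under each scheme
for name, fn in [("constant", constant_gap_penalty),
                 ("linear", linear_gap_penalty),
                 ("affine", affine_gap_penalty)]:
    print(name, fn(3))
```

With these assumed defaults, a three-position gap costs 2 under the constant scheme and 6 under both the linear and affine schemes. The equations D = y + 2z and D = y + 4z in the question correspond to the linear case if z is taken as the number of gap positions.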

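The coefficient in the two equations (the 2 in 2z and the 4 in 4z) is the gap penalty. Because D adds up the mismatches and the weighted gaps, a lower D indicates a better alignment, and the higher the coefficient, the more each gap adds to D. For the same alignment, D = y + 4z is therefore larger than D = y + 2z whenever the alignment contains at least one gap. When candidate alignments are compared, the one with the lowest D is taken as optimal, so a larger coefficient favors alignments with fewer gaps, even if they contain more mismatches.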
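The effect of the coefficient can be illustrated with a short Python sketch. The sequences below are hypothetical and are not the Drosophila alignment from the question; the function name `alignment_distance` is also an assumption, as is the choice to count z as the number of alignment columns containing a gap character ("-").

```python
def alignment_distance(seq_a, seq_b, gap_coefficient):
    """Return D = y + k*z for two aligned sequences of equal length.

    y = columns where both residues are present but differ (mismatches)
    z = columns where either sequence has a gap character '-' (assumption)
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    mismatches = 0
    gaps = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            gaps += 1
        elif a != b:
            mismatches += 1
    return mismatches + gap_coefficient * gaps


# Two hypothetical alignments of the same pair of sequences
gapped = ("GATTACAT-", "G-TTACATT")    # y = 0, z = 2
ungapped = ("GATTACAT", "GTTACATT")    # y = 5, z = 0

for k in (2, 4):
    d_gapped = alignment_distance(*gapped, k)
    d_ungapped = alignment_distance(*ungapped, k)
    print(f"k={k}: gapped D={d_gapped}, ungapped D={d_ungapped}")
```

With a coefficient of 2, the gapped alignment scores D = 4 against D = 5 for the ungapped one, so the gapped alignment is preferred. With a coefficient of 4, the gapped alignment scores D = 8 while the ungapped one stays at D = 5, so the preference switches to the ungapped alignment.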

Tutor: Dr Jack