Hello Im trying to modify an Rscript below to calculate the

Hello; I’m trying to modify an R-script (below) to calculate the Viterbi path for the following nucleotide sequence: CGCTTCAGGCAAGT

for sequence 1 (coding, c) and sequence 2 (noncoding, n)

Initial transition probablities:

a0c= a0n =0.5

ann= anc =0.5

acc =0.55 acn= 0.45

where, aij is the transition probability)

The emission probabilities for the 4 nucleotides (A, C, G, T) are:

pA=0.20 pC=0.30 pG=0.36 pT=0.14 for the coding sequence (c), and

pA=0.19 pC=0.28 pG=0.33, and pT=0.20 for the non-coding sequence (n)

I got this example R-script from the CRAN project, and need help populating the correct values in each variable and/or otherwise modifying the script.

Here is the example script:

#Initialise HMM

hmm = initHMM(c(\"A\",\"B\"), c(\"L\",\"R\"), transProbs=matrix(c(.6,.4,.4,.6),2),

emissionProbs=matrix(c(.6,.4,.4,.6),2))

print(hmm)

# Sequence of observations

observations = c(\"L\",\"L\",\"R\",\"R\")

# Calculate Viterbi path

viterbi = viterbi(hmm,observations)

print(viterbi)

Here is what I have so far:

# Initialise HMM

hmm = initHMM(c(\"1\", \"2\", \"3\", \"4\"), c(\"A\", \"C\", \"G\", \"T\"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),transProbs=matrix(c(0.20,0.30,0.36,0.14),emissionProbs=matrix(c(0.20,0.30,0.36,0.14))

print(hmm)

# Sequence of observations

observations=c(\"C\",\"G\",\"C\",\"T\",\"T\",\"C\",\"A\",\"G\",\"G\",\"C\",\"A\",\"A\",\"G\")

# Calculate Viterbi path

viterbi = viterbi(hmm,observations)

print(viterbi)

Here is the console output:

# Initialise HMM

> hmm = initHMM(c(\"1\", \"2\", \"3\", \"4\")), c(\"A\", \"C\", \"G\", \"T\")), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225)),transProbs=matrix(c(0.20,0.30,0.36,0.14)),emissionProbs=matrix(c(0.20,0.30,0.36,0.14))

Error: unexpected \',\' in \"hmm = initHMM(c(\"1\", \"2\", \"3\", \"4\")),\"

>

> print(hmm)

$States

[1] \"A\" \"B\"

$Symbols

[1] \"L\" \"R\"

$startProbs

A   B

0.5 0.5

$transProbs

    to

from   A   B

   A 0.6 0.4

   B 0.4 0.6

$emissionProbs

      symbols

states   L   R

     A 0.6 0.4

     B 0.4 0.6

> # Sequence of observations

> observations=c(\"C\",\"G\",\"C\",\"T\",\"T\",\"C\",\"A\",\"G\",\"G\",\"C\",\"A\",\"A\",\"G\")

> # Calculate Viterbi path

> viterbi = viterbi(hmm,observations)

Error in hmm$emissionProbs[state, observation[1]] :

subscript out of bounds

> print(viterbi)

[1] \"A\" \"A\" \"B\" \"B\"

>

Solution

you have made a small mistake. the first line will be

hmm = initHMM( c(\"1\", \"2\", \"3\", \"4\"), c(\"A\", \"C\", \"G\", \"T\"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),2),transProbs=matrix(c(0.20,0.30,0.36,0.14),2),emissionProbs=matrix(c(0.20,0.30,0.36,0.14),2))

here is the entire code and output

code

require(HMM)
# Initialise HMM
hmm = initHMM( c(\"1\", \"2\", \"3\", \"4\"), c(\"A\", \"C\", \"G\", \"T\"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),2),transProbs=matrix(c(0.20,0.30,0.36,0.14),2),emissionProbs=matrix(c(0.20,0.30,0.36,0.14),2))
print(hmm)
# Sequence of observations
observations=c(\"C\",\"G\",\"C\",\"T\",\"T\",\"C\",\"A\",\"G\",\"G\",\"C\",\"A\",\"A\",\"G\")
# Calculate Viterbi path
viterbi = viterbi(hmm,observations)
print(viterbi)

OUTPUT

Hello; I’m trying to modify an R-script (below) to calculate the Viterbi path for the following nucleotide sequence: CGCTTCAGGCAAGT for sequence 1 (coding, c) a
Hello; I’m trying to modify an R-script (below) to calculate the Viterbi path for the following nucleotide sequence: CGCTTCAGGCAAGT for sequence 1 (coding, c) a
Hello; I’m trying to modify an R-script (below) to calculate the Viterbi path for the following nucleotide sequence: CGCTTCAGGCAAGT for sequence 1 (coding, c) a

Get Help Now

Submit a Take Down Notice

Tutor
Tutor: Dr Jack
Most rated tutor on our site