Hello Im trying to modify an Rscript below to calculate the
Hello; I’m trying to modify an R-script (below) to calculate the Viterbi path for the following nucleotide sequence: CGCTTCAGGCAAGT
for sequence 1 (coding, c) and sequence 2 (noncoding, n)
Initial transition probablities:
a0c= a0n =0.5
ann= anc =0.5
acc =0.55 acn= 0.45
where, aij is the transition probability)
The emission probabilities for the 4 nucleotides (A, C, G, T) are:
pA=0.20 pC=0.30 pG=0.36 pT=0.14 for the coding sequence (c), and
pA=0.19 pC=0.28 pG=0.33, and pT=0.20 for the non-coding sequence (n)
I got this example R-script from the CRAN project, and need help populating the correct values in each variable and/or otherwise modifying the script.
Here is the example script:
#Initialise HMM
hmm = initHMM(c(\"A\",\"B\"), c(\"L\",\"R\"), transProbs=matrix(c(.6,.4,.4,.6),2),
emissionProbs=matrix(c(.6,.4,.4,.6),2))
print(hmm)
# Sequence of observations
observations = c(\"L\",\"L\",\"R\",\"R\")
# Calculate Viterbi path
viterbi = viterbi(hmm,observations)
print(viterbi)
Here is what I have so far:
# Initialise HMM
hmm = initHMM(c(\"1\", \"2\", \"3\", \"4\"), c(\"A\", \"C\", \"G\", \"T\"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),transProbs=matrix(c(0.20,0.30,0.36,0.14),emissionProbs=matrix(c(0.20,0.30,0.36,0.14))
print(hmm)
# Sequence of observations
observations=c(\"C\",\"G\",\"C\",\"T\",\"T\",\"C\",\"A\",\"G\",\"G\",\"C\",\"A\",\"A\",\"G\")
# Calculate Viterbi path
viterbi = viterbi(hmm,observations)
print(viterbi)
Here is the console output:
# Initialise HMM
> hmm = initHMM(c(\"1\", \"2\", \"3\", \"4\")), c(\"A\", \"C\", \"G\", \"T\")), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225)),transProbs=matrix(c(0.20,0.30,0.36,0.14)),emissionProbs=matrix(c(0.20,0.30,0.36,0.14))
Error: unexpected \',\' in \"hmm = initHMM(c(\"1\", \"2\", \"3\", \"4\")),\"
>
> print(hmm)
$States
[1] \"A\" \"B\"
$Symbols
[1] \"L\" \"R\"
$startProbs
A B
0.5 0.5
$transProbs
to
from A B
A 0.6 0.4
B 0.4 0.6
$emissionProbs
symbols
states L R
A 0.6 0.4
B 0.4 0.6
> # Sequence of observations
> observations=c(\"C\",\"G\",\"C\",\"T\",\"T\",\"C\",\"A\",\"G\",\"G\",\"C\",\"A\",\"A\",\"G\")
> # Calculate Viterbi path
> viterbi = viterbi(hmm,observations)
Error in hmm$emissionProbs[state, observation[1]] :
subscript out of bounds
> print(viterbi)
[1] \"A\" \"A\" \"B\" \"B\"
>
Solution
you have made a small mistake. the first line will be
hmm = initHMM( c(\"1\", \"2\", \"3\", \"4\"), c(\"A\", \"C\", \"G\", \"T\"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),2),transProbs=matrix(c(0.20,0.30,0.36,0.14),2),emissionProbs=matrix(c(0.20,0.30,0.36,0.14),2))
here is the entire code and output
code
require(HMM)
# Initialise HMM
hmm = initHMM( c(\"1\", \"2\", \"3\", \"4\"), c(\"A\", \"C\", \"G\", \"T\"), startProbs=matrix(c(0.25, 0.25, 0.275, 0.225),2),transProbs=matrix(c(0.20,0.30,0.36,0.14),2),emissionProbs=matrix(c(0.20,0.30,0.36,0.14),2))
print(hmm)
# Sequence of observations
observations=c(\"C\",\"G\",\"C\",\"T\",\"T\",\"C\",\"A\",\"G\",\"G\",\"C\",\"A\",\"A\",\"G\")
# Calculate Viterbi path
viterbi = viterbi(hmm,observations)
print(viterbi)
OUTPUT


