A subject has a learning rate of alpha = 0.45 and action values of 0.75 for symbol A and 0.65 for symbol B. The subject selects B and receives a reward of 1.
After this trial, which symbol is the subject more likely to select?
Please explain.
Solution
Here we have:
action value for symbol A = 0.75
action value for symbol B = 0.65
learning rate alpha = 0.45
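Under the standard delta-rule (Rescorla-Wagner) update, the delta (prediction error) for symbol B is the reward received minus the current action value of symbol B:
delta = reward - action value of symbol B
= 1 - 0.65
= 0.35
The action value of symbol B is then moved toward the reward by alpha times delta:
new action value of symbol B = 0.65 + 0.45 * 0.35
= 0.65 + 0.1575
= 0.8075
Symbol A was not selected, so its action value stays at 0.75.
After this trial, the subject is more likely to select symbol B, because its updated action value (0.8075) is now higher than that of symbol A (0.75).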
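The update for this trial can be checked with a short Python sketch. It assumes the standard delta-rule form, in which the prediction error is the reward minus the chosen symbol's action value and only the chosen symbol is updated. The function name `update_action_value` and the dictionary `q_values` are illustrative names, not part of the original question.

```python
def update_action_value(q, reward, alpha):
    """Return (delta, new_q) for the chosen option under a delta-rule update."""
    delta = reward - q          # prediction error: reward minus current value
    new_q = q + alpha * delta   # move the value toward the reward by alpha * delta
    return delta, new_q


alpha = 0.45
q_values = {"A": 0.75, "B": 0.65}

# The subject chose B and received a reward of 1; only B is updated (assumption).
delta_b, q_values["B"] = update_action_value(q_values["B"], reward=1.0, alpha=alpha)

# The symbol with the higher action value is the one more likely to be chosen.
preferred = max(q_values, key=q_values.get)
print(round(delta_b, 4), round(q_values["B"], 4), preferred)  # 0.35 0.8075 B
```

If the choice rule is probabilistic (for example, softmax), B is not guaranteed to be chosen, only more likely than A, since its action value is now the larger of the two.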
