In this study, we are interested in how people explore a "gridworld" (see figure below). A gridworld contains several obstacles (black cells) and four targets, each worth between 0 and 100 points. The targets are represented by four colored objects (blue, green, orange, and purple). Only one of the four objects has a high value; the other three have low values.
In this task, all the obstacles and the four targets are concealed, so you will be presented with a blank 11 x 11 grid, as illustrated in the figure below:
Starting at an initial position (black dot), your goal is to find out where the highest-value target is. To do this, you need to make sequential decisions about going up, down, left, or right to explore the space. You will be given 40 opportunities (i.e., episodes) to explore the gridworld, with a limit of 31 decisions in each episode. An episode ends when you reach one of the four targets or when you reach the 31-step limit without reaching any target (see figure below).
You earn points by consuming (i.e., reaching) a target, but you are also penalized 1 point for each decision you make (i.e., a movement cost). You are also penalized 5 points for walking into an obstacle.
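To make the scoring rules concrete, here is a minimal Python sketch of how the points for a single episode could be tallied. It is an illustration only, not the software used in the study. The function name `run_episode`, the coordinate convention, and the example layout are hypothetical. The sketch also assumes details the instructions do not specify: that walking into an obstacle costs the 1-point movement cost plus the 5-point penalty and leaves you in place, and that a move off the edge of the grid leaves you in place with only the movement cost.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]  # (row, column), a hypothetical coordinate convention

GRID_SIZE = 11      # 11 x 11 grid
MAX_STEPS = 31      # decision limit per episode
STEP_COST = 1       # penalty for each decision
OBSTACLE_COST = 5   # penalty for walking into an obstacle

MOVES: Dict[str, Cell] = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def run_episode(
    start: Cell,
    actions: List[str],
    obstacles: Set[Cell],
    targets: Dict[Cell, int],
) -> Tuple[int, Optional[Cell]]:
    """Return (net score, target reached or None) for one episode."""
    pos = start
    score = 0
    # Only the first MAX_STEPS decisions count.
    for action in actions[:MAX_STEPS]:
        score -= STEP_COST
        d_row, d_col = MOVES[action]
        nxt = (pos[0] + d_row, pos[1] + d_col)
        if nxt in obstacles:
            # Assumption: the obstacle penalty is added to the movement
            # cost, and the position does not change.
            score -= OBSTACLE_COST
            continue
        if not (0 <= nxt[0] < GRID_SIZE and 0 <= nxt[1] < GRID_SIZE):
            # Assumption: moving off the grid leaves the position unchanged.
            continue
        pos = nxt
        if pos in targets:
            # Reaching a target adds its value and ends the episode.
            score += targets[pos]
            return score, pos
    # Step limit (or end of the action list) reached without a target.
    return score, None
```

For instance, with a hypothetical target worth 80 points at `(4, 6)` and an obstacle at `(5, 7)`, calling `run_episode((5, 5), ["right", "right", "up"], {(5, 7)}, {(4, 6): 80})` gives a net score of 72: three decisions cost 3 points, the obstacle costs 5 points, and the target adds 80.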