Doctor of Philosophy (PhD)


Computer Science

Document Type



In the field of reinforcement learning, robot task learning in a specific environment with a Markov decision process backdrop has seen much success. But, extending these results to learning a task for an environment domain has not been as fruitful, even for advanced methodologies such as relational reinforcement learning. In our research into robot learning in environment domains, we utilize a form of deictic representation for the robot’s description of the task environment. However, the non-Markovian nature of the deictic representation leads to perceptual aliasing and conflicting actions, invalidating standard reinforcement learning algorithms. To circumvent this difficulty, several past research studies have modified and extended the Q-learning algorithm to the deictic representation case with mixed results. Taking a different tact, we introduce a learning algorithm which searches deictic policy space directly, abandoning the indirect value based methods. We apply the policy learning algorithm to several different tasks in environment domains. The results compare favorably with value based learners and existing literature results.



Committee Chair

Chen, Jianhua



Included in

Robotics Commons