project-id | 00158 | ||||||
---|---|---|---|---|---|---|---|
Project title | Domain Informed Oracle for Reinforcement Learning | ||||||
Abstract | Reinforcement learning (RL) is a powerful AI method that does not require pre-gathered data but relies on a trial-and-error process for the agent to learn. This is made possible through a reward function that associates each state configuration with a numerical value. The agent’s goal is then to maximize its cumulative reward over its lifetime. Unfortunately, there is no systematic method for designing a reward function. It must be designed case by case, which can be hard depending on how the states are represented. States are typically represented as a vector of values in RL, and translating properties and rules from a domain into this representation can be complicated depending on how many values are used, what they represent, whether they are normalized, etc. We propose a Domain Informed Oracle (DIO) as a solution for systematically incorporating domain-specific knowledge into RL reward functions. DIO is a collection of domain-specific rules written in a declarative language, such as Prolog. It does not rely on the RL representation of states, allowing the programmer to focus on the domain-specific knowledge using an expressive and intuitive language, where they can define states and rules in the most convenient way. DIO provides an informed decision to the reward function, thus allowing it to dynamically adapt the rewards. Our implementation is tested on a Traffic Simulator scenario and compared to a basic uninformed RL algorithm. The comparison is based on performance, which we define by three metrics: time to train, optimality of the learned policy, and number of error states reached. | ||||||
Primary contact name | Samar Rahmouni | ||||||
Primary contact mobile phone | 50523308 | ||||||
Students/participant(s) programs |
| ||||||
Faculty advisor(s) |
| ||||||
For CMU-Q advisor(s), please select their program(s) |
| ||||||
Box folder | box.com |
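
The idea in the abstract, that rules over a domain-level state can inform the reward, can be illustrated with a minimal sketch. This is not the project's implementation: the abstract describes rules written in a declarative language such as Prolog, whereas this sketch uses plain Python predicates as a stand-in. The fact names (`north_south_light`, `east_west_light`, `max_queue_length`), the two rules, and the penalty values are all hypothetical, chosen only to resemble a traffic scenario.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Domain-level (symbolic) state: named facts, independent of the numeric
# vector the RL agent observes. All fact names here are hypothetical.
SymbolicState = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    name: str
    violated: Callable[[SymbolicState], bool]  # True if the state breaks the rule
    penalty: float                             # assumed penalty, not from the source


# Hypothetical traffic rules standing in for a declarative rule base.
RULES: List[Rule] = [
    Rule(
        "no_conflicting_greens",
        lambda s: s["north_south_light"] == "green"
        and s["east_west_light"] == "green",
        penalty=10.0,
    ),
    Rule(
        "no_long_queue",
        lambda s: s["max_queue_length"] > 20,
        penalty=2.0,
    ),
]


def oracle(state: SymbolicState, rules: List[Rule] = RULES) -> List[Rule]:
    """Return the rules that the given symbolic state violates."""
    return [r for r in rules if r.violated(state)]


def informed_reward(base_reward: float, state: SymbolicState) -> float:
    """Subtract the penalty of every violated rule from the base reward."""
    return base_reward - sum(r.penalty for r in oracle(state))


safe = {"north_south_light": "green", "east_west_light": "red",
        "max_queue_length": 5}
unsafe = {"north_south_light": "green", "east_west_light": "green",
          "max_queue_length": 25}
```

In this sketch, `informed_reward(1.0, safe)` leaves the base reward unchanged, while `informed_reward(1.0, unsafe)` is reduced by both penalties. Keeping the rules separate from the vector encoding means a rule can be added or changed without touching how the agent's observations are laid out.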