2021-03-25 Seminar Report

On 24th March, Dr. Akinori Tanaka gave an introduction to reinforcement learning (RL) at the journal club of the Information Theory Study Group. He started with simple examples, a maze and a chess game, to introduce the fundamental variables (i.e., states, actions, and rewards) and their evolution as a Markov decision process. After explaining that the goal of RL is to maximize the value function, he discussed the policy improvement theorem and its application to the epsilon-greedy update. We thank Akinori for the great and clear talk!

