Question

(40 points) Part 1: Q-Learning and Policy Iteration on the Frozen Lake Environment

In this part of the assignment, you will implement a basic version of Q-Learning on the Frozen Lake environment, following the tutorial provided here. The main purpose is to familiarize yourself with the OpenAI Gym library. In Part 2, we will use a more complex RL method.

Objective: Implement a Q-Learning agent to solve the Frozen Lake environment.

Tasks:

1. Familiarize yourself with the Frozen Lake environment and its dynamics.

2. Implement the Q-Learning algorithm using the tutorial as a guide.

3. Train your Q-Learning agent on the Frozen Lake environment.

4. Evaluate the performance of your agent and analyze the impact of hyperparameters on the learning process. Specifically, examine the impact of the following hyperparameters:

o alpha: learning rate

o gamma: discount factor

o epsilon: exploration rate

Test at least 3 different values for each hyperparameter and explain the effect of each hyperparameter on the learning process.

5. Implement the policy iteration algorithm and compare its performance to Q-Learning.
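
As a rough illustration of tasks 2 to 5, the sketch below runs tabular Q-Learning and policy iteration side by side. It is not the tutorial's code. To stay self-contained it assumes a hand-written, deterministic 4x4 grid in place of the Gym environment; the names `MAP`, `step`, `q_learning`, `policy_iteration`, `greedy_policy`, and `reaches_goal` are hypothetical helpers introduced only for this example.

```python
import random

# Assumption: a hand-written 4x4 map in the Frozen Lake style
# (S=start, F=frozen, H=hole, G=goal) with deterministic moves (no slipping).
MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
N = 4
N_STATES = N * N
ACTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # left, down, right, up


def step(state, action):
    """Return (next_state, reward, done) for one deterministic move."""
    row, col = divmod(state, N)
    d_row, d_col = ACTIONS[action]
    row = min(max(row + d_row, 0), N - 1)  # moves off the grid leave the agent in place
    col = min(max(col + d_col, 0), N - 1)
    tile = MAP[row][col]
    return row * N + col, float(tile == "G"), tile in "HG"


def q_learning(alpha=0.1, gamma=0.95, epsilon=0.2, episodes=2000, max_steps=100, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * 4 for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(4)  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                action = rng.choice([a for a in range(4) if q[state][a] == best])
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def policy_iteration(gamma=0.95, tol=1e-8):
    """Policy iteration using the known transition model in step()."""
    policy = [0] * N_STATES
    values = [0.0] * N_STATES
    terminal = [MAP[s // N][s % N] in "HG" for s in range(N_STATES)]

    def backup(state, action):
        nxt, reward, done = step(state, action)
        return reward if done else reward + gamma * values[nxt]

    while True:
        while True:  # policy evaluation (in-place sweeps)
            delta = 0.0
            for s in range(N_STATES):
                if not terminal[s]:
                    new = backup(s, policy[s])
                    delta = max(delta, abs(new - values[s]))
                    values[s] = new
            if delta < tol:
                break
        stable = True  # policy improvement
        for s in range(N_STATES):
            if terminal[s]:
                continue
            best = max(range(4), key=lambda a: backup(s, a))
            if backup(s, best) > backup(s, policy[s]) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, values


def greedy_policy(q):
    """Pick the highest-valued action in every state of a Q-table."""
    return [max(range(4), key=lambda a: q[s][a]) for s in range(N_STATES)]


def reaches_goal(policy, max_steps=100):
    """Follow a deterministic policy from the start state."""
    state = 0
    for _ in range(max_steps):
        state, reward, done = step(state, policy[state])
        if done:
            return reward == 1.0
    return False


if __name__ == "__main__":
    pi_policy, _ = policy_iteration()
    print("policy iteration reaches goal:", reaches_goal(pi_policy))
    for alpha in (0.05, 0.3, 0.9):  # repeat the same loop for gamma and epsilon
        q = q_learning(alpha=alpha)
        print("alpha", alpha, "reaches goal:", reaches_goal(greedy_policy(q)))
```

Policy iteration here reads the transition model directly through `step`, while Q-Learning only samples it, which is one axis along which the two methods can be compared. For the actual assignment, the hand-written `step` would be replaced by the Gym environment from the tutorial, where slippery transitions make the outcome of each move stochastic.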
