Question

Assignment details
DS 402, Section 001: TREND IN DATA SCI (22411--UP---P-D...

Objective

This lab is meant to get you familiar with creating and measuring probabilistic policies by conducting experiments on MDPs that model parking.

Recipe Ingredients

Add the following files to your project from Lab 1:

testsLab2.py (this has the new tests in it, and is where we will be working for the assignment portion)
ParkingLotFactory.py (this has NOT changed if you already downloaded it in Lab 1)
ParkingDefs.py (this has NOT changed if you already downloaded it in Lab 1)
agent/ProbPolicyAgents.py (this is the most important new file, and is where we will be working for the lab)

Place ProbPolicyAgents.py in the agent folder. The other files should be in the top-level directory of your code project.

Before you begin

Today's lab will not use slides, but I'll do a short code review (check the recording if you miss it or would like to review). Afterwards, follow the instructions of the pretask in test0, found in testsLab2.py, to better understand the implementation of the parking MDP. In addition, you may want to explore ParkingDefs.py and ParkingLotFactory.py to better understand the construction of a parking MDP (based on the graph of the parking MDP shown in Lab 1's presentation).

NOTE: if an agent parks in an unoccupied handicapped space, they will receive the negated parkingReward (in this case, -1000, so that is where that number is coming from).

Your Tasks

1. Test a basic probabilistic policy on two MDPs: Follow the instructions of the Task 1 section given in test1 and test2 to measure the performance of the basic probabilistic policy (OccupiedRandomAgent) on different MDPs.

A. Explore test1 in testsLab2.py to understand measuring the value of policies. You may find it helpful to change verbosity throughout this lab to control the level of output you receive.

B. Run test1 and test2 in main.py to see the results of the OccupiedRandomAgent on a simple MDP and a harder MDP, created by varying the busyRate.

C. (TURN THIS IN) Create a chart to compare the rewards obtained by the OccupiedRandomAgent on the MDP with different busyRate values (you may use any type of chart you think is appropriate, but it must show the reward resulting from each run, NOT just a grand average).

D. (TURN THIS IN) Provide a written interpretation of what you see in the chart.

2. Test multiple probabilistic policies on the same MDP: Follow the instructions of the Task 2 section given in test3 and test4 to measure two different probabilistic policies on the hard parking MDP from Task 1. In particular, we will use the OccupiedRandomNoHandicapAgent and OccupiedRandomNoHandicapLapAgent.

A. Please investigate the implementation of the aforementioned two agents in ProbPolicyAgents.py to understand the improvements we have made to the basic probabilistic policy.

B. Run test3 and test4 in main.py.

C. (TURN THIS IN) Create a chart to compare the rewards obtained by a total of THREE probabilistic policies on the hard MDP. Note that we have seen only three types of probabilistic policies so far, meaning you should measure each one OR compare different parameter settings on one agent (again, you may use any type of chart you think is appropriate, but it must show the distribution of rewards resulting from each run, NOT just a grand average).

D. (TURN THIS IN) Provide a written interpretation of what you see in the chart. Be sure to indicate why each agent seems to be performing better.

3. Create your own Probabilistic Policy: Follow the instructions in the Task 3 section in test5 to create your own probabilistic policy, then test it!

A. (TURN THIS IN) In ProbPolicyAgents.py, finish implementing the policy, adding whatever twist you like (do not share your twist with classmates). Turn in your selectAction function with your submission.

B. Run test5 in main.py to report the rewards obtained by YourAgent. Feel free to iterate on your agent to try to achieve the highest rewards you can.

C. (TURN THIS IN) Create a chart to compare the rewards obtained by YourAgent's probabilistic policy on the hard MDP (again, you may use any type of chart you think is appropriate, but it must show the distribution of rewards resulting from each run, NOT just a grand average).
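
Each (TURN THIS IN) chart must show the reward from every run rather than only a grand average, so it helps to keep the per-run rewards in a list before plotting. The sketch below illustrates that bookkeeping on a toy stand-in for the parking problem. It does not use the lab's files: run_episode, collect_rewards, summarize, park_prob, n_spaces, and step_cost are all hypothetical names and parameters invented for this illustration, and the toy reward rule is an assumption, not the lab's MDP. Only the parkingReward magnitude of 1000 is taken from the note in the assignment.

```python
import random
import statistics

# Magnitude taken from the assignment's note (negated parkingReward is -1000).
PARKING_REWARD = 1000


def run_episode(busy_rate, park_prob, rng, n_spaces=10, step_cost=1):
    """Toy stand-in for one run (hypothetical, NOT the lab's MDP).

    The agent drives past n_spaces spaces. Each space is occupied with
    probability busy_rate. At an unoccupied space the agent parks with
    probability park_prob; later spaces are assumed to be worth more.
    """
    total = 0
    for space in range(n_spaces):
        total -= step_cost  # assumed small cost for every step of driving
        occupied = rng.random() < busy_rate
        if not occupied and rng.random() < park_prob:
            return total + PARKING_REWARD * (space + 1) / n_spaces
    return total  # the agent never parked


def collect_rewards(busy_rate, park_prob, n_runs, seed):
    """Return one reward per run so the chart can show every run."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    return [run_episode(busy_rate, park_prob, rng) for _ in range(n_runs)]


def summarize(rewards):
    """Summary statistics to report alongside (not instead of) the per-run data."""
    return {
        "min": min(rewards),
        "median": statistics.median(rewards),
        "max": max(rewards),
        "mean": statistics.fmean(rewards),
    }


if __name__ == "__main__":
    # Compare an easier and a harder setting, as in Task 1.
    for busy_rate in (0.3, 0.9):
        rewards = collect_rewards(busy_rate, park_prob=0.5, n_runs=10, seed=402)
        print(f"busyRate={busy_rate}: per-run rewards = {rewards}")
        print(f"  summary: {summarize(rewards)}")
```

With the real lab code, the per-run rewards would come from the test functions in testsLab2.py instead of run_episode. Lists like these can then be passed to any plotting tool; a box plot or strip plot with one group per agent or per busyRate shows the spread of individual runs that a single average would hide.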