Assignment details
DS 402, Section 001: TREND IN DATA SCI (22411--UP---P-D...
Objective
This lab is meant to get you familiar with
creating and measuring Probabilistic Policies by
conducting experiments on MDPs that model
parking.
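To make the objective concrete, here is a minimal sketch of a probabilistic policy being measured over repeated runs. It does not use the lab's ParkingLotFactory.py or ProbPolicyAgents.py code: the toy single-row lot, the names run_episode and measure_policy, and every reward number are assumptions chosen only for illustration.

```python
import random

# Hypothetical toy model, NOT the lab's parking MDP or agent API.
PARK, DRIVE = "PARK", "DRIVE"


def run_episode(park_prob, busy_rate, n_spots=10, parking_reward=1000,
                drive_cost=-1, rng=random):
    """Drive past n_spots once and return the total reward of the run."""
    total = 0
    for _ in range(n_spots):
        # Each spot is occupied with probability busy_rate.
        occupied = rng.random() < busy_rate
        # Probabilistic policy: the action is sampled rather than fixed.
        # An occupied spot always leads to DRIVE in this toy model.
        if not occupied and rng.random() < park_prob:
            action = PARK
        else:
            action = DRIVE
        if action == PARK:
            return total + parking_reward
        total += drive_cost
    return total  # the agent never parked


def measure_policy(park_prob, busy_rate, n_runs=100, seed=0):
    """Return the reward of every run, not just their average."""
    rng = random.Random(seed)
    return [run_episode(park_prob, busy_rate, rng=rng) for _ in range(n_runs)]
```

Keeping the full list of per-run rewards, rather than a single mean, is what makes it possible to chart how much the outcome varies from run to run.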
Recipe Ingredients
Add the following files to your project from Lab 1:
testsLab2.py (this has the new tests in it, and
is where we will be working for the
assignment portion)
ParkingLotFactory.py (this has NOT changed
if you already downloaded it in Lab 1)
ParkingDefs.py (this has NOT changed if you
already downloaded it in Lab 1)
agent/ProbPolicyAgents.py (This is the most
important new file, and is where we will be
working for the lab)
Place ProbPolicyAgents.py in the agent
folder. The other files should be in the top-level
directory of your code project.
Before you begin
Today's lab will not use slides, but I'll do a short code
review (check the recording if you miss it or would
like to review). Afterwards, follow the instructions
of the pretask in test0, found in testsLab2.py, to
better understand the implementation of the
parking MDP. In addition, you may want to explore
ParkingDefs.py and ParkingLotFactory.py to better
understand the construction of a parking MDP
(based on the graph of the parking MDP shown in
Lab 1's presentation).
NOTE: If an agent parks in an unoccupied
handicapped space, it will receive the negated
parkingReward (in this case -1000, which is where
that number comes from).
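The arithmetic in the note can be sketched as follows. This is an illustration only, not the lab's implementation: the function name is hypothetical, and it assumes that parking in an unoccupied regular space yields parkingReward unchanged, which the note implies but does not state.

```python
def unoccupied_parking_reward(parking_reward, is_handicapped):
    """Reward for parking in an UNOCCUPIED space (hypothetical helper).

    Assumption: a regular space pays parking_reward as-is; per the note,
    a handicapped space pays the negated value.
    """
    return -parking_reward if is_handicapped else parking_reward


# With parkingReward = 1000, as in the note:
handicapped_result = unoccupied_parking_reward(1000, is_handicapped=True)
regular_result = unoccupied_parking_reward(1000, is_handicapped=False)
```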
Your Tasks
1. Test a basic probabilistic policy on two MDPs:
Follow the instructions of the Task 1 section
given in test1 and test2 to measure the
performance of the basic probabilistic policy
(OccupiedRandomAgent) on different MDPs.
A. Explore test1 in testsLab2.py to
understand how the value of a policy
is measured. You may find it helpful to
change verbosity throughout this lab to
control the level of output you receive.
B. Run test1 and test2 in main.py to see
the results of the OccupiedRandomAgent
on a simple MDP and a harder MDP,
created by varying the busyRate.
C. (TURN THIS IN) Create a chart to
compare the rewards obtained by the
OccupiedRandomAgent on the MDP
with different busyRate values (you may use
any type of chart you think is
appropriate, but it must show the
reward resulting from each run, NOT
just a grand average).
D. (TURN THIS IN) Provide a written
interpretation of what you see in the
chart.
2. Test multiple probabilistic policies on the
same MDP: Follow the instructions of the Task
2 section given in test3 and test4 to measure
two different probabilistic policies on the hard
parking MDP from Task 1. In particular, we
will use the
OccupiedRandomNoHandicapAgent and
OccupiedRandomNoHandicapLapAgent.
A. Investigate the implementation
of these two agents in
ProbPolicyAgents.py to understand the
improvements we have made to the
basic probabilistic policy.
B. Run test3 and test4 in main.py.
C. (TURN THIS IN) Create a chart to
compare the rewards obtained by a total
of THREE probabilistic policies on the
hard MDP. Note that we have seen only
three types of probabilistic policies so
far, meaning you should measure each
one OR compare different parameter
settings on one agent. (Again, you may
use any type of chart you think is
appropriate, but it must show the
distribution of rewards resulting from
each run, NOT just a grand average.)
D. (TURN THIS IN) Provide a written
interpretation of what you see in the
chart. Be sure to indicate why each
agent seems to be performing better.
3. Create your own Probabilistic Policy: Follow
the instructions in the Task 3 section in test5
to create your own probabilistic policy, then
test it!
A. (TURN THIS IN) In
ProbPolicyAgents.py, finish
implementing the policy, adding
whatever twist you like (do not share
your twist with classmates). Turn in your
selectAction function with your
submission.
B. Run test5 in main.py to report the
rewards obtained by YourAgent. Feel
free to iterate on your agent to try to
achieve the highest rewards you can.
C. (TURN THIS IN) Create a chart to
compare the rewards obtained by
YourAgent's probabilistic policy on the
hard MDP (again, you may use any type
of chart you think is appropriate, but it
must show the distribution of rewards
resulting from each run, NOT just a
grand average).
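The charts requested in the tasks above must show the reward of every run rather than a single average. One way to do that, sketched here under assumptions, is a box plot with each run overlaid as a point. The helper names and the input format (a dict mapping a label to its list of per-run rewards) are hypothetical and are not part of the lab code. matplotlib is imported inside the plotting function so the summary helper works without it.

```python
import statistics


def summarize_runs(rewards_by_label):
    """Per-label spread statistics, so the mean is never shown alone."""
    summary = {}
    for label, rewards in rewards_by_label.items():
        summary[label] = {
            "n": len(rewards),
            "min": min(rewards),
            "median": statistics.median(rewards),
            "max": max(rewards),
            "mean": statistics.mean(rewards),
        }
    return summary


def plot_reward_distributions(rewards_by_label, path="rewards.png"):
    """Save a box plot with every individual run drawn as a dot."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    labels = list(rewards_by_label)
    fig, ax = plt.subplots()
    ax.boxplot([rewards_by_label[label] for label in labels])
    ax.set_xticks(range(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    # Overlay each run so outliers and clusters stay visible.
    for position, label in enumerate(labels, start=1):
        rewards = rewards_by_label[label]
        ax.scatter([position] * len(rewards), rewards, alpha=0.4, s=12)
    ax.set_xlabel("Agent or setting")
    ax.set_ylabel("Total reward per run")
    fig.savefig(path)
    plt.close(fig)
    return path
```

To use it, collect the per-run rewards printed by each test into lists, pass them in as a dict such as `{"label A": [...], "label B": [...]}`, and call `plot_reward_distributions` on that dict. The numbers returned by `summarize_runs` can support the written interpretation.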