Question

Assignment details
DS 402, Section 001: TREND IN DATA SCI (22411--UP---P-D...

Your Tasks

1. Task 1: Inspect DQN critical states on the toy MDP
   A. Run lab5test1 to observe output from this agent's training and critical state identification.
   B. (TURN THIS IN) Find a "critical" state, according to the agent. Do you agree that it is critical? Why or why not?
   C. (TURN THIS IN) Find a "non-critical" state, according to the agent. Do you agree that it is non-critical? Why or why not?
   D. (TURN THIS IN) Are the two criticality metrics meaningfully different on this MDP? Why or why not?

2. Task 2: Make the Q-learning agent capable of computing critical states (starting with the toy MDP)
   A. lab5test2 will not run until you have written two functions. Do so now, using the implementations found in the DQN agent as a reference.
   B. (TURN THIS IN, your code) Write determine_criticalities_huang() for the Q-learning agent. It should return a list of tuples (state, criticality) for each state of the MDP. The criticality computation found in Huang et al. uses max-average.
   C. (TURN THIS IN, your code) Write determine_criticalities_amir() for the Q-learning agent. It should also return a list of tuples (state, criticality) for each state of the MDP. The criticality computation found in Amir+Amir (HIGHLIGHTS paper, 2018) uses max-min.
   D. (TURN THIS IN) Are the two criticality metrics meaningfully different on this MDP? Why or why not?
   E. (TURN THIS IN) Compare the output with that from Task 1. Do the two agents produce meaningfully different critical states? Why or why not?

3. Task 3: DQN criticality on the parking MDP
   A. Run lab5test3 to observe a training session on a parking MDP.
   B. (TURN THIS IN) Are the two criticality metrics meaningfully different on this MDP? Why or why not?

4. Task 4: Q-learning criticality on the parking MDP
   A. Run lab5test4 to observe a training session on a parking MDP. (You will need to copy over the printCriticalities function from the DQN agent to the QLearning agent, apologies.)
   B. (TURN THIS IN) Are the two criticality metrics meaningfully different on this MDP? Why or why not?
   C. (TURN THIS IN) Compare the output with that from Task 3. Do the two agents produce meaningfully different critical states? Why or why not?

5. Task 5: Criticality for both agents on a random MDP
   A. Run lab5test5 to observe two training sessions for the DQN and Q-learning agents on a random MDP.
   B. (TURN THIS IN) Are the two criticality metrics meaningfully different on this MDP? Why or why not?
   C. (TURN THIS IN) Do the two agents produce meaningfully different critical states? Why or why not?

6. Task 6: Testing with criticality
   A. Run lab5test6 to load pickle files from a trained agent and see their state criticalities. If you cannot get output from loading the pickle files, here is mine.
   B. (TURN THIS IN) Your job is to determine which of the agents are: undertrained (there are 2x, undertrained and more undertrained), trained (there is 1x), and mutated (there are 3x, high, medium, and low). Indicate which color you think is which type of agent using the state criticalities.
   C. (TURN THIS IN) Provide justification for each label you assign to each color.
   D. (TURN THIS IN) If you create any figures/summaries/etc. to answer the previous task, please turn those in as well.

Submit
A file that is readable (pdf, docx, etc.) containing your writing, criticality functions, and any charts you made for Task 6.
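
For reference, the two criticality metrics named in Task 2 can be sketched as follows. This is only a minimal illustration, not the course's DQN implementation: it assumes the Q-values are available as a plain dictionary mapping each state to a list of per-action Q-values (the `q_table` argument and the `toy_q` data are hypothetical), whereas the real agent will likely store them differently and expose these as methods. Following the task text, "max-average" is read here as the best action's Q-value minus the mean Q-value, and "max-min" as the best minus the worst.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

# Hypothetical layout: state -> Q-value of every action available in that state.
QTable = Dict[Hashable, Sequence[float]]


def determine_criticalities_huang(q_table: QTable) -> List[Tuple[Hashable, float]]:
    """Max-average criticality: best Q-value minus the mean Q-value."""
    result = []
    for state, q_values in q_table.items():
        average = sum(q_values) / len(q_values)
        result.append((state, max(q_values) - average))
    return result


def determine_criticalities_amir(q_table: QTable) -> List[Tuple[Hashable, float]]:
    """Max-min criticality: best Q-value minus the worst Q-value."""
    return [(state, max(q_values) - min(q_values))
            for state, q_values in q_table.items()]


# Made-up Q-values for illustration only (not output from any lab script).
toy_q = {
    "s0": [1.0, 1.0, 1.0],     # every action is equally good: not critical
    "s1": [10.0, 0.0, -4.0],   # one action is far better: critical
}

print(determine_criticalities_huang(toy_q))
print(determine_criticalities_amir(toy_q))
```

Both metrics are zero when all actions in a state look equally good. They differ in scale and in how they treat the spread of the non-best actions: max-min reacts only to the single worst action, while max-average reacts to all of them. That difference is what the "meaningfully different" questions ask you to examine on each MDP.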