Question
Instructions for submitting the solution
1. Submit the source code file for every programming assignment, such as a .c or .java file.
2. If any dataset is used, share all the files with the solution in a single zip file.
3. Share a readme file that includes the system specification, required software, and the execution instructions.
4. Share the output of the programs. For a single program, you can submit a screenshot; if multiple files are included, share a screen recording. The screen recording needs to cover the following steps:
❖ Show the complete solution according to the instructions.
❖ Cover all the compilation steps. Show the complete output of the program/project. Show all passing test cases if given in the assignment.
5. Always add proper comments in the code. If any specific package is used, mention it in a comment.

UCF-CECS
Using Python & a screen scraper to extract data from Wikipedia
Homework Assignment 3 (hw03)
April 1, 2024

1 Objectives
The goal of this homework assignment is to develop a solution to extract (screen scrape) the SpaceX Falcon 9 Block 5 launch records from Wikipedia. Note that this Wikipedia page is being actively and substantially modified, so a stable version of the page has been supplied for this assignment. Use the supplied file to develop a solution that extracts (screen scrapes) data for the SpaceX Falcon 9/Heavy launches using the Block 5 engines. This is described in further detail below.

1.1 Specific data
Several data elements in the webpage are to be extracted in this assignment. They are as follows:
• All Block 5 engines' launch history (may vary from one to many launches):
  - Launch identified by launch number (Fx-DDD, where x is either 9 for Falcon 9 or H for Falcon Heavy, and DDD is a 3-digit decimal number)
  - Launch date in DD-Month-Year format
  - Turnaround time in days
• In this assignment it is important to note that Falcon 9 launches use a single launch booster with a Block 5 engine, and Falcon Heavy launches have three Block 5 engines.
These objectives will be met and demonstrated in the exercises specified later in the assignment.

CIS4340-McAlpin HW 03

1.2 Collected data
As discussed earlier, the Falcon 9 Wikipedia page is currently being substantially revised. The page carries the following notice:
  This page is currently being split. After a discussion, consensus to split this page into List of Falcon 9 and Falcon Heavy launches (2020-2021) was found. You can help implement the split by following the instructions at Help:Splitting and the resolution on the discussion. Process started in March 2024.
For this reason, the file Falcon9first-stageBoosters.html has been supplied for this assignment. It is available on the Webcourses assignment page.

Falcon 9 Block 5 first-stage boosters (Expended, Destroyed, or Officially Retired)
S/N | Type | Launches | Launch date (UTC) | Flight No. | Turnaround time | Payload | Launch (pad) | Landing (location) | Status
B1046 | F9 | 4 | 11 May 2018[b] | F9-054 | - | Bangabandhu-1 | Success (39A) | Success (OCISLY) | Expended
 | | | 7 August 2018 | F9-060 | 88 days | Telkom-4 Merah Putih | Success (40) | Success (OCISLY) |
 | | | 3 December 2018 | F9-064 | 118 days | SHERPA (SSO-A)[88][90] | Success (4E) | Success (JRTI) |
 | | | 19 January 2020 | F9-079 | 412 days | Dragon C205 (In-Flight Abort Test) | Success (39A) | No attempt |
B1047 | F9 | 3 | 22 July 2018 | F9-058 | - | Telstar 19V | Success (40) | Success (OCISLY) | Expended
 | | | 15 November 2018 | F9-063 | 116 days | Es'hail 2 | Success (39A) | Success (OCISLY) |
 | | | 6 August 2019 | F9-074 | 263 days | AMOS-17 | Success (40) | No attempt |
Figure 1.1: The first 7 rows of Block 5 engines' data

B1024's history in html:
B1024 | FT | 15 June 2016 | F9-026 | - | ABS-2A / Eutelsat 117 West B | Success (40) | Failure | Destroyed[40]

The row tags (<tr>) and the per-column cell tags (<td>) support configuration for the number of rows in each column. Those tags are navigable in the scraping code, and often indexable too.

S/N | Type | Launches | Launch date (UTC) | Flight No. | Turnaround time | Payload | Launch (pad) | Landing (location) | Status
B1066 | FH core | 1 | 1 November 2022 | FH-004 | - | USSF-44 | Success (39A) | No attempt | Expended
B1068 | FH core | 1 | 1 May 2023[131] | FH-006 | - | ViaSat-3 Americas[131] | Success (39A) | No attempt[132] | Expended
B1070 | FH core | 1 | 15 January 2023[158] | FH-005 | - | USSF-67 | Success (39A) | No attempt | Expended
B1074 | FH core | 1 | 29 July 2023 | FH-007 | - | Jupiter-3 (EchoStar-24) | Success (39A) | No attempt | Expended
B1079 | FH core | 1 | 13 October 2023 | FH-008 | - | Psyche | Success (39A) | No attempt | Expended
B1084 | FH core | 1 | 29 December 2023 | FH-009 | - | USSF-52 (Boeing X-37B OTV-7) | Success (39A) | No attempt | Expended
Figure 1.2: The last 6 rows of Block 5 engines' data

1.3 Programs

1.3.1 Extraction
The following data needs to be extracted using a screen scraper applied to the HTML file, Falcon9first-stageBoosters.html, which is supplied via Webcourses.
1. The Block 5 engine number.
2. The Flight number.
3. The Flight type:
   a) F9 for Falcon 9
   b) FH for Falcon Heavy
4. The launch date, in the YYYY-MM-DD format.
5. The launch pad.
6. The landing location, typically an acronym, sometimes also identified as No attempt.
7. The Turnaround time, in days.
8. The engine's status:
   a) Expended
   b) Destroyed
   c) Lost at sea
   d) Returned to service
9. The total number of launches for this engine.
These data elements should be output in the order shown above, to STDOUT. Each element should be separated by a comma, thereby building a CSV. This Python program should be named Block5Extract.py.

Wondering how to run the program and capture the output?
    xyz$ python3 Block5Extract.py > Block5.csv
* This command runs the Python program with Falcon9first-stageBoosters.html as input and redirects the output from STDOUT to the file named Block5.csv.
* Make sure Falcon9first-stageBoosters.html is in the same directory as the code.

1.3.2 Reports

Table 1.1: Report names
# | Title | Description
1 | f9only | Only Falcon 9 launches
2 | fHonly | Only Falcon Heavy launches
3 | fHpairs | The three engines used for each Falcon Heavy launch
4 | longestTurnaround | The longest turnaround for a Block 5 engine
5 | fastestTurnaround | The fastest turnaround for a Block 5 engine
6 | mostLaunches | The most number of launches for a Block 5 engine

Notes: Use the Title as shown above for both the program name and the output filename; e.g., #1 would be the program f9only.py, with its output redirected to the file f9only.txt. Both the program and the output file for each of the 6 programs/reports will be submitted to Webcourses.

1.3.3 Submission instructions
You must submit this assignment in Webcourses as file uploads. It is preferred to ZIP your submissions. The submitted programs are as follows:
1. f9only
2. fHonly
3. fHpairs
4. longestTurnaround
5. fastestTurnaround
6. mostLaunches
7. Block5Extract
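
The field list in section 1.3.1 implies a few cell-level conversions between the text shown on the page and the CSV output: dropping footnote markers such as [131], turning a date like "1 May 2023" into YYYY-MM-DD, reading the flight type from the flight number, and pulling the pad or landing location out of the parentheses. The following is a minimal sketch of those conversions only, assuming the cell text has already been pulled out of the HTML table as plain strings. The helper names (clean, iso_date, flight_type, turnaround_days, site, csv_line) are hypothetical, and emitting an empty field when a launch has no turnaround is an assumption that the assignment does not specify. It is not a complete Block5Extract.py.

```python
import csv
import io
import re
from datetime import datetime

# Footnote markers such as [131] or [b] that appear after cell text.
FOOTNOTE = re.compile(r"\[[^\]]*\]")
# Flight numbers follow Fx-DDD: F9 or FH, a hyphen, then three digits.
FLIGHT_NO = re.compile(r"(F9|FH)-\d{3}")


def clean(text):
    """Remove footnote markers and surrounding whitespace."""
    return FOOTNOTE.sub("", text).strip()


def iso_date(text):
    """Convert a 'D Month YYYY' date to 'YYYY-MM-DD'."""
    return datetime.strptime(clean(text), "%d %B %Y").strftime("%Y-%m-%d")


def flight_type(flight_no):
    """Return 'F9' or 'FH' from a flight number such as 'FH-006'."""
    match = FLIGHT_NO.fullmatch(clean(flight_no))
    if match is None:
        raise ValueError(f"unexpected flight number: {flight_no!r}")
    return match.group(1)


def turnaround_days(text):
    """Return the day count from '118 days', or None when no number is present."""
    match = re.search(r"\d+", clean(text))
    return int(match.group()) if match else None


def site(text):
    """Return the text in parentheses ('Success (39A)' gives '39A'),
    or the cleaned cell itself when there are none ('No attempt')."""
    text = clean(text)
    match = re.search(r"\(([^)]+)\)", text)
    return match.group(1) if match else text


def csv_line(engine, flight_no, date, launch, landing, turnaround, status, total):
    """Build one CSV line in the field order listed in section 1.3.1."""
    days = turnaround_days(turnaround)
    fields = [
        clean(engine), clean(flight_no), flight_type(flight_no), iso_date(date),
        site(launch), site(landing),
        "" if days is None else days,  # assumption: empty field when no turnaround
        clean(status), total,
    ]
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue().rstrip("\n")


if __name__ == "__main__":
    # Sample cell values taken from the B1068 row of Figure 1.2.
    print(csv_line("B1068", "FH-006", "1 May 2023[131]", "Success (39A)",
                   "No attempt[132]", "-", "Expended", 1))
```

Run on the B1068 sample values, the sketch prints B1068,FH-006,FH,2023-05-01,39A,No attempt,,Expended,1. Keeping each conversion in its own small function means the six report programs could reuse the same parsing if they read the table directly, although reading Block5.csv instead may be simpler.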