ucf cecs using python to extract data from a csv file homework assignm
Search for question
Question
UCF-CECS
Using Python to extract data from a CSV file
Homework Assignment 2 (hw02)
March 7, 2024
1 Objectives
The goal of this homework assignment is to build a database from supplied data to demonstrate
defining, building, and populating a database using many of the features in MySQL software.
2 Objectives
There are several objectives in this assignment. They are as follows:
•
Import data from a CSV file.
• Produce several reports demonstrating competence in:
-
Sorting data by one or more search criteria.
Performing calculations on the data and ranking or sorting based on those
calculations.
These objectives will be met and demonstrated in the exercises specified later in the assignment.
2.1 Raw data
The data used to build the tables for this assignment are based on the 2023 Walt Disney World
Marathon's Military members participation. The results are based on the PDFs in the files.
CIS4340-McAlpin
Hw 02
1 label name
place
Table 2.1: Input data:Raw data field names & their data types
data type
integer
comments
gender
character
first name
character
last name
character
bib #
character
age
integer
place
integer
gender place
integer
clock time
HH:MM:SS
net time
HH:MM:SS
branch
character
distance
character
Official start of the race aka gun time
Time that the sensors recognize racer crossing the start line
Army, Navy, Marines, Air Force, Space Force, Coast Guard
USA, USN, USMC, USAF, USSF, USCG
marathon, half marathon, 5K, 10K, 17.75K, etc.
However, the data from the PDFs have been updated and are described below.
2023 Walt Disney World MARATHON runDISNEY DUNNING
WEEFEND
State Farm
2023 Walt Disney World Marathon
Date: 2/21/2023
Time: 5:29:41 P
Div
Place Name
Race
Num Age
Place Gender Clock Net
Place Time Time
Military
Branch
MEN -- MILITARY
1
JOSEPH CRAIG
30012 26 11
11 2:48:50 2:48:48
2 GARRETT GROVE
78 29 73
Date: 2/21/2023
64 3:20:34 3:18:01
2023 Walt Disney World Marathon
Space Force
Air Force
Time: 5:29:41 P
Div
Place Name
Race
Num Age
Place Gender Clock
Place Time
Net
Time
Military
Branch
WOMEN -- MILITARY
1 ABIGAIL BAHLAU
2 JESSICA ANDERSON
181 20 193
20406 33 262
35
51
3:34:25 3:34:08
Navy
3:43:34 3:41:22 Coast Guard
3 CHRISTINE KENDZIOR
473 41 383
94
3:51:31 3:49:16 Army
The raw data in the input file, wdwM-military-23.csv, has CSV data separated with a semi-
colon (;) in the following order with the corresponding data types. One participant's data will be
on each row of data in the input file.
NB: The first row of the CSV contains the name or description for every field in all the following
rows.
Notes:
1. The data types are recognizable for most languages, including Python. See also Python
time math operators.
CIS4340-McAlpin
Hw 02
2 2. The data fields place; gender; first; last; bib #3age%3Bplace; gender place;clock
time; net time; branch; distance are named in the first line of the CSV file.
• It might be very useful to keep track of the parameters based on this order.
• branch should be represented using the following acronyms: USAF, USN, USMC,
USCG, USSF, USA. The branch is a text string (Army, Navy, etc. in the CSV file, and
should be output based on the acronyms shown here.
2.2 Programs
2.2.1 Reports (short Python programs)
The following reports need to be written.
1. The 10 fastest females in this race, based on net time.
2. The 10 fastest males in this race, based on net time.
3. The 5 fastest in each gender for each of the military branches. (If there aren't 5 females (or
males) in the USCG, then only report the ones in the CSV file.)
4. The 10 runners, regardless of gender or branch, who had the longest wait from the
beginning based on a calculation of the difference between clock time to net time. Sorted
from longest to shortest time.
5. The top 10, regardless of gender, based on their average minute per mile. The minuter per
mile can be calculated in Python with a bit of research. It will be necessary to figure out to
find the top 10 using Python or some other resources.
2.2.2 Report names
2.2.3 Submission instructions
You must submit this assignment in Webcourses as file uploads. Note that all submissions will
be via Webcourses.
As discussed above each of the five Python programs will be submitted via Webcourses.
Wondering how to run the program and capture the output?
- xyzpython3 top10men.py CSVfilename> top10men.txt
-
This command prompt executes the Python program using the CSVfilename as input
and redirects the output from STDOUT to the file named top10men.txt.
It is preferred to ZIP your submissions.
CIS4340-McAlpin
Hw 02
3 #
Title
1
2
top10men
top10women
3 5fastestbyBnG
4 longest10Waits
5 top10fastest
Notes:
Table 2.2: Report names
Description
Fastest 10 men based on net time
Fastest 10 women based on net time
5 fastest in each gender for each of the military branches
runners who had the longest wait to cross the start line (see #4
above)
Fastest 10, based on minutes per mile (see #5 above)
Use the Title as shown above for both the program name, i.e. #1 would be
top10men.py and the output to be redirected to the filename top10men.txt.
Both the program and the output file for each of the 5 programs/reports will be
submitted to Webcourses.
3 Grading
Scoring will be based on the following rubric:
Table 3.1: Grading Rubric
Deduction
-100
- 20
Description
No program or output submissions
Missing program or output files - each file
Start with 100 points and deduct per the schedule above
4
Hw 02
CIS4340-McAlpin