2 1 raw data the data used to build the tables for this assignment are
Search for question
Question
2.1 Raw data
The data used to build the tables for this assignment are based on the 2023 Walt Disney World
Marathon's Military members participation. The results are based on the PDFs in the files.
However, the data from the PDFs have been updated and are described below.
Date: 2/21/2023
Time: 5:29:41 P
Dw
Place Name
MEN-MILITARY
1 JOSEPH CRAIG
2 GARRETT GROVE
Date: 2/21/2023
Time: 5:29:41 P
DN
Place Name
File Preview
WARATHON WEDNESDA
WOMEN-MILITARY
1 ABIGAIL BAHLAU
2 JESSICA ANDERSON
3 CHRISTINE KENDZIOR
2023 Walt Disney World Marathon
Race
Num Age
30012 26
State Far
11
78 29 73
2023 Walt Disney World Marathon
Hace
Num Age
Place Gender Clock Net
Place Time Time
101 20 193
20406 33
473 41 383
11 2:48.50 2:48:48 Space Force
64 3:20:34 3:18:01 Air Force
Place Gender Clock
Place Time
Matary
Branch
35
51
94
Net
Mitary
Time Dranch
3.34.25 3.34.00 Navy
3.43.34 3.41.22 Coast Guard
3:51:31 3:49.16 Army
The raw data in the input file, wdwM-military-23.csv, has CSV data separated with a semi-
colon (;) in the following order with the corresponding data types. One participant's data will be
on each row of data in the input file.
NB: The first row of the CSV contains the name or description for every field in all the following
rows. label name
place
gender
first name
last name
bib #
age
place
gender place
clock time
net time
branch
distance
Table 2.1: Input data:Raw data field names & their data types
data type
integer
character
character
character
character
integer
integer
integer
comments
HH:MM:SS
Official start of the race aka gun time
HH:MM:SS Time that the sensor recognizes racer crossing the start line
character Army, Navy, Marines, Air Force, Space Force, Coast Guard
USA, USN, USMC, USAF, USSF, USCG
marathon, half marathon, 5K, 10K, 17.75K, etc.
character
Notes:
1. The data types are recognizable for most languages. However, there are SQL Specific data
types appropriate to the types listed above. Make sure to use the appropriate SQL data
types. This is especially true for time.
2. The data fields, branch and distance must be in another appropriately named data table.
• branch could be represented using the following acronyms: USAF, USN, USMC,
USCG, USSĘ, USA. The branch could be the primary key, and would contain the full
name in the record. (This would minimize table sizes of the user data. As a side
note there are over 30,000 participants in this race. Hint: This database may be used
in future topics and assignments.)
• distance could be represented in a table containing the acronym or nickname, the
distance in both metric and English numerical units of measure. For example, the
marathon is 26.2 miles or 42.195 kilometers. This will facilitate calculations of
minutes per mile or kilometer.
3. Make sure to create the database for this assignment to be used for your NID MySQL
account.
2.2 Tables
Given the input text file contains all the individual competitors user data, it should be organized
to optimize the overall table size and to enable quick queries and calculations. The structures for the tables should follow the following outline:
1. Competitor specific data, i.e. place, gender, first name, last name, bib #, age on race day,
overall place, gender place, clock time, net time, branch, & distance
2. A table containing the race distance name, metric & English distances.
3. A table containing the military branch acronym and full name, i.e., USN corresponding to
US Navy, USA for US Army, etc.
It is required that an Entity-Relationship Diagram be submitted as part of the assignment. (It
might be helpful to pencil a quick diagram showing the ERD and then use the MySQL EEF tools
to generate a more polished ERD.)
This database may be used in future topics and assignments.
2.2.1 Relationships
There are only two relationships required for this database, as discussed previously, they are
described below. The data fields, branch and distance must be in another appropriately named
data table.
• branch could be represented using the following acronyms: USAF, USN, USMC, USCG,
USSF, USA. The branch could be the primary key, and would contain the full name in the
record. (This would minimize table sizes.)
• distance could be represented in a table containing the acronym or nickname, the
distance in both metric and English numerical units of measure. For example, the
marathon is 26.2 miles or 42.195 kilometers. This will facilitate calculations of minutes
per mile or kilometer.
2.3 Exercises
2.3.1 Queries
The following queries need to be written.
1. The 10 fastest females in this race, based on net time.
2. The 10 fastest males in this race, based on net time.
3. The 5 fastest in each gender for each of the military branches. (If there aren't 5 females (or
males) in the USCG, then only report the ones in the database.)
4. The 10 runners, regardless of gender or branch, who had the longest wait from the
beginning based on a calculation of the difference between clock time to net time. Sorted
from longest to shortest time. 5. The top 10, regardless of gender, based on their average minute per mile. The minuter per
mile can be calculated in SQL with a bit of research.
3 Documenting the steps
Each of the five queries outline above can be run from the MySQL command line. And a
screenshot of the results will be sufficient proof of query completion.
3.1 Screenshots
1. Make sure to take a screenshot of the final results of each of the queries, from 1 thru 5.
2. Save the screenshot in a file named YourNID-QryX where QryX is the query number.
Submit the screen shots from each step identified above as part of your assignment. The
following queries need to be screenshot.
1. The 10 fastest females in this race, based on net time.
2. The 10 fastest males in this race, based on net time.
3. The 5 fastest in each gender for each of the military branches. (For example, if there aren't
5 females (or males) in the USCG, then only report the ones in the database.)
4. The 10 runners, regardless of gender or branch, who had the longest wait from the
beginning based on a calculation of the difference between clock time to net time. Sorted
from longest to shortest time.
5. The top 10, regardless of gender, based on their average minute per mile. The minuter per
mile can be calculated in SQL with a bit of research.
4 Submission instructions
You must submit this assignment in Webcourses as file uploads. Note that all submissions will
be via Webcourses.
Be sure to include the five query result screenshots and the MySQL EER screenshot of the ERD.
It is preferred to ZIP your submissions.
5 Grading
Scoring will be based on the following rubric:/n