01/02/2024, 13:16
Assignment 0 - CIS 6930 Spring 2024
In this assignment, you will practice extracting data from an online source and
reformatting it. Use your knowledge of Python3, SQL, regular expressions,
and the Linux command line tools to extract information from a PDF file on the
web.
The Norman, Oklahoma police department regularly reports incidents, arrests, and
other activities. This data is hosted on their website. This data is distributed to the
public in the form of PDF files.
The website contains three types of summaries: arrests, incidents, and case
summaries. Your assignment is to build a function that collects only the incidents.
To do so, you need to write Python (3) function(s) to do each of the following:
● Download the data given one incident pdf
● Extract the fields:
    ○ Location
    ○ Date/Time
    ○ Incident Number
    ○ Nature
    ○ Incident ORI
● Create a SQLite database to store the data
● Insert the data into the database
● Print each nature and the number of times the nature appears
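The storage and summary steps in the list above can be sketched with Python's built-in sqlite3 module. This is a minimal sketch, not the required implementation: the table name `incidents`, the column names, the helper name `count_natures`, and the sample rows are all assumptions made for illustration, and an in-memory database stands in for the real database file. The pipe-separated print format is also an assumption.

```python
import sqlite3

# Hypothetical sample rows: (date/time, incident number, location, nature, ORI).
SAMPLE_ROWS = [
    ("1/1/2024 0:04", "2024-00000001", "123 EXAMPLE ST", "Traffic Stop", "OK0000000"),
    ("1/1/2024 0:10", "2024-00000002", "456 SAMPLE AVE", "Welfare Check", "OK0000000"),
    ("1/1/2024 0:15", "2024-00000003", "789 DEMO RD", "Traffic Stop", "OK0000000"),
]


def count_natures(rows):
    """Store rows in an in-memory SQLite table and count each nature."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: one TEXT column per extracted field.
    conn.execute(
        "CREATE TABLE incidents ("
        "incident_time TEXT, incident_number TEXT, "
        "incident_location TEXT, nature TEXT, incident_ori TEXT)"
    )
    # Parameterized inserts avoid building SQL strings by hand.
    conn.executemany("INSERT INTO incidents VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    counts = conn.execute(
        "SELECT nature, COUNT(*) FROM incidents "
        "GROUP BY nature ORDER BY nature"
    ).fetchall()
    conn.close()
    return counts


if __name__ == "__main__":
    for nature, count in count_natures(SAMPLE_ROWS):
        print(f"{nature}|{count}")
```

Letting SQL do the grouping keeps the counting logic in one query instead of a Python loop over every stored row.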
Below we describe the assignment structure and each required function. Please
read through this whole document before starting!
README.md
The README file name should be all uppercase with the .md extension. In it, write
your name and an example of how to run the code, include a demo (gif/video)
demonstrating the execution, and note any bugs that should be expected. You
should describe all functions and your approach to developing the database. The
README file should contain a list of any bugs or assumptions made while writing
the program. You should include directions on how to install and use the Python
package. We know your code will not be perfect, so be sure to include any
assumptions you make for your solution. Note: You should not be copying code
from any website not provided by the instructor.
https://ufdatastudio.com/cis6930sp24/assignments/0
Below is an example template:
# cis6930sp24 Assignment0 Template
Name:
# How to install
pipenv install
# Assignment Description (in your own words)
## How to run
pipenv run ...
![video](video)
## Functions
#### main.py \
extractincidents()
this function has these parameters, does t
...other functions
...
## Database Development
## Bugs and Assumptions
COLLABORATORS.md file
This file should contain a pipe-separated list describing who you worked with and a
short description of the nature of the collaboration. If you visited a
website for inspiration, include the website. This information should be listed in
three fields, as in the example below:
Katherine Johnson | kj@nasa.gov | Helped me understand calculati
Dorothy Vaughan | doro@dod.gov | Helped me with multiplexed time
Stackoverflow | https://example | helped me with a compilation
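One way to check that every line has the three expected fields is to split it on the pipe character. This is a minimal sketch and not part of the assignment requirements; the function name `parse_collaborators` and the sample entry are hypothetical.

```python
def parse_collaborators(text):
    """Split each non-empty line into (name, contact, description)."""
    entries = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            raise ValueError(
                f"line {line_number}: expected 3 fields, got {len(fields)}"
            )
        entries.append(tuple(fields))
    return entries


# Hypothetical sample content, for illustration only.
sample = "Jane Doe | jane@example.com | Discussed regular expressions\n"
print(parse_collaborators(sample))
```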
The collaborator file is mainly used to ensure that code similarities are coincidental.
Be sure to abide by the academic integrity guidelines outlined in the syllabus.
Generative AI tools should not be used for this assignment.
Assignment Description
Create a private repository called cis6930sp24-assignment0. Add collaborators
cegme and wyfunique by going to Settings > Collaborators and teams >
Add people. Your code structure should be in a directory with the following
format:
cis6930sp24-assignment0/
    COLLABORATORS.md
    LICENSE
    Pipfile
    README.md
    assignment0
        main.py
    docs
    resources
    setup.cfg
    setup.py
    tests
        test_download.py
        test_random.py
Feel free to combine or optimize functions as long as your code preserves the
behavior of main.py. You may have more or fewer files in your directory as needed.
You may have several additional tests and modules in your code.
Create a Python package
setup.py / setup.cfg
from setuptools import setup, find_packages

setup(
    name='assignment0',
    version='1.0',
    author='Your Name',
    author_email='your ufl email',
    packages=find_packages(exclude=('tests', 'docs', 'resour
    setup_requires=['pytest-runner'],
    tests_require=['pytest']
)
Note, the setup.cfg file should have at least the following text inside:
[aliases]
test=pytest

[tool:pytest]
norecursedirs = CVS, _darcs, {arch}, *.egg, venv
main.py
Here is an example main.py file; we expect that yours will be different. This snippet
shows an outline of the expected functionality. Calling the main function should
download data, insert it into a database, and print a summary of the incidents. Your
code will likely differ significantly. You may have more or fewer individual steps.
Below is an outline; it shows how to use argparse to pass parameters to the code.
You will also need to ensure that your code works with Linux-based systems. We will
use the pipenv environment to run your code.
# -*- coding: utf-8 -*-
# Example main.py
import argparse

import assignment0

def main(url):
    # Download data
    incident_data = assignment0.fetchincidents(url)

    # Extract data
    incidents = assignment0.extractincidents(incident_data)

    # Create new database
    db = assignment0.createdb()

    # Insert data
    assignment0.populatedb(db, incidents)

    # Print incident counts
    assignment0.status(db)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("--incidents", type=str, required=True,
                        help="Incident summary url.")

    args = parser.parse_args()
    if args.incidents:
        main(args.incidents)
Your code should take a URL from the command line and perform each operation.
After the code is installed, you should be able to run the code using the command
below. We will use Pipfile to manage the package installation (more on Pipfiles
here).
pipenv run python assignment0/main.py --incidents <url>
Each run of the above command should create a new normandb database file. You
can add other command line parameters to test each operation, but the
--incidents <url> flag is required.
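Creating a new database file on every run can be sketched as follows. This is an illustration under stated assumptions: the function name `createdb` follows the main.py outline, but the `db_path` parameter, the table name, and the column names are hypothetical choices rather than requirements.

```python
import os
import sqlite3


def createdb(db_path):
    """Delete any existing database file and create an empty incidents table."""
    # Removing the old file guarantees each run starts from a fresh database.
    if os.path.exists(db_path):
        os.remove(db_path)
    conn = sqlite3.connect(db_path)
    # Assumed schema: one TEXT column per extracted field.
    conn.execute(
        "CREATE TABLE incidents ("
        "incident_time TEXT, incident_number TEXT, "
        "incident_location TEXT, nature TEXT, incident_ori TEXT)"
    )
    conn.commit()
    return conn
```

Deleting the file is one simple way to avoid carrying rows over from a previous run; dropping and recreating the table inside an existing file would be an alternative.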
Download Data
Below is a discussion of each interface. Note, the function names are suggestions
and may be changed to suit your program.
The function fetchincidents(url) takes a URL string and uses the Python
urllib.request library to grab one incident pdf from the Norman Police Report
webpage.
Below is an example snippet to grab an incident pdf document from the
URL.
import urllib.request

url = ("https://www.normanok.gov/sites/default/files/documents/"
       "2024-01/2024-01-01_daily_incident_summary.pdf")
headers = {}
headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686) AppleWebK
data = urllib.request.urlopen(urllib.request.Request(url, header
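Wrapping the download step in a function might look like the sketch below. The function name `fetchincidents` comes from the text; the shortened User-Agent value, the `timeout` parameter, and returning the raw bytes are assumptions made for illustration.

```python
import urllib.request


def fetchincidents(url, timeout=30):
    """Download the document at `url` and return its raw bytes."""
    # Some servers reject requests without a browser-like User-Agent;
    # this shortened value is a placeholder, not the one from the handout.
    headers = {"User-Agent": "Mozilla/5.0"}
    request = urllib.request.Request(url, headers=headers)
    # The context manager closes the connection once the body is read.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

The returned bytes can be wrapped in `io.BytesIO` if the PDF reader you choose expects a file-like object.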