Question

01/02/2024, 13:16
Assignment 0 - CIS 6930 Spring 2024
https://ufdatastudio.com/cis6930sp24/assignments/0

This assignment is practice in extracting data from an online source and reformatting it. Use your knowledge of Python 3, SQL, regular expressions, and the Linux command line tools to extract information from a PDF file on the web.

The Norman, Oklahoma police department regularly reports incidents, arrests, and other activities. This data is hosted on their website and distributed to the public in the form of PDF files. The website contains three types of summaries: arrests, incidents, and case summaries. Your assignment is to build a function that collects only the incidents. To do so, you need to write Python 3 function(s) to do each of the following:

● Download the data given one incident PDF
● Extract the fields:
    ○ Location
    ○ Date/Time
    ○ Incident Number
    ○ Nature
    ○ Incident ORI
● Create a SQLite database to store the data
● Insert the data into the database
● Print each nature and the number of times the nature appears

Below we describe the assignment structure and each required function. Please read through this whole document before starting!

README.md

The README file name should be all uppercase with the .md extension. In it, write your name, give an example of how to run the program, include a demo (gif/video) demonstrating the execution, and list any bugs that should be expected. You should describe all functions and your approach to developing the database. The README file should contain a list of any bugs or assumptions made while writing the program. You should include directions on how to install and use the Python package. We know your code will not be perfect, so be sure to include any assumptions you make for your solution.

Note: You should not copy code from any website not provided by the instructor.
Below is an example template:

    # cis6930sp24 Assignment0 Template

    Name:

    # How to install
    pipenv install

    # Assignment Description (in your own words)

    ## How to run
    pipenv run ...
    ![video](video)

    ## Functions
    #### main.py \
    extractincidents() ...other functions ... this function has these parameters, does t...

    ## Database Development

    ## Bugs and Assumptions

COLLABORATORS.md file

This file should contain a pipe-separated list describing who you worked with and a short text description of the nature of the collaboration. If you visited a website for inspiration, include the website. This information should be listed in three fields, as in the example below:

    Katherine Johnson | kj@nasa.gov | Helped me understand calculati...
    Dorothy Vaughan | doro@dod.gov | Helped me with multiplexed time
    Stackoverflow | https://example | helped me with a compilation

The collaborator file is mainly used to ensure that code similarities are coincidental. Be sure to abide by the academic integrity guidelines outlined in the syllabus. Generative AI tools should not be used for this assignment.

Assignment Description

Create a private repository called cis6930sp24-assignment0. Add collaborators cegme and wyfunique by going to Settings > Collaborators and teams > Add people.

Feel free to combine or optimize functions as long as your code preserves the behavior of main.py. You may have more or fewer files in your directory as needed. You may have several additional tests and modules in your code.

Your code structure should be in a directory with the following format:
    cis6930sp24-assignment0/
        COLLABORATORS.md
        LICENSE
        Pipfile
        README.md
        assignment0/
            main.py
        docs/
        resources/
        setup.cfg
        setup.py
        tests/
            test_download.py
            test_random.py

Create a Python package

setup.py / setup.cfg

    from setuptools import setup, find_packages

    setup(
        name='assignment0',
        version='1.0',
        author='You Name',
        author_email='your ufl email',
        packages=find_packages(exclude=('tests', 'docs', 'resour...
        setup_requires=['pytest-runner'],
        tests_require=['pytest']
    )

Note, the setup.cfg file should have at least the following text inside:

    [aliases]
    test=pytest

    [tool:pytest]
    norecursedirs = CVS, _darcs, {arch}, *.egg, venv

main.py

Here is an example main.py file; we expect that yours will be different. This snippet shows an outline of the expected functionality. Calling the main function should download data, insert it into a database, and print a summary of the incidents. Your code will likely differ significantly. You may have more or fewer individual steps. Below is an outline; it shows how to use argparse to pass parameters to the code. You will also need to ensure that your code works on Linux-based systems. We will use the pipenv environment to run your code.

    # -*- coding: utf-8 -*-
    # Example main.py
    import argparse

    import assignment0

    def main(url):
        # Download data
        incident_data = assignment0.fetchincidents(url)

        # Extract data
        incidents = assignment0.extractincidents(incident_data)

        # Create new database
        db = assignment0.createdb()

        # Insert data
        assignment0.populatedb(db, incidents)

        # Print incident counts
        assignment0.status(db)

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument("--incidents", type=str, required=True,
                            help="Incident summary url.")

        args = parser.parse_args()
        if args.incidents:
            main(args.incidents)

Your code should take a URL from the command line and perform each operation. After the code is installed, you should be able to run the code using the command below. We will use Pipfile to manage the package installation.

    pipenv run python assignment0/main.py --incidents <url>

Each run of the above command should create a new normandb database file. You can add other command line parameters to test each operation, but the --incidents <url> flag is required.

Download Data

Below is a discussion of each interface. Note, the function names are suggestions and should be changed to suit your program.

The function fetchincidents(url) takes a URL string and uses the Python urllib.request library to grab one incident PDF from the Norman Police Report Webpage.

Below is an example snippet to grab an incident PDF document from the URL.

    import urllib.request

    url = ("https://www.normanok.gov/sites/default/files/documents/"
           "2024-01/2024-01-01_daily_incident_summary.pdf")
    headers = {}
    headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686) Applewebk...
    data = urllib.request.urlopen(urllib.request.Request(url, header...
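
A minimal sketch of what fetchincidents(url) could look like, following the urllib.request approach described above. The User-Agent value, the build_request helper, and the timeout parameter are assumptions made for illustration; they are not part of the handout, and your own function may be organized differently.

```python
import urllib.request

# Assumption: a short browser-like User-Agent string chosen for illustration.
USER_AGENT = "Mozilla/5.0 (X11; Linux i686)"


def build_request(url):
    """Return a urllib Request for `url` that carries a User-Agent header.

    Hypothetical helper: splitting this out keeps the header logic
    testable without touching the network.
    """
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetchincidents(url, timeout=30):
    """Download one incident PDF and return its raw bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Calling fetchincidents(url) with an incident summary URL should return the PDF as a bytes object, which can then be handed to whatever extraction step you write (for example by wrapping it in io.BytesIO).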