Question

Assignment #1

This assignment requires you to analyze data about electricity demand and production in 28 European countries from 2000 to 2020. The data for this assignment is contained in the Excel file "Europe_Power_Gen". There are four variables:

● Year: The year the electricity was generated or used
● Country: Country name (28 values)
● Category: The demand for electricity, or the source/method used to generate electricity:
  o Demand - Total electricity demand for a given country and year
  o Bioenergy - Derived from recently living organic materials (renewable energy source)
  o Coal - Carbon-based sedimentary rock burned to produce power (fossil fuel source)
  o Gas - Natural gas burned to produce power (fossil fuel)
  o Hydro - Fast moving water spins turbine blades to generate power (renewable)
  o Nuclear - Use of nuclear reactions to produce electricity (some people consider this "green")
  o Other fossil - Fossil fuels other than coal or gas
  o Other renewables - Renewable power sources other than bioenergy, hydro, solar, or wind
  o Solar - Manufactured cells transform sunlight into electricity (renewable)
  o Wind - Wind spins turbine blades to generate power (renewable)
● Generation (TWh): Amount of electricity generated or used (in terawatt hours)

Demand for a given country and year is typically met by the sources/methods listed for that country in that year. That is, the value for Demand (listed in the Generation column) equals the sum of the values for the nine listed sources/methods (Demand = Bioenergy + Coal + ... + Wind). If electricity demand for a given country and year was not met by the sources listed, the country imported the remaining power. If a country produced more than it needed, it exported the rest. A negative number for generation means that the country exported that amount of electricity from that source/method.

Do not make any modifications to the Excel file. Perform any calculations, filtering, etc. in Tableau.

Assignment Instructions

1. Use Tableau Desktop to create a visualization that shows the viewer how European electricity demand and/or production has changed over the years 2000 to 2020. Determine an interesting message or "story" contained in the data. That is, what observations and/or information are most important to show the reader? Determining what is most important is ultimately up to you, but it should be something that you consider significant, surprising, and/or unexpected. Perform your own analysis and create a visualization that conveys your findings effectively. Create the visualization for a general audience, but one that understands the variables and their values.

2. Create a static visualization: either a choropleth map or a line chart. You do not have to use all of the countries, but you must use at least five of them. You may use a calculated field if desired. Do not add any additional data (anything that is automatically generated by Tableau can be used). Do not use customized tooltips or animations since this will be a static visualization. You may group related data values together (any groups that you create should be clearly noted on your visualization). You can also filter your data as needed or desired (the filter will not be displayed on the visualization graphic).

3. Follow the guidelines for good visualization design discussed in class and summarized on the slides. Implement your chart carefully. For example, use appropriate colors, titles, axis labels, legends, data labels, and notes to the reader (if needed). Ensure that the reader can quickly recognize and understand what you are trying to show them. Format any legends appropriately. Don't feel that you need to create an overly complex chart; sometimes simpler charts are most effective. Do provide a good title to help guide the reader in what you are showing with your visualization.

4. Generate an image file (.jpg) of your visualization (use Dashboard/Export Image). Be sure to leave the boxes for caption and legend checked. Also create a Tableau .TWBX file that contains your worksheet and the data. Upload these two files to Canvas by the assignment deadline. Double-check your submission in Canvas to ensure that you have uploaded the required files in the proper formats, and that the files were submitted successfully.

5. This assignment will be graded on a scale of A, B, C (potentially with a plus or minus). Items that will factor into your grade include 1) the type of chart used, 2) how effectively it conveys the data, 3) how well the visualization is implemented, 4) well-written titles, legends, axis labels, data labels, and notes (as applicable), 5) good and proper use of color, 6) following the instructions above, and 7) attention to detail.

Description

Download the attached Excel data set along with the PDF file, which contains assignment instructions. Perform your data analysis using Tableau Desktop. When you are finished, upload the completed files specified in the instructions.
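The relationship between Demand and the nine sources can be made concrete with a small sketch. The assignment itself asks for all calculations to be done in Tableau (where this would be a calculated field), so the Python below only illustrates the arithmetic. The rows are made-up values, the function name `net_imports` is hypothetical, and reading a positive gap as imports and a negative gap as exports is an interpretation of the description above.

```python
from collections import defaultdict

# The nine sources/methods named in the assignment text.
SOURCES = {
    "Bioenergy", "Coal", "Gas", "Hydro", "Nuclear",
    "Other fossil", "Other renewables", "Solar", "Wind",
}

def net_imports(rows):
    """rows: iterable of (year, country, category, twh) tuples.

    Returns {(country, year): demand - generated}. A positive value is read
    here as power imported, a negative value as power exported (assumption).
    """
    demand = defaultdict(float)
    generated = defaultdict(float)
    for year, country, category, twh in rows:
        key = (country, year)
        if category == "Demand":
            demand[key] += twh
        elif category in SOURCES:
            generated[key] += twh
    return {key: demand[key] - generated[key] for key in demand}

# Made-up illustration rows, not taken from Europe_Power_Gen.
rows = [
    (2020, "Country A", "Demand", 100.0),
    (2020, "Country A", "Coal", 40.0),
    (2020, "Country A", "Wind", 50.0),
    (2020, "Country B", "Demand", 30.0),
    (2020, "Country B", "Hydro", 35.0),
]
result = net_imports(rows)
```

With these toy rows, Country A falls 10 TWh short of its demand and Country B generates 5 TWh more than it needs.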



Most Viewed Questions Of Tableau

Case Questions Here is a list of questions that you will be answering for this case. Complete instructions for how to construct the interactive Tableau dashboard are in the tutorial video that accompanies this project. Please read this student case handout in its entirety before starting the case. Use the data dictionary provided on page 1 of these instructions and the interactive Tableau dashboard file to answer the following questions: 1. What were total sales revenues for the 1st quarter of 2020? 2. What were the total sales revenues for the Medical category in the 3rd quarter of 2021? 3. Which product line had the highest sales in 2020 under the materials processing category? 4. What is the top product line for the Germany division in 2019? 5. In 2021, which quarter had the highest sales in the Telecom category, under the UK Division? What was the dollar amount? 6. Who is KAT Manufacturing's top customer in the 4th quarter of 2019? 7. Who is the top customer for the Germany division in 2021? 8. If you were on the top management team of KAT Manufacturing, what overall strategy questions could this dashboard help you address? Answer this in your video assignment!


Deliverables: 1) Go through the dataset (excel file). 2) Go through the "question instruction file" 3) Create visualisation in Tableau. 4) Write a story identifying injury pattern of two age groups (young and old) and suggest how to avoid such injuries for these groups. 5) Write 1 page for each group. EACH STORY MUST CONTAIN THE USE OF MULTIPLE PRODUCTS AND BODY PARTS.


(This relies on the files Sample - Superstore.xlsx and postal-code-delivery-data.xlsx that can be found in LumiNUS. Use those files to avoid problems associated with different versions.)

(This question might be viewed as the first of two independent parts of the Delivery from the Customer Perspective question. As the ordering of questions may not be sequential, you may want to check if the other independent part was included in the assessment this semester.)

You are managing fulfilment at a nationwide retailer in the USA. You are interested in the customer experience but have limited data on fulfilment due to late adoption of supply chain management methodologies and technology.

If an order ships the same day, we take the gap between ordering and shipping to be 0.25; if an order ships the day after it was ordered, we take the gap between ordering and shipping to be 1.25; if an order ships two days after it was ordered, we take the gap between ordering and shipping to be 2.25; and so on.

There is data on average timings between orders being shipped and arrival at the customer's address in postal-code-delivery-data.xlsx. If there is missing data on the Average Delivery Lead Time (Days) (time from shipment to arrival at a customer's address) for a given Postal Code, you will fill in the missing value with 3.5, a conservative value for the entire country.

Calculate the average End to End lead time between orders being placed and deliveries arriving at the customer address (in days) for the following states:
• Alabama:
• Arkansas:
• California:
• Colorado:
• Mississippi:
Give your answers to 2 decimal places.

Note: In this question, Ship Mode does not matter.
Hint: Postal Codes can be tricky to work with. There is at least some postal code data for New Jersey and New Hampshire.
Hint: Be careful to avoid double counting since there might be multiple rows from the same order.
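The calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the expected solution: the rows are made up, the field names (`Order ID`, `Order Date`, `Ship Date`, `State`, `Postal Code`) are assumed to match the Superstore sheet, and treating the postal-code hint as a leading-zero problem (a New Jersey code read as the number 7001 instead of "07001") is an interpretation. Double counting is avoided here by keeping one row per order.

```python
from datetime import date

DEFAULT_LEAD_DAYS = 3.5   # fallback when a postal code has no delivery data
SAME_DAY_OFFSET = 0.25    # same-day shipping counts as 0.25 days

def norm_postal(code):
    """Normalise a postal code to a 5-character string, restoring leading zeros."""
    return str(int(code)).zfill(5)

def end_to_end_by_state(order_rows, lead_by_postal):
    """Average (order-to-ship gap + delivery lead time) per state, one entry per order."""
    seen, totals = set(), {}
    for row in order_rows:
        if row["Order ID"] in seen:      # later rows of the same order are skipped
            continue
        seen.add(row["Order ID"])
        gap = (row["Ship Date"] - row["Order Date"]).days + SAME_DAY_OFFSET
        lead = lead_by_postal.get(norm_postal(row["Postal Code"]), DEFAULT_LEAD_DAYS)
        total, count = totals.get(row["State"], (0.0, 0))
        totals[row["State"]] = (total + gap + lead, count + 1)
    return {state: round(t / c, 2) for state, (t, c) in totals.items()}

# Made-up rows; O1 appears twice, as an order with two products would.
orders = [
    {"Order ID": "O1", "Order Date": date(2020, 1, 1), "Ship Date": date(2020, 1, 1),
     "State": "Alabama", "Postal Code": 35601},
    {"Order ID": "O1", "Order Date": date(2020, 1, 1), "Ship Date": date(2020, 1, 1),
     "State": "Alabama", "Postal Code": 35601},
    {"Order ID": "O2", "Order Date": date(2020, 1, 2), "Ship Date": date(2020, 1, 4),
     "State": "Alabama", "Postal Code": 36116},
    {"Order ID": "O3", "Order Date": date(2020, 1, 5), "Ship Date": date(2020, 1, 6),
     "State": "Arkansas", "Postal Code": 72701},
    {"Order ID": "O4", "Order Date": date(2020, 1, 7), "Ship Date": date(2020, 1, 8),
     "State": "New Jersey", "Postal Code": 7001},
]
lead_by_postal = {"35601": 2.0, "72701": 1.5, "07001": 2.0}  # 36116 has no data
averages = end_to_end_by_state(orders, lead_by_postal)
```

In Tableau the same steps would typically be a join on Postal Code, a calculated field for the gap, and a level-of-detail expression per order; the sketch only pins down the arithmetic.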


Question 6

(This relies on the file iris.csv that can be found in LumiNUS. Use that file to avoid problems associated with different versions.)

p-norms are used to measure the distance between multi-dimensional data points and the origin. For an n-dimensional data point $x = (x_1, x_2, \ldots, x_n)$, the p-norm is given by:

$$\|x\|_p = \left( \sum_{k=1}^{n} |x_k|^p \right)^{1/p}$$

See: https://en.wikipedia.org/wiki/Lp_space#The_p-norm_in_finite_dimensions

Here we will, for each type of flower (setosa, versicolor, and virginica), measure the distance of each data point from the mean of each of the factors (Sepal Length, Sepal Width, Petal Length, and Petal Width) in the 1-norm, 2-norm, and 3-norm. So each data point is in 4 dimensions, and the distance of each data point from the mean is measured in 4 dimensions.

The number of data points where the (component-wise) difference of the data point from the mean for its flower type has a p-norm less than or equal to 1.5 is:

(Answer table: one row per type of flower, setosa, versicolor, and virginica, with one entry per value of p.)

FYI: Depending on which items were selected for this assignment in this semester, there may or may not be another question that tells you to do the exact same thing with a different threshold. So create your visual accordingly.

Note: The 1-norm is the Manhattan distance, which is quite relevant in transportation operations in cities. The 2-norm is the usual straight line distance. In analytics work, it is common to generalise well known metrics.
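As a small illustration of the counting step, the sketch below computes the p-norm of each point's difference from its group mean and counts how many fall within the threshold. The function names and the two 4-dimensional points are made up for illustration; they are not rows from iris.csv.

```python
def p_norm(vector, p):
    """p-norm of a vector: (sum of |x_k|^p) ** (1/p)."""
    return sum(abs(x) ** p for x in vector) ** (1.0 / p)

def count_within(points, p, threshold=1.5):
    """Count points whose difference from the component-wise mean has p-norm <= threshold."""
    n = len(points)
    dims = len(points[0])
    mean = [sum(pt[d] for pt in points) / n for d in range(dims)]
    count = 0
    for pt in points:
        diff = [pt[d] - mean[d] for d in range(dims)]
        if p_norm(diff, p) <= threshold:
            count += 1
    return count

# Two made-up points standing in for one flower type; the mean is (1, 1, 0, 0).
points = [(0.0, 0.0, 0.0, 0.0), (2.0, 2.0, 0.0, 0.0)]
counts = {p: count_within(points, p) for p in (1, 2, 3)}
```

Each toy point differs from the mean by 1 in two components, so its 1-norm distance is 2 (outside the threshold) while its 2-norm and 3-norm distances are about 1.41 and 1.26 (inside). The same point can therefore be counted for one p and not for another, which is why the question asks for a separate count per norm.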


Analyze our team: How many salespeople are meeting their goal? Who are the top 4 performers? After we identify them, we can assign them as mentors to the rest of our salesforce. We have 28 total salespeople, so each mentor will get 7 people. What are our top 10 most profitable stores? What are they selling? What is the income level in that area? That can help us predict what will sell in other areas. Where do most of our orders come from? (Look at our sales channels.)

Analyze our products: What are our top 5 bestsellers in terms of profit? What are our top 5 bestsellers in terms of popularity?

Analyze our market: Who are our top 5 customers in terms of profit? What are these customers buying? Rank our regions in terms of profitability. What items are most popular in each region? Rank our states in terms of profitability. What items are most popular in each state? For each region: What products are bringing in the most money? What items are most popular? We want to keep these regions stocked with those products. For each state: What products are bringing in the most money? What items are most popular? We want to keep these states stocked with those products. What about seasonal trends? Are some products more popular at certain times of the year than others?


Question 9

(This relies on the files Sample - Superstore.xlsx and postal-code-delivery-data.xlsx that can be found in LumiNUS. Use those files to avoid problems associated with different versions.)

(This question might be viewed as the second of two independent parts of the Delivery from the Customer Perspective question. As the ordering of questions may not be sequential, you may want to check if the other independent part was included in the assessment this semester.)

You are managing fulfilment at a nationwide retailer in the USA. You are interested in the customer experience but have limited data on fulfilment due to late adoption of supply chain management methodologies and technology.

If an order ships the same day, we take the gap between ordering and shipping to be 0.25; if an order ships the day after it was ordered, we take the gap between ordering and shipping to be 1.25; if an order ships two days after it was ordered, we take the gap between ordering and shipping to be 2.25; and so on.

There is data on average timings between orders being shipped and arrival at the customer's address in postal-code-delivery-data.xlsx. If there is missing data on the Average Delivery Lead Time (Days) (time from shipment to arrival at a customer's address) for a given Postal Code, you will fill in the missing value with the current average across all orders in the state for which there is data (this has to be weighted by the number of orders). If there is no data for the entire state, then use 3.5, a conservative value for the entire country.

Calculate the average End to End lead time between orders being placed and deliveries arriving at the customer address (in days) for the following states:
• Alabama:
• Arkansas:
• California:
• Colorado:
• Mississippi:
Give your answers to 2 decimal places.

Note: In this question, Ship Mode does not matter.
Hint: Postal Codes can be tricky to work with. There is at least some postal code data for New Jersey and New Hampshire.
Hint: Be careful to avoid double counting since there might be multiple rows from the same order.
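The order-weighted fill rule in this second part can be sketched as below. It is an illustration under assumptions: the rows are made-up tuples of (order ID, state, postal code, whole days between order and ship date), the function name is hypothetical, and "weighted by the number of orders" is read as averaging the known lead time over orders rather than over postal codes.

```python
NATIONAL_DEFAULT = 3.5   # used only when a state has no delivery data at all
SAME_DAY_OFFSET = 0.25   # same-day shipping counts as 0.25 days

def end_to_end_state_fill(order_rows, lead_by_postal):
    """order_rows: (order_id, state, postal_code, days_to_ship) tuples."""
    # Keep one row per order so multi-product orders are not double counted.
    orders = {}
    for order_id, state, postal, days in order_rows:
        orders.setdefault(order_id, (state, postal, days))

    # Order-weighted average of the known lead times within each state.
    known = {}
    for state, postal, _ in orders.values():
        if postal in lead_by_postal:
            total, count = known.get(state, (0.0, 0))
            known[state] = (total + lead_by_postal[postal], count + 1)

    totals = {}
    for state, postal, days in orders.values():
        if postal in lead_by_postal:
            lead = lead_by_postal[postal]
        elif state in known:
            lead = known[state][0] / known[state][1]
        else:
            lead = NATIONAL_DEFAULT
        total, count = totals.get(state, (0.0, 0))
        totals[state] = (total + days + SAME_DAY_OFFSET + lead, count + 1)
    return {state: round(t / c, 2) for state, (t, c) in totals.items()}

# Made-up rows: two Alabama orders share postal code "35601".
order_rows = [
    ("O1", "Alabama", "35601", 0),
    ("O2", "Alabama", "35601", 1),
    ("O3", "Alabama", "35801", 0),
    ("O4", "Alabama", "36116", 1),      # no delivery data: filled with state average
    ("O5", "Mississippi", "39201", 2),  # no data in the whole state: filled with 3.5
]
lead_by_postal = {"35601": 2.0, "35801": 5.0}
averages = end_to_end_state_fill(order_rows, lead_by_postal)
```

In the toy data the Alabama fill value is (2.0 + 2.0 + 5.0) / 3 = 3.0, not the unweighted postal-code average (2.0 + 5.0) / 2 = 3.5, which is the difference the weighting instruction is pointing at.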


Olympics Data Analysis and Dashboard: Power BI Data Preparation and Dashboard Development

Summary

With Power BI "Transform Data," clean and prepare the Olympics data set. Then, with Power BI Desktop, plan and develop at least 4 professional quality dashboards plus a sheet that allows the user to drill down to lowest-level details contained in the data, to help someone generally knowledgeable about the Olympics explore and learn more about: a) the Olympics as a whole (dashboard); b) the Olympics in a specific year and season that the user specifies (winter or summer) (dashboard); c) country-level information (user specifies the country) (dashboard); d) sport/event-level information (specified by the user), including information about individual athletes (dashboard); and e) a page in the PBI file that allows the user to interactively drill down to the lowest level details in the data. Prepare a brief writeup as part of the PBI file.

Guidance

Read the entire document before starting. Plan to go through two or three iterations of your dashboards. Start early enough so you can be thinking about how to improve the dashboards and make the document unified (i.e., common styles, look/feel, user interactions).

Data Files and Making Connections to Data

The zip file olympics.zip contains three CSV files. Two contain information about Olympic Games performances over many years (olympic_summer.csv, over 220,000 rows; and olympic_winter.csv, over 48,000 rows). These two files have the same schema (format/structure of data columns). The third CSV file (olympic_countries.csv; about 230 rows) contains country abbreviations and associated country names (see the Data Dictionary section below).

Do not make any changes to the CSV files. Connect to these files from Power BI. You can certainly look at and explore them in a text editor and/or Excel, but when you load them to Power BI, connect to them as CSV files. It is best to keep the CSV files and your PBIX file in the same directory.

It is important to realize that the PBIX file contains connections to data files. The PBIX file format is designed to keep both this connection information and a copy of the data. However, the “true” source of the data is the CSV files, and whenever PBI refreshes the data (for example, in the Transform Data steps), it looks to the connection information.

You will likely be working on this from several different computers. It is important, after copying the PBIX file to a new computer, to also update the "Data Source" settings in PBI. From the Home menu, pull the arrow down next to Transform Data, and choose "Data source settings." From there, select each CSV file and update the file location to where the file resides on your computer. This is most important during the Transform Data phase (data preparation). You will not be able to do any Transform Data operations unless PBI can "find" the true source files. After your data is completely prepared, PBI can actually work with the copy of the data in the current PBIX file, but if the source data ever changes or you need to go into Transform Data again, you will need to update the Data Source settings so PBI can find the files.

More information, not required if you follow the steps in the previous paragraph. It is possible to store the CSV files on a web server or shared resource, e.g., Google Drive, Sharepoint, etc. In my testing with these files, connections to Google Drive became quite slow due to the file sizes, and Sharepoint adds some complexity that is beyond the current scope of this case. If you do choose to try this approach, make sure that the source files are readable by anyone...me...without needing a username/password.

Olympics Data Preparation and Dashboard Page 1 of 6

Tasks (see also the “Requirements for the Dashboards” section, below)

Note for dashboards: After you clean/prepare the data, you will have 4 tables: summer, winter, combined, and countries.
For the dashboards you will only use fields from the combined table and the countries table. The purpose of the combined table is so that we can have combined summer & winter results when desired, and just as easily filter out one season or the other when we don't. Do not create visuals with the winter and summer tables.

1. Prepare/clean the data (see Data Preparation Guidance, below).

2. Explore/Experiment. Spend time exploring the data by creating a number of visuals, tables, etc. Create one or more "Experimental” pages for this. Do enough of this to learn major aspects/insights you want to explore further, as well as nuances of the data that may not be obvious at first.

3. Overall Dashboard. Create an overall (i.e., top-level) dashboard (name the page "Overall"). The visuals must show at least the following: medals awarded over time, athletes competing over time, medal count by country, and athlete count by country. Include card(s) to summarize key values. You may want to present some of this on the same visual(s). Use slicers and/or filters to allow the user to drill into the data as they choose to. Note that “athletes” competing over time can be interpreted several ways, e.g., unique people who have competed ever, unique people for each Olympic games, and total number of entrants into all events (this last one multi-counts athletes as people). Be clear in your visuals (on this page and all) whether you are talking about unique people, or event entrants.

4. Year/Season Level Dashboard ("Year and Season Persona"). For this dashboard, have the user select the year(s) and season(s) (use slicers or filters), and your dashboard will then present visuals that inform the user about that specific year(s)/season(s).
   a. Note: For the dashboards in steps 4, 5, and 6, the idea is that the user will choose a specific year(s)/season(s) (or country(s), or sport(s)/event(s)). The visuals then should show details about that specific year/season (or country, or sport/event). Therefore, you would not want a visual, for example, that shows medals over time for the year/season dashboard, as most of the time the user will just be selecting a single year/season...that would make for a chart with one data point, i.e., without meaning. Think about what is meaningful, test it out with various settings as the user would do, and iterate; do not just click the buttons and accept defaults.

5. Country Level Dashboard ("Country Persona"). Provide a way for the user to select a country (or countries), and then present information about that country and its participation in the Olympics.

6. Sport and Event Level Dashboard ("Sport and Event Persona"). Provide a way for the user to select a sport(s) and event(s), and then present information about that sport/event. Within sports, there are events, so this dashboard is more of a sport- and event-level dashboard. Ideally, the user should be able to drill down to specific events and see specific information on individual athletes.

7. Drill-down sheet. On this sheet provide a way for the user to intelligently and seamlessly drill down into the data to any level they wish, even to the individual record level in the data. You will probably need to experiment with a mix of graphs, decomposition tree, and table/matrix (a combination of a decomposition tree and a table or matrix is often a good way to provide drill-down capability).

8. Writeup. Create a "Writeup" page. Insert a text box. List/discuss three key insights you gained specific to the problem context (that is, what did you learn about the Olympics). Then list/discuss three key things you learned about visualization and/or Power BI in completing the assignment. Target length for this is the equivalent of a one-page writeup (more than just bullet points...give the main point and then explain, with examples to illustrate).

9. PBIX and PDF. Save your Power BI file as a PBIX file. Also generate a PDF of your PBI file by using the File...Export option.
You will be submitting both. You do not need to submit the CSV files.

Guidelines and suggestions for the dashboards

● Spend time on the general layout and formatting of your first (overall) dashboard. That way you can duplicate the page and make changes for the other dashboards. While obviously you will make changes for each dashboard, there should be a reasonably common look/feel to your dashboards.
● Try to make visuals professional quality, titled/labeled appropriately, with appropriate color selections, and able to be understood by the user without additional explanation. With visualization, getting something 70% done can be pretty quick, but the remaining 30% of tweaking settings, titles, colors, alignment, etc. is what often separates a professional-level job from a novice job.
● Slicers and filters should add meaningfully to the dashboard's value.
● Each dashboard should contain at least 4 visuals (for this count, slicers don't count as a visual, but you should have one or more slicers also; table/matrix does count as a visual).
● Each dashboard must have a brief text header for the title of the dashboard (put this in a text box).
● Be precise about language. For example, "athletes" (unique competitors) is different than "event entries" (all athletic entries in a competition), is different than "medals awarded," and is different than “medalists” (unique athletes who won at least one medal). For many/most charts, you will need to change the default titles to ensure the user knows exactly what the chart displays. You can distinguish between these (depending on the field) by summarizing as a Count versus the Count Distinct option.
● Create at least three measures and use them in cards or similar visuals (not required for every dashboard, but in total).
● Use at least one of each of the following visuals, across all your pages. This is not for every page, but taking all your pages together, utilize at least one of each type of the following visuals: timeline (line or area), ribbon, bar, column, scatter (xy), filled map, treemap, histogram (use data grouping and a column chart, not a custom visual), matrix, table, card (or multi-value card). Of course, you will use some visual types more often than others. The above is a requirement to have at least one of those in the list so you can experiment with the best chart(s) to show particular aspects.
● On every dashboard, use slicers (you will need several slicers for some pages) and/or page-level filters. Visual-level filters get added automatically for each visual for you to be able to tweak an individual visual and you may need to customize settings on some individual visuals. For this assignment, do not use the "Filters on all pages" capability (this affects all pages in the document, sort of like filtering out rows in Transform Data would do).
● Use the country code and country name intelligently. Do not assume the user will know the country codes.
● All pages must be interactive, that is, clicking on one visual automatically cross-filters/highlights the others, and similarly for slicers and filters.

Submit

● PBIX file
● PDF file obtained by exporting the PBI file to a PDF file (File...Export).

Data Preparation Guidance

● Change the names of the queries to Summer, Winter, and Countries. The default names are probably inherited from the rather long file identifier in the links provided earlier in this document.
● First row as headers. When importing text (e.g., CSV) files, the first row may not be automatically used for column headers. In the Home tab of Query Editor, the Use First Row As Headers option is used to promote the first row as headers.
● Winter and Summer Files (do these steps on each file)
  o Delete the "Changed Type" step that PBI likely adds automatically.
With this data, PBI's default choices result in errors for some of the columns (PBI looks at the first 200 rows to decide on data types, and this causes issues for at least one file here). Deleting this step and reassigning the appropriate data type (later) works better.
  o Create a new column named Season. Set to "Summer" for the summer table; "Winter" for the winter table. The formula for this column is just = "Summer" or = "Winter", respectively, when you are in the Add Column dialog box. Make sure you do this before appending the queries!
● Append the queries.
  o Appending queries combines two queries into one by combining the rows; this is why we needed a Season field in the previous step.
  o In this stage you will combine winter and summer results. It is critical that you successfully complete the previous steps before doing this one.
  o While in Query Editor ("Transform Data" in PBI), select the summer query. Go to Home Tab → Append Queries → Append Queries as New.
  o In the dialog box, the summer query should already be filled in one field. In the other field, select the winter query, and hit OK.
  o Rename the resulting query to be "olympic_combined" and verify that you have all rows from both summer and winter in the combined query.
● Clean the data in the olympic_combined query
  o Age, Height, and Weight columns. Replace NA with blank (i.e., nothing; you can usually also type null without quotation marks). Convert to decimal number. You can do this one column at a time or select all three columns and do it at once. Query Editor will probably insert the null value for the empty values, which is OK (it uses null to tell you that there is truly nothing entered, not even a space character). If you try to convert a column containing some text values to numeric, you will get errors.
  o Year. Convert to whole number (don't try to convert to date; we're just interested in year and season so whole number is sufficient).
  o Season. Convert to text.
  o Medal column. This column contains Gold, Silver, Bronze, or NA. In contrast to some other columns, NA is not missing data here. It means the athlete did not earn a medal. For clearer communication, replace this with something like "No Medal."
  o Add a Conditional Column called Medal_Index. The Medal column is text, but it is really ordinal categorical data (i.e., categorical data with a natural order from best to worst). We need to be able to display results in the order Gold, Silver, Bronze, and "No Medal." To do this, create an index column based on the Medal column. Specifically, add a conditional column. Assign the value 1 to Gold, 2 to Silver, 3 to Bronze, and 4 to "No Medal" (see screen shot). After closing Query Editor, we instruct Power BI to sort the Medal column based on the value of the Medal_Index column (keep reading for instructions).
● Close and Apply to return to Power BI Model View.
● Create the connection between the "Olympic Code" column in the olympic_countries table and the "Country" column in the olympic_combined table. Go to the Model Tab and create a 1-to-many relationship between Olympic_Code in the olympic_countries table and Country in the olympic_combined table.
● Data View. In the Data View for the olympic_combined table, select the Medal column. Choose the Column Tools tab, and the "Sort by Column" dropdown. Sort the Medal column by the value of the Medal_Index column. This does not immediately sort the Medal column. Rather, it tells Power BI that whenever Medal is included in a visual, the medal names will be listed in order according to the Medal_Index column (you can choose ascending or descending). Check this when you create your first visual or table listing the medal types and counts.

Conditional Column Creation for Medal_Index

[Screen shot: the Add Conditional Column dialog ("Add a conditional column that is computed from the other columns or values") with new column name Medal_Index and four clauses on the Medal column: If Medal equals Gold Then 1; Else If Medal equals Silver Then 2; Else If Medal equals Bronze Then 3; Else If Medal equals No Medal Then 4; Else null.]

Data Dictionary

olympic_summer and olympic_winter files. Each row represents one entry into an event.
● Athlete
● Gender
● Age (some data is missing; see Data Preparation section)
● Height (cm) (some data is missing; see Data Preparation section)
● Weight (kg) (some data is missing; see Data Preparation section)
● Country abbreviation of athlete
● Sport
● Event
● Year (of Olympics)
● City (of Olympics)
● Medal (Gold, Silver, Bronze, or NA, where NA means athlete competed but did not win a medal. See guidance in Data Preparation for how to deal with the NA values)

olympic_countries file
● Country
● Olympic Code (abbreviation of country, compatible with Country in other files)

Making Copy of Report Page, From One File to Another

You will likely be partly working independently on your own files, and then seeking to combine (PBI Desktop does not have “live” sharing). This will help. You can copy one report page from one file to another using the following steps:
● Files must have the same data source(s). Sources here are the three CSV files.
● Add a blank page in your target PBIX file.
● Go to your source PBIX file, click on the page you want to copy, and with nothing selected on the report page hit CTRL + A (select all). Then hit CTRL + C (copy).
● Go to your target report file and on the blank page, hit CTRL + V.
● See https://community.powerbi.com/t5/Desktop/Copy-report-page-from-PBIX-to-another-PBIX/td-p/800887

Notes and Tips

PBI does not have a direct way to allow multiple people to edit the file simultaneously. You may want to keep your PBIX in a cloud storage folder and make sure everyone has read/write access to that folder, to better maintain version control.
You will still need to update the Data Source settings when opening the PBIX file on another computer.
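The data preparation steps and the Count versus Count Distinct distinction described above can be mirrored in a short sketch. The assignment performs these steps in Power BI's Query Editor; the Python below only illustrates the logic. The rows are made up, and the helper names `to_number` and `prepare` are hypothetical.

```python
# Ordinal index for the Medal column, as in the Medal_Index conditional column.
MEDAL_INDEX = {"Gold": 1, "Silver": 2, "Bronze": 3, "No Medal": 4}

def to_number(value):
    """Treat NA/blank as missing (None), otherwise convert to a decimal number."""
    return None if value in ("NA", "", None) else float(value)

def prepare(rows, season):
    """Tag rows with their season and clean Age, Year, and Medal."""
    cleaned = []
    for row in rows:
        medal = "No Medal" if row["Medal"] == "NA" else row["Medal"]
        cleaned.append({
            **row,
            "Season": season,                 # added before appending
            "Age": to_number(row["Age"]),
            "Year": int(row["Year"]),
            "Medal": medal,
            "Medal_Index": MEDAL_INDEX[medal],
        })
    return cleaned

# Made-up rows; athlete "A" has two event entries in the same Games.
summer = [
    {"Athlete": "A", "Age": "23", "Year": "2000", "Medal": "Gold"},
    {"Athlete": "A", "Age": "23", "Year": "2000", "Medal": "NA"},
    {"Athlete": "B", "Age": "NA", "Year": "2004", "Medal": "Bronze"},
]
winter = [{"Athlete": "C", "Age": "30", "Year": "2002", "Medal": "Silver"}]

# Appending only works cleanly because both tables already carry Season.
combined = prepare(summer, "Summer") + prepare(winter, "Winter")

event_entries = len(combined)                            # Count
unique_athletes = len({r["Athlete"] for r in combined})  # Count Distinct
medals_in_order = sorted({r["Medal"] for r in combined}, key=MEDAL_INDEX.get)
```

Here the four event entries come from three unique athletes, which is the difference a chart title needs to make explicit, and sorting by the index rather than alphabetically is what puts Gold before Bronze.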


Tableau Storytelling Assignment: prevent customers from leaving

Marks: 15%

Learning Objectives evaluated:
1. Learn the fundamental concepts of storytelling
2. Learn the principles of visualizations
3. Apply visualization principles to create effective visualizations
4. Learn how to organize and prepare data prior to analysis
5. Apply storytelling concepts to write effective stories with data

AirTel Corp. is a telecommunication organization providing internet and phone services. The telecommunication market is highly competitive, and customers often switch from one organization to another. Your task is to come up with specific strategies to prevent a group of customers from leaving the organization.

Tasks

You will write a one-page story that will answer the following question.
1. Based on a specific customer profile, what specific strategies would you suggest to stop those customers from leaving the organization?

Please see the following details for this assignment.
1. This is an assignment to be done in pairs. Make sure that both partners contribute to the assignment. Describe the responsibilities of each team member in a separate paragraph.
2. You must use Tableau to do this analysis. Excel can also be used to organize/format the data.
3. The assignment MUST include at least two and at most four visualizations built in Tableau.
4. The page limit of this story is one page including visualizations. You can consider this assignment as a poster with visuals and text.
5. Submit only the assignment document on Canvas and not the dataset. Do not upload the visualizations separately.
6. Follow the principles/concepts of storytelling and visualizations covered in modules 1 and 2.


DELIVERABLES:
1) There are 4 dashboards attached (ch2, ch3, ch13, ch14). Write a 1-page summary for each dashboard. Choose any 2 for Part 1 and the other 2 for Part 2.
2) For the second section, read the instructions, go through the attached data sets, and provide what is asked, following all the instructions.


Planning - Part 1

Planning is the first step towards creating a dashboard. For this step you will need to:
1. Explore the various repositories and check multiple datasets. For each dataset,
   o Read its description.
   o Find the number of rows and columns.
   o Check the column data types.
   o If available, check what other data scientists used this dataset for.
   o Think about what can be visualized about this dataset.
   https://www.kaggle.com/datasets
   https://www.kdd.org/kdd-cup
   https://population.un.org/wpp/Download/Standard/Population/
   https://grouplens.org/datasets/
2. Select a dataset of your choice and download it. Explain what this dataset is about.
3. For each column, specify its data type: Categorical, Ordinal, Interval or Ratio.
4. For each column, specify its domain; that is, the list or range of values that it can take.
5. Think about who would like to use a dashboard to create visualizations about this dataset.
   o For example, a card fraud dashboard can be used by bank cybersecurity teams.
   o A Parkinson's dashboard can be used by physicians and health care providers.
6. List the prospective users that you will develop the dashboard for, and what you think they can use this dashboard for.
7. List a comprehensive set of questions that the users might ask about this dataset.
   o This is not the final set of questions that the dashboard will address.
   o It is just a starting point and may contain many more questions than the dashboard will finally address. So list as many questions as you can think of.

Project - Part 1 Submission

Submit a report in which you answer the questions listed above. The report should have three main sections:
Section 1: Dataset Description (10 points)
Section 2: Prospective Dashboard Users (10 points)
Section 3: List of User Requirements & Potential Questions (20 points)

Decision Making - Part 2

Now that you understand your data, the users who are interested in it, and what sorts of questions they might have about the data, it is time to decide what the actual dashboard will look like. The decisions that you will make cover the following subjects:

Choosing Visualization Tools
List the visualization tools that you will use to create the dashboard. Explain why you chose these tools. This can be due to data-related issues or personal preference for certain development tools. Explain why you prefer some tools over others.

Data Preparation & Preprocessing
In case your data requires any kind of pre-processing, such as computing certain attributes or removing missing values, explain how the data will be processed and prepared for visualization.

Final Set of Questions
List the final set of questions that your dashboard will be designed to address. The dashboard users should be able to find answers to these questions by using your dashboard. List at least five questions.

List of Plots
For each of the questions listed above, think about the best plots that can be used to address it. Keep in mind that one question might require multiple plots to address. Alternatively, one plot can address multiple questions. The dashboard should contain at least five plots. For each plot,
• Explain what it shows and how that relates to the set of questions.
• List the set of used pre-attentive attributes and colors.
• Include a rough, hand-drawn or computer-drawn, figure of the plot.

List of Interactive Controls
The dashboard user can change the visualizations via interactive controls. If your dashboard contains any controls, list:
• What they will be used for
• Which plots are connected to each one
• The value range for each control and whether or not it is loaded from a certain attribute in the data.

Project - Part 2 Submission

Submit a report in which you answer the questions listed above. The first page of the report should contain the student name and the project title. The report should have five sections:
• Section 1: Used Visualization Tools
• Section 2: Explanation of Required Data Pre-processing, if any

PART 3

General Project Information
• The final deliverables are:
   o The dashboard
   o A report that includes all answers to questions listed in the three project phases.
   o A PowerPoint presentation.
Details for each deliverable will be provided in the relevant project phase.

Implementation - Part 3

Now that you have laid down the design of your dashboard, it is time to implement it!

Step I: Create the Single Visualizations
Create each visualization to match the pre-specified design.
1. Use consistent color palettes and the minimum number of pre-attentive attributes possible to convey your information.

Step II: Add Interactivity
Add controls, if any, to the visualizations. Choose the most user-friendly controls.
1. For example, if you have controls that allow the user to change a numeric value, and this value has a very wide range, e.g., all numbers from 1 to 200, it is better to use a sliding bar control and not a list or a drop-down menu.
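
The slider-versus-list guideline in Step II can be written down as a small rule of thumb. The sketch below is an illustration only: the function name `suggest_control` and the cutoff of 15 items are assumptions chosen for the example, not part of the assignment or of any visualization tool.

```python
def suggest_control(values, max_list_items=15):
    """Suggest a control type for a set of selectable values.

    The cutoff of 15 items is an arbitrary assumption; tune it for your dashboard.
    """
    distinct = set(values)
    # bool is a subclass of int in Python, so exclude it explicitly.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct
    )
    # A wide numeric range is easier to scan with a slider than with a long list.
    if numeric and len(distinct) > max_list_items:
        return "slider"
    return "drop-down"


print(suggest_control(range(1, 201)))                # slider
print(suggest_control(["Gold", "Silver", "Bronze"]))  # drop-down
```

Categorical values and short numeric ranges stay in a drop-down here, because a slider gives no benefit when every option fits on screen at once.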