Question

Working individually and in an entirely reproducible way, please write a paper that involves original work to tell a story with data. Step 1 is to find a dataset. The data should then be extracted, cleaned, and processed using R. I need you to give me the R code for data processing as "Scripts", as in the Rubric. The analysis dataset should be saved as a parquet file. Step 2 is to use the cleaned dataset to make graphs and models in the paper, written in a Quarto file with chunks of R code.

Develop a research question that is of interest to you based on your own interests, background, and expertise, then obtain or create a relevant dataset.

Research Question: "How do changes in the Bank of Canada's interest rate policy impact consumer debt levels in Canada?" This question examines the relationship between central bank policy decisions (specifically, adjustments in interest rates) and the amount of debt that consumers carry. It is particularly relevant in an era where household debt levels are a significant concern for economic stability and personal financial health. So create a model to see the relationship between the two (it can be a linear regression model to see whether higher interest rates mean more or less consumer debt).

Note: This is a research question I wanted to do, but if you want to do something else with a better model explanation and with fewer random factors that might affect our model, you can do that, but keep it economics/finance related.

Do not use a dataset from Kaggle, UCI, or Statistica. Mostly this is because everyone else uses these datasets and so it does nothing to make you stand out to employers, but there are sometimes also concerns that the data are old, or you do not know the provenance.

FAQ:
How much should I write? Most students submit something that has 10 to 20 pages of main content, with additional pages devoted to appendices, but it is up to you. Be precise and thorough.
Can I use any model?
You are welcome to use any model, but you need to thoroughly explain it and this can be difficult for more complicated models. Start small. Pick one or two predictors. Once you get that working, then complicate it. Remember that every predictor and the outcome variable needs to be graphed and explained in the data section.

Rubric:

Component: R is appropriately cited
Range: 0 'No'; 1 'Yes'
Requirement: Must be properly referred to in the main content and included in the reference list. If not, no need to continue marking, paper gets 0 overall.

Component: Class paper
Range: 0 'No'; 1 'Yes'
Requirement: Check meta data such as project and folder names, as well as other aspects such as title etc. If there is any sign this is a class paper then no need to continue marking, paper gets 0 overall.

Component: LLM usage is documented
Range: 0 'No'; 1 'Yes'
Requirement: A separate paragraph or dot point must be included in the README about whether LLMs were used, and if so how. If auto-complete tools such as co-pilot were used this must be mentioned. If chat tools such as Chat-GPT4 were used then the entire chat must be included in the usage text file. If not, no need to continue marking, paper gets 0 overall.

Component: Title
Range: 0 'Poor or not done'; 1 'Yes'; 2 'Exceptional'
Requirement: An informative title is included that explains the story, and ideally tells the reader what happens at the end of it. 'Paper X' is not an informative title. There should be no evidence this is a school paper.

Component: Author, date, and repo
Range: 0 'Poor or not done'; 2 'Yes'
Requirement: The author, date of submission in unambiguous format, and a link to a GitHub repo are clearly included. (The latter likely, but not necessarily, through a statement such as: 'Code and data supporting this analysis is available at: LINK').

Component: Abstract
Range: 0 'Poor or not done'; 1 'Gets job done'; 2 'Fine'; 3 'Great'; 4 'Exceptional'
Requirement: An abstract is included and appropriately pitched to a non-specialist audience. The abstract answers: 1) what was done, 2) what was found, and 3) why this matters (all at a high level). Likely four sentences. Abstract must make clear what we learn about the world because of this paper.

Component: Introduction
Range: 0 'Poor or not done'; 1 'Gets job done'; 2 'Fine'; 3 'Great'; 4 'Exceptional'
Requirement: The introduction is self-contained and tells a reader everything they need to know including: 1) broader context to motivate; 2) some detail about what the paper is about; 3) a clear gap that needs to be filled; 4) what was done; 5) what was found; 6) why it is important; 7) the structure of the paper. A reader should be able to read only the introduction and know what was done, why, and what was found. Likely 3 or 4 paragraphs, or 10 per cent of total.

Component: Estimand
Range: 0 'Poor or not done'; 1 'Done'
Requirement: The estimand is clearly stated in the introduction.

Component: Data
Range: 0 'Poor or not done'; 2 'Many issues'; 4 'Some issues'; 6 'Good'; 8 'Great'; 10 'Exceptional'
Requirement: A sense of the dataset should be communicated to the reader. The broader context of the dataset should be discussed. All variables should be thoroughly examined and explained. Explain if there were similar datasets that could have been used and why they were not. If variables were constructed then this should be mentioned, and high-level cleaning aspects of note should be mentioned, but this section should focus on the destination, not the journey. It is important to understand what the variables look like by including graphs, and possibly tables, of all observations, along with discussion of those graphs and the other features of these data. Summary statistics should also be included, as well as any relationships between the variables. If this becomes too detailed, then appendices could be used. Basically, for every variable in your dataset that is of interest to your paper there needs to be graphs and explanation and maybe tables.

Component: Measurement
Range: 0 'Poor or not done'; 2 'Some issues'; 3 'Good'; 4 'Exceptional'
Requirement: A thorough discussion of measurement, relating to the dataset, is provided in the data section. Please ensure that you explain how we went from some phenomena in the world that happened to an entry in the dataset that you are interested in.

Component: Model
Range: 0 'Poor or not done'; 2 'Many issues'; 4 'Some issues'; 6 'Good'; 8 'Great'; 10 'Exceptional'
Requirement: The model should be nicely written out, well-explained, justified, and appropriate.

Component: Results
Range: 0 'Poor or not done'; 2 'Many issues'; 4 'Some issues'; 6 'Good'; 8 'Great'; 10 'Exceptional'
Requirement: Results will likely require summary statistics, tables, graphs, images, and possibly statistical analysis or maps. There should also be text associated with all these aspects. Show the reader the results by plotting them where possible. Talk about them. Explain them. That said, this section should strictly relay results. Regression tables must not contain stars.

Component: Discussion
Range: 0 'Poor or not done'; 2 'Many issues'; 4 'Some issues'; 6 'Good'; 8 'Great'; 10 'Exceptional'
Requirement: Some questions that a good discussion would cover include (each of these would be a sub-section of something like half a page to a page): What is done in this paper? What is something that we learn about the world? What is another thing that we learn about the world? What are some weaknesses of what was done? What is left to learn or how should we proceed in the future?

Component: Cross-references
Range: 0 'Poor or not done'; 2 'Yes'
Requirement: All figures, tables, and equations should be numbered, and referred to in the text using cross-references.

Component: Prose
Range: 0 'Poor or not done'; 2 'Many issues'; 4 'Good'; 6 'Exceptional'
Requirement: All aspects of submission should be free of noticeable typos, spelling mistakes, and be grammatically correct. Prose should be coherent, concise, and clear. Do not use filler phrases such as 'delve into' or 'shed light'. Remove unnecessary words.

Component: Graphs/tables/etc
Range: 0 'Poor or not done'; 1 'Gets job done'; 2 'Fine'; 3 'Great'; 4 'Exceptional'
Requirement: Graphs and tables must be included in the paper and should be well-formatted, clear, and digestible. They should: 1) serve a clear purpose; 2) be fully self-contained through appropriate use of captions and sub-captions; 3) be appropriately sized and colored; and 4) have appropriate significant figures, in the case of tables.

Component: Referencing
Range: 0 'Poor or not done'; 3 'One minor issue'; 4 'Perfect'
Requirement: All data, software, literature, and any other relevant material should be cited in-text and included in a properly formatted reference list made using BibTeX. A few lines of code from Stack Overflow or similar would be acknowledged just with a comment in the script immediately preceding the use of the code rather than here. But larger chunks of code should be fully acknowledged with an in-text citation and appear in the reference list.

Component: Simulation
Range: 0 'Poor or not done'; 1 'Gets job done'; 2 'Fine'; 3 'Great'; 4 'Exceptional'
Requirement: The script is clearly commented and structured. All variables are appropriately simulated.

Component: Tests
Range: 0 'Poor or not done'; 1 'Gets job done'; 2 'Fine'; 3 'Great'; 4 'Exceptional'
Requirement: Data and code tests are appropriately used.

Component: Parquet
Range: 0 'Not done'; 1 'Done'
Requirement: The analysis dataset is saved as a parquet file. (Note that the raw data should be saved in whatever format it came.)

Component: Reproducibility
Range: 0 'Poor or not done'; 1 'Gets job done'; 2 'Fine'; 3 'Great'; 4 'Exceptional'
Requirement: The paper and analysis should be fully reproducible. The repo should have a detailed README. All code should be thoroughly documented. An R project should be used. Code should be used to do all steps including appropriately read data, prepare it, create plots, conduct analysis, and generate documents. Seeds should be used where needed. Code should have a preamble and be well-documented including comments and layout. The repo should be appropriately organized and not contain extraneous files. setwd() and hard coded file paths must not be used.

Component: Code style
Range: 0 'Poor or not done'; 1 'Exceptional'
Requirement: Code is appropriately styled using styler or lintr.

Component: Enhancements
Range: 0 'Poor or not done'; 1 'Gets job done'; 2 'Fine'; 3 'Great'; 4 'Exceptional'
Requirement: You should pick at least one of the following and include it to enhance your submission: 1) A datasheet for the dataset; 2) A model card for the model; 3) A Shiny application; 4) An R package; or 5) API for the model.

A couple of examples from past assignments:

Code should look like this in the Quarto document:

## Results

### Bacon (per 500 grams)

The cost of bacon fluctuated significantly over the analyzed time period. On average, the cost of bacon increased by 5.39% while the inflation rate increased by 3.04%. The highest price ...

```{r}
#| message: false
#| echo: false
#| warning: false

#### Build the bacon price and inflation table ####
# grocery_data and inflation_data are assumed to be loaded in an earlier chunk
library(dplyr)
library(tidyr)

# Keep only the year and bacon price columns
bacon_table <- grocery_data |> select(year, bacon)

bacon_inflation_data <-
  # Left-join inflation onto the bacon prices by year
  merge(x = bacon_table, y = inflation_data, by = "year", all.x = TRUE) |>
  # Calculate the year-over-year percent change of bacon prices
  mutate(
    percent_change = round(((bacon - lag(bacon, 1)) / lag(bacon, 1)) * 100, digits = 2)
  ) |>
  # Remove rows with missing values (including the first year, which has no lag)
  drop_na()
```

```{r}
#| message: false
#| echo: false
# ...
```

After rendering, the Quarto document will look like this:

Analyzing the Relationship between Increased Employment Rates and Higher Permanent Immigrant Inflows*
An Analysis of Canada's Employment Rate and Permanent Immigrant Inflows
19 April 2023

Abstract
This paper utilizes data from the OECD to examine the relationship between Canada's employment rate and permanent immigrant inflows between 2009 and 2019. The analysis revealed a positive correlation between the two variables, indicating that as the employment rate increased, so did permanent immigrant inflows.
These findings matter as they highlight the importance of a strong labor market and economic development in attracting and retaining permanent immigrants. The insights can guide policymakers in developing policies to attract and retain permanent immigrants.

1 Introduction

Immigration has been a key driver of economic growth and cultural diversity for countries around the world. Over the past two decades, many countries have opened doors to immigration, recognizing the significant economic and social benefits that immigration can bring. However, immigration policies and their implementation have varied across countries, with some facing challenges in attracting and retaining permanent immigrants.

One possible factor that could influence a country's ability to attract and retain immigrants is its employment rate. High employment rates mean that a country has economic stability and job opportunities, making it a more attractive destination for permanent immigrants. Conversely, low employment rates could signal economic instability, leading to a decrease in permanent immigrant inflows.

In this paper, we will examine the relationship between Canada's employment rate and permanent immigrant inflows through a linear regression analysis. The estimand here is how employment rate and immigrant inflows are related. Specifically, we will focus on Canada, which has a relatively high immigration number. We will draw data from the OECD website (OECD 2023b). Our population of interest is the working-age population, as it represents the potential labor force and has a significant impact on a country's economic growth and development. Based on the analysis, we found that there is a positive relationship between employment rate and permanent inflow rate.

While existing literature has examined the impact of economic factors on immigration, this paper specifically focuses on the relationship between employment rate and permanent immigrant inflows. This exploration can provide valuable insights for government officials and policy makers in developing policies to attract and retain permanent immigrants and influencing permanent immigrant inflows. In addition, this research can inform economic development strategies which can benefit the labour market and immigration.

In section 1, we discuss the source of data used in this paper, the strengths and weaknesses of OECD, methodologies that follow it, and data terminology. In section 2, we present the results of our analysis, focusing on the trajectory of employment rate and permanent immigrant inflows over the past 10 years in Canada. In section 3, we will analyze the trend by establishing a linear regression model. In section 4 we will present the result of the model in a graph. In the final section, we explore the factors that contribute to permanent immigrant inflows, and we will examine the patterns and trends to highlight the similarities and differences in immigration patterns, analysis of the bias and ethical concerns, and weaknesses and next steps.

2 Data

2.1 Data Description and Methodology

The data used in this paper are obtained from the OECD (Organisation for Economic Co-operation and Development) and are publicly available through the OECD website (OECD 2023b). Founded in 1961, the OECD is an intergovernmental organisation with 38 member countries collaborating to develop policy standards to promote sustainable economic growth. The organization's data is widely used by policymakers, researchers, and analysts to understand trends and inform policy decisions. The OECD has collected data regarding economy, education, employment, environment, health, tax, trade, GDP, unemployment rate, and inflation. It keeps monthly, quarterly, and yearly records from the participating countries.

The OECD collects data through member countries, partner organizations, and surveys. One of the primary sources of data for the OECD is its member countries, which provide data on a regular basis across a wide range of indicators, including GDP, employment, education, health, and the environment. This data is then aggregated and analyzed by the OECD to identify trends and inform policy recommendations.

The three datasets that I will be using are: Permanent immigrant inflows (OECD 2023c), Employment Rate (OECD 2023a), and Population (OECD 2023d). All three are restricted to Canada and cover 2009 to 2019.

Permanent immigrant inflows cover regulated movements of foreigners considered to be settling in the country from the perspective of the destination country. The data presented are the result of a standardization process in Canada. The inflows recorded in the data were 253 101 in 2009, 262 773 in 2013, and 341 173 in 2019.

Employment rates are a measure of the extent to which available labour resources (people available to work) are being used. They are calculated as the ratio of the employed to the working age population. The working age population refers to people aged 15 to 64. The employment rate recorded in the data was 71.5 per cent in 2009, 72.7 per cent in 2013, and 74.6 per cent in 2019.

Finally, the population is defined as all nationals present in, or temporarily absent from a country, and aliens permanently settled in a country. This indicator shows the number of people that usually live in an area. Growth rates are the annual changes in population resulting from births, deaths, and net migration during the year.

Table 1: A summary table of cleaned data

Country  Year  Employment Rate  Permanent Inflows Rate
CAN      2009  71.50833         0.7526295
CAN      2010  71.55833         0.8271222
CAN      2011  71.97500         0.7259664
CAN      2012  72.31667         0.7442108
CAN      2013  72.71667         0.7490048
CAN      2014  72.50000         0.7375731
CAN      2015  72.74167         0.7726317
CAN      2016  72.67500         0.8217785
CAN      2017  73.56667         0.7838149
CAN      2018  74.02500         0.8661575
CAN      2019  74.60000         0.9073453

Table 1 presents the cleaned dataset, which includes 11 observations and 5 variables in total. The variables in the dataset include Year (in years), Country, Employment Rate (in percentage), and Permanent Inflows Rate (in percentage). The Permanent Inflows Rate was calculated by dividing the Permanent Immigrant Flows by the Total Population of Canada. All percentages are based on the corresponding population for each variable.

In this paper, the analysis will be carried out using the statistical programming language R (R Core Team 2020), using haven and tidyverse (Wickham et al. 2019), devtools (Wickham, Hester, and Chang 2020), and dplyr (Wickham et al. 2021). All figures in the report are generated using ggplot2 (Wickham 2016). We run the model in R (R Core Team 2020) using the rstanarm package (Goodrich et al. 2022).

2.2 Data Visualization

2.2.1 Permanent Immigrants Inflows Rate from 2009 to 2019

[Figure: line plot titled "Permanent Immigrants Inflows Rate from 2009 to 2019"; x-axis: Year, 2009 to 2019; y-axis: inflows rate, ticks from 0.75 to 0.90]

Figure 1: Overall increase in Canada's permanent immigration inflow from 2009 to 2019 which is from 0.751 to 0.925.

Figure 1 shows the overall trend of permanent immigrant inflows in Canada between 2009 and 2019. The plot shows an increasing pattern in the permanent immigration inflow rate over the years, with fluctuations. From 2009 to 2019, the permanent immigrant inflow rate increased from 0.751 to 0.925. Although there have been fluctuations in the permanent immigrant inflow rate over time, the overall trend has been upward, with a notable rise in the rate following a decline from 0.85 to 0.71 in 2011. This trend highlights the increasing importance of immigration as a driver of economic and social growth in OECD member countries.
Understanding these trends and analyzing their underlying drivers can provide valuable insights into the impacts of immigration on OECD member countries and help inform policy decisions related to immigration and integration.
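
Returning to the research question at the top (Bank of Canada policy rate versus consumer debt), the Simulation, Tests, Model, and Parquet items in the rubric could be sketched together roughly as follows. This is a minimal illustration under stated assumptions, not a finished script: every value is simulated, and the column names (quarter, policy_rate, consumer_debt), the coefficients used to generate the data, and the output file name are all hypothetical. The file is written to a temporary directory only so that the sketch contains no hard-coded path, and the write step runs only if the arrow package is installed.

```r
#### Preamble ####
# Purpose: Simulate, test, model, and save an illustrative analysis dataset
# Note: all values are simulated; column names and file name are hypothetical

set.seed(853)

n_quarters <- 60

#### Simulate ####
# One row per quarter: a policy rate (per cent) and a consumer debt level
# (arbitrary units) that falls, by assumption, as the rate rises.
simulated_data <- data.frame(
  quarter = seq(as.Date("2009-01-01"), by = "quarter", length.out = n_quarters),
  policy_rate = round(runif(n_quarters, min = 0.25, max = 5), 2)
)
simulated_data$consumer_debt <-
  2000 - 40 * simulated_data$policy_rate + rnorm(n_quarters, mean = 0, sd = 25)

#### Test ####
stopifnot(
  nrow(simulated_data) == n_quarters,
  !anyNA(simulated_data),
  all(simulated_data$policy_rate >= 0.25 & simulated_data$policy_rate <= 5),
  !any(duplicated(simulated_data$quarter))
)

#### Model ####
# Simple linear regression: consumer_debt = beta_0 + beta_1 * policy_rate + error
debt_model <- lm(consumer_debt ~ policy_rate, data = simulated_data)
print(coef(debt_model))

#### Save ####
# Written to a temporary directory for illustration only
output_path <- file.path(tempdir(), "analysis_data.parquet")
if (requireNamespace("arrow", quietly = TRUE)) {
  arrow::write_parquet(simulated_data, output_path)
}
```

In an actual submission the simulated data frame would be replaced by series downloaded from their original source, with the raw files saved in whatever format they came, and the parquet file would be written to a path built relative to the R project rather than to a temporary directory.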