Econometrics

Questions & Answers

Using your results from EViews, answer the following questions: a. Use the augmented Dickey-Fuller (ADF) test to see whether log(Yt) has a unit root. In so doing, write down the unit root test equation for log(Yt), the null and alternative hypotheses, and the decision rule. State your conclusion. b. Given your conclusion in part a, what can you say about the dynamic behavior of log(Yt)? Explain clearly. c. Specify and estimate the univariate model for the exchange rate. Is your model "moving average", "autoregressive", "random walk without drift", "random walk with drift", "integrated moving average", or "integrated autoregressive"? Explain clearly. d. Is the estimated univariate model dynamically stable? Explain clearly. e. Is the residual series of the univariate model white noise? Explain clearly. f. Is the trend in log(Yt), as shown above, deterministic or stochastic? Explain clearly. g. Use your univariate model estimates to forecast the exchange rate for 2000.01-2000.12. Find the ME, MAE, and MAPE for the forecast period. Use these results to draw some conclusions about the performance of the univariate model in forecasting exchange rates.
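The three forecast accuracy measures named in part g can be sketched in a few lines of Python. This is only an illustration under stated assumptions: errors are defined as actual minus forecast, MAPE is reported in percent, and the function name `forecast_errors` and the numbers in the usage note are hypothetical, not taken from the question's data.

```python
def forecast_errors(actual, forecast):
    """Return (ME, MAE, MAPE), with each error defined as actual minus forecast."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    me = sum(errors) / n                       # mean error: signed, reveals bias
    mae = sum(abs(e) for e in errors) / n      # mean absolute error
    # mean absolute percentage error, in percent (actual values must be non-zero)
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return me, mae, mape
```

For example, `forecast_errors([100.0, 200.0], [110.0, 190.0])` gives an ME of zero because the two errors cancel, while the MAE and MAPE remain positive, which is why the three measures are usually read together.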


Part 1 DESCRIPTIVE STATISTICS (a) For per capita income (variable name is "PerCapitaInc") COMPUTE and INTERPRET the following [Hint: you will need to specify the 'detail' option after the summarize command (see the cheat sheet)]: (i) Sample mean (ii) Sample standard deviation (iii) Sample skewness (iv) Sample kurtosis and (v) Sample size. (vi) Standardize "PerCapitaInc" and find the mean and variance. You do this by using the extended generate command (egen) in STATA in combination with the 'std' function. Are the values what you expected? (b) Use your printout from (a) to construct a 99% confidence interval for the population mean of per capita income. Construct the relevant confidence intervals by hand (using your calculator) from those results, and type out that work for credit. (c) Do the same thing for the 95% confidence interval. What is the point of calculating these confidence intervals? (d) Construct a scatterplot of the county's unemployment rate in 2013 (UnempRate2013) on per capita income (PerCapitaInc) and present this graph in your answer. (Note that you may need to scroll down the variable list to find the unemployment rate for the year 2013. Exporting the graphs from STATA is easy: simply follow the steps from class notes!) Does there appear to be a relationship between the variables and, if so, is it an intuitive relationship? Explain your answer. (e) Construct a scatterplot of the "Metro2013" variable (which is a dummy variable for metro status versus non-metro in 2013) on per capita income (PerCapitaInc) and present this graph in your answer. Does there appear to be a relationship between these variables and, if so, is it an intuitive relationship? Explain your answer. (f) Now, restrict to the subsample of counties that are metro areas (hint: those with "Metro2013==1"). COMPUTE and INTERPRET the: (i) Sample mean (ii) Sample standard deviation for both PerCapitaInc and UnempRate2013. (iii) Sample size.
(g) Now, restrict to the subsample of counties that are non-metro (hint: those with "Metro2013==0"). COMPUTE and INTERPRET the statistics similar to those calculated in (f). (h) Construct a 90% confidence interval for the difference between the means of per capita income across metro and non-metro counties. Note that you must use the difference-in-means version of the standard error in order to compute this properly. (i) Explain whether or not there is evidence to support the hypothesis that there is a statistically significant difference between the means of per capita income for the two groups based on metro/non-metro status. Relate your answer to your findings in (h).
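The difference-in-means interval in (h) can be sketched as follows. This is a hedged illustration, not the course's required method: it assumes independent samples, the unequal-variance standard error sqrt(s1^2/n1 + s0^2/n0), and a large-sample normal critical value. The function name `diff_means_ci` and all inputs are hypothetical placeholders for the group statistics found in (f) and (g).

```python
import math
from statistics import NormalDist


def diff_means_ci(mean1, sd1, n1, mean0, sd0, n0, level=0.90):
    """Large-sample confidence interval for (mean1 - mean0), two independent groups."""
    diff = mean1 - mean0
    # difference-in-means standard error (variances not pooled)
    se = math.sqrt(sd1 ** 2 / n1 + sd0 ** 2 / n0)
    # two-sided normal critical value, e.g. about 1.645 for a 90% interval
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return diff - z * se, diff + z * se
```

If the resulting interval excludes zero, the same data would reject equality of the two means at the matching significance level (10% for a 90% interval), which is the link that part (i) asks about.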


Q 1 Food expenditure and its determinants have been extensively researched in social science. We intend to estimate the link between food spending and some of its factors in this exercise. The data file food.xlsx contains 200 observations for the following variables from a cross-section of households. Foodexp: Weekly expenditure on food (excluding restaurants) in dollars. Income: Weekly household income in dollars. Children: Number of dependent children living in the household. Retired: A binary (0/1) indicator of whether the head of the household is retired (retired = 1). (a) Estimate the following two models and present your summary report for both models. What do you conclude about the fit of the two models? [4 marks] (1) $Foodexp_i = \alpha_1 + \alpha_2 Income_i + e_i$; (2) $Foodexp_i = \gamma_1 + \gamma_2 \log(Income_i) + u_i$ (b) Now estimate the following regression model and answer all the following questions. $Foodexp_i = \beta_0 + \beta_1 \log(Income_i) + \beta_2 Children_i + \beta_3 Retired_i + \varepsilon_i$ Estimate the model using Gretl and provide the summary results (Gretl: Model → Ordinary Least Squares, then select "Foodexp" as the dependent variable and "log(Income)", "Children" and "Retired" as regressors → OK). (The summary results should include the fitted equation with coefficients, standard errors, t-statistics, p-values, sample size, F-statistic and R-squared.) [4 marks] (c) Do the signs of the slope coefficients agree with your expectations? Comment. [4 marks] (d) Comment on the statistical significance of the estimates for the variables Income, Children and Retired at the 5% significance level. (No need to carry out hypothesis tests.) [4 marks] (e) Test the overall validity of the regression model at the 5% significance level. State the hypotheses, the restricted and unrestricted models, the test statistic and its distribution when the null hypothesis is true, the critical value and your conclusion. [4 marks] (f) Construct a 95% confidence interval for $\beta_1$, the slope of the log(Income) variable, and interpret your results.
[4 marks] (g) Based on your answer in part (f), without performing a hypothesis test, would you reject the hypothesis $H_0: \beta_1 = 90$ against $H_1: \beta_1 \neq 90$? Clearly state your conclusion. [4 marks] (h) Graph the least squares residuals against log(Income) and describe the pattern. Do you find any evidence of a violation of any of the multiple regression assumptions? Explain. [4 marks] (i) Test for the existence of heteroscedasticity at the 5% significance level. Use White's test (squares only) and attach your Gretl results. Clearly state all steps in your test: the null and alternative hypotheses, the auxiliary regression and the test statistic, the critical value, your decision and the conclusion. [4 marks] (j) Based on your findings in part (i), is the model in part (b) valid? How would you rectify the problem? Attach your Gretl output. Compare your results with the output in part (b). Comment. [2 marks] (k) Now run the following regression model: $Foodexp_i = \alpha_1 + \alpha_2 Income_i + \alpha_3 Children_i + \alpha_4 Retired_i + e_i$ Compare this model with that in part (a). Which model would you choose, and why? [8 marks]


Part 2 REGRESSION ANALYSIS (a) Run a regression to determine the impact of the 2013 unemployment rate (UnempRate2013) on the per capita income (PerCapitaInc) in a county. What is the estimated slope? Explain what this number means in words in terms of the unemployment rate and in terms of per capita income. Also indicate if the relationship is statistically significant at the 10%, 5%, and 1% levels. For this first pass, use homoskedastic standard errors. (b) Re-run the regression from part (a) but this time use heteroskedastic standard errors. Are your coefficients the same as in part (a)? Why? Are your standard errors (of your betas) the same as in part (a)? Why? (c) Run the same regression as in part (b) but now also include the following additional regressors: percentage of the population that is college-educated (Ed5CollegePlusPct), percentage of the population that is black (BlackNonHispanicPct2010), and percentage of the population that is Hispanic (HispanicPct2010). Now, what is the estimated impact of the unemployment rate in 2013 on per capita income? Also indicate if the relationship is statistically significant at the 10%, 5%, and 1% levels. Make sure that you are using heteroskedastic standard errors. (d) Provide economic/econometric intuition as to why the estimated impact of the unemployment rate on per capita income changed between parts (b) and (c). Note that I am asking you to think about the context (and hence the "story" behind these data). (e) Construct a 95% confidence interval for the slope coefficient on UnempRate2013 found in Part 2(c). Write out your calculations. Clearly indicate how this confidence interval relates to whether UnempRate2013 is statistically significant or not in this context by relating your answer to your constructed confidence interval. (f) You recall from Part 1 that both the means of per capita income and of unemployment rate in 2013 are quite different across metro and non-metro areas.
You therefore want to explore this in more detail. Run the regression from Part 2(c) using only metro areas in 2013 (i.e., Metro2013==1). [Hint: You need to restrict the data based on a criterion before running the regression.] Now, what is the estimated effect of the 2013 unemployment rate on per capita income? Also indicate if the relationship is statistically significant at the 10%, 5%, and 1% levels. Make sure that you are using heteroskedastic standard errors. (g) Now, run the regression from Part 2(c) using only non-metro areas in 2013 (Metro2013==0). [Hint: You need to restrict the data based on a criterion before running the regression.] Now, what is the estimated effect of the 2013 unemployment rate on per capita income? Also indicate if the relationship is statistically significant at the 10%, 5%, and 1% levels. Make sure that you are using heteroskedastic standard errors. (h) What did you learn from the comparison between results in parts (f) and (g)? Explain your answer. Note that I again am asking you to think about the context (and hence the "story" behind these data). (i) Return to the full sample. Now, run a regression to determine the impact of changing the percentage of the population which is college educated (Ed5CollegePlusPct) on the per capita income (PerCapitaInc) in a county. Include controls for the unemployment rate in 2013 (UnempRate2013), percentage of the population that is black (BlackNonHispanicPct2010), percentage of the population that is Hispanic (HispanicPct2010) and now also include a dummy variable for metro status (Metro2013). Now, what is the estimated impact of percentage with a college education on per capita income? Also indicate if the relationship is statistically significant at the 10%, 5%, and 1% levels. Make sure that you are using heteroskedastic standard errors. (j) It is quite common in econometrics to model income variables nonlinearly.
Construct a new variable and call it "loginc" or whatever you prefer, where loginc = ln(PerCapitaInc). Provide summary statistics for this new variable. (Hint: Think back to how you constructed summary statistics in Part 1.) (k) Now run a regression model with loginc as the dependent variable (and we are also going to start controlling for metro status in addition to the other controls). In other words, use the unemployment rate in 2013 (UnempRate2013) as the main regressor, while also including the other regressors: percentage college educated (Ed5CollegePlusPct), percentage non-Hispanic black in 2010 (BlackNonHispanicPct2010), percentage Hispanic in 2010 (HispanicPct2010), and metro status in 2013 (Metro2013). Now, what is the estimated effect of UnempRate2013 in words? Also indicate if the relationship is statistically significant at the 10%, 5%, and 1% levels. Make sure that you are using heteroskedastic standard errors. [Careful not to leave out any variables in your regression specification in STATA.] (l) What is the null hypothesis corresponding to the F-statistic as reported in the output for the regression in part (k)? What is the conclusion of the reported F-test? Explain (i.e., do you reject or fail to reject the stated null hypothesis above and how do you know this?). (m) Construct a 95% confidence interval for the slope coefficient on UnempRate2013 in Part 2(k). As usual, write out your calculations. Clearly indicate how this confidence interval relates to whether UnempRate2013 is statistically significant or not in this context by relating your answer to your constructed confidence interval. (n) Discuss what the standard error of the regression (SER), R-squared and adjusted R-squared in part (k) are telling you in terms of the numbers that you have found. Using what you know about the difference between the two formulas, explain specifically why the R² and adjusted R² statistics are so similar in this case.
(o) Use an F-test to test the joint significance of the additional regressors: Ed5CollegePlusPct, BlackNonHispanicPct2010, HispanicPct2010, and Metro2013. Find this test statistic and clearly indicate the conclusions of the test. (p) If you had more time to study this question and/or more or different data, what would you suggest doing next? Propose additional variables to add and/or different specifications to try and give specific reasons why you are suggesting these. Answers will vary for this part of the problem.


THE REGRESSION ANALYSIS 1 The Simple Regression Model [30 pts]

1.1 Regression to the Origin [6 pts] Consider the estimation of the following model using a random sample $\{(y_i, x_i): i = 1, 2, \ldots, n\}$, where $\beta_1$ is estimated using ordinary least squares (OLS). a. Derive $\hat{\beta}_1$ using OLS. [1 pt] b. Show under the necessary assumptions that $\hat{\beta}_1$ is unbiased if the population regression function is $y_i = \beta_1 x_i + u_i$. [2 pts] c. Show, under the same assumptions used above, that $\hat{\beta}_1$ is biased if the population regression function is $y_i = \beta_0 + \beta_1 x_i + u_i$, with $\beta_0 \neq 0$. [1 pt] d. Show that

1.2 Regression to a Constant [4 pts] Consider the estimation of the following model using a random sample $\{y_i: i = 1, 2, \ldots, n\}$: $y_i = \hat{\beta}_0 + \hat{u}_i$. a. Derive $\hat{\beta}_0$ using OLS. [2 pts] b. Provide an interpretation of $\hat{\beta}_0$. [2 pts]

1.3 Regression on a Binary Explanatory Variable and Average Treatment Effect [20 pts] Let $y$ be any response variable and $x$ a binary explanatory variable. Let $\{(x_i, y_i): i = 1, 2, \ldots, n\}$ be a sample of size $n$. Let $n_0$ be the number of observations with $x_i = 0$ and $n_1$ the number of observations with $x_i = 1$. Let $\bar{y}_0$ be the average of the $y_i$ with $x_i = 0$, and $\bar{y}_1$ the average with $x_i = 1$. a. Explain why we can write b. Argue that c. Show that the average of $y_i$ in the entire sample, $\bar{y}$, can be written as a weighted average: $\bar{y} = (1 - \bar{x})\bar{y}_0 + \bar{x}\bar{y}_1$. d. Show that when $x$ is binary, e. Show that f. Use parts d. and e. to show g. Show, also,


4 Empirical Application [25 pts] To answer this question you are required to use the statistical software Stata. Make sure to create a do file with your code, an automated log file of your answers from that code, and write down your answers in a separate document. You are required to submit all three files (i.e., do file, log file, and Word document) before the due date. Load the same dataset nbasal using the bcuse command.

4.1 Simple versus multiple regression model [15 pts] Estimate coefficients $\beta_0$ and $\beta_1$ using the OLS estimator of a model that relates the marriage status ($marr_i$) with the annual salary in thousands of USD ($wage_i$): i. What is the value of $\hat{\beta}_0$? How do you (quantitatively) interpret this value? [2 pts] ii. What is the value of $\hat{\beta}_1$? How do you (quantitatively) interpret this value? [2 pts] iii. Answer i. and ii. defining the dependent variable as $\log(wage_i)$. [3 pts] iv. From an economic point of view, is $\hat{\beta}_1$ capturing a causal effect of x on y or purely a correlation? Why? [4 pts] v. Run the following regression,

4.2 Multiple regression model [10 pts] Estimate coefficients $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$ using the OLS estimator of a model that relates the position of the player ($guard_i$, $center_i$, and $forward_i$) with the annual salary in thousands of USD ($wage_i$): $wage_i = \beta_0 + \beta_1 guard_i + \beta_2 center_i + \beta_3 forward_i + u_i$. i. What is the value of $\hat{\beta}_1$? How do you explain this result? Which assumption are we violating in this model? [5 pts] ii. How can you solve the issue in i.? How can you find the average (predicted) wage of a guard, a center, and a forward? [5 pts]


Estimation of Energy Demand Function i- Calculate gasoline per capita consumption (GPC), draw a line graph and interpret the trend. Economic theory suggests the following demand function for gasoline: GPC = f(PG, Y, PNC, PUC), with a regression model in log form as follows (equation 1): log GPC = log A + B1 log PG + B2 log Y + B3 log PNC + B4 log PUC ii- Determine the signs of the regression coefficients using economic theory and provide reasoning. iii- Run a simple regression model (using equation 1 above), interpret the regression coefficients and elasticities, and provide reasoning for each coefficient. Are the regression coefficients consistent with economic theory? If not, what could be the reason? Consult some academic literature. iv- Now create the previous year's gasoline consumption (LPrevGPC) as an independent variable and re-run the regression model (equation 2) as follows: log GPC = log A + B1 log PG + B2 log Y + B3 log PNC + B4 log PUC + B5 log PrevGPC, where the expression B1/(1 − B5) is the long-run price elasticity of demand. Calculate this long-run elasticity of demand and interpret the number by comparing it with the short-run price elasticity. What conclusion would you draw from these estimates? v- What recommendations would you make, as an analyst, to an energy firm operating in the USA in relation to pricing and disposable income in particular?
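The long-run elasticity B1/(1 − B5) from equation 2 is simple arithmetic once the coefficients are estimated. The sketch below only illustrates that formula; the function name `long_run_elasticity` and the coefficient values in the usage note are hypothetical and are not estimates from the gasoline data.

```python
def long_run_elasticity(b1, b5):
    """Long-run price elasticity B1 / (1 - B5) in a partial-adjustment log model.

    b1: short-run price elasticity (coefficient on log PG)
    b5: coefficient on lagged log consumption (log PrevGPC)
    """
    if not -1.0 < b5 < 1.0:
        # outside this range the adjustment process does not settle down,
        # so the long-run multiplier is not meaningful
        raise ValueError("b5 must lie strictly between -1 and 1")
    return b1 / (1.0 - b5)
```

With the made-up values b1 = -0.1 and b5 = 0.8, the call `long_run_elasticity(-0.1, 0.8)` returns roughly -0.5, five times the short-run response, which illustrates how a large coefficient on lagged consumption makes long-run demand more price-elastic than short-run demand.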


Exercise 4 ACF/PACF of AR/MA/ARMA models You can simulate trajectories from ARIMA models with arima.sim in R, and you can produce correlograms with the acf and pacf functions. In this exercise, you will look at correlograms and try to identify (as best as you can) whether they might correspond to AR, to MA or to full ARMA models.


Exercise 3 Random walk with drift Consider the model $X_t = \delta + X_{t-1} + W_t$, where the $W_t$ are independent Normal(0, 1) variables, for all $t > 1$, and assume that $X_1 = 0$.
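A trajectory of this model can be simulated directly from the recursion. The following is a minimal sketch using only the Python standard library; the function name `simulate_random_walk`, the seed argument and the `sigma` parameter (set to 1 to match the Normal(0, 1) shocks) are illustrative assumptions, not part of the exercise.

```python
import random


def simulate_random_walk(n, delta, sigma=1.0, seed=None):
    """Simulate X_1, ..., X_n with X_1 = 0 and X_t = delta + X_{t-1} + W_t for t > 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)          # local generator so runs are reproducible
    path = [0.0]                       # initial condition X_1 = 0
    for _ in range(n - 1):
        shock = rng.gauss(0.0, sigma)  # W_t drawn independently from Normal(0, sigma^2)
        path.append(delta + path[-1] + shock)
    return path
```

Setting `sigma=0.0` removes the noise and leaves only the deterministic drift line delta * (t - 1), which is a quick way to see that the drift shifts the mean of X_t linearly in t while the accumulated shocks make its variance grow with t.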


Which variable in Question 2 would be considered the dependent variable? Options: Theft; Cameras


For the following model, 95% of the time the slope will fall between what values?


According to the correlation table, who is the best candidate predictor for Sales?


Based on the model below, which of the following is an example of data extrapolation?


In the model in Question 11, (1) what is the Standard Error of the Estimate? (2) What is the 95% Confidence Interval of the errors? (3) How do you interpret the 95% Confidence Interval of the errors? (4) We get the Lower/Upper 95% Indiv Performance values as follows. How do you interpret this range when Cameras = 7?


1. For the linear model $Y = X\beta + \varepsilon$, define $X$ and $\beta$ as: (a) Find the least squares estimator of $\beta$. (b) Find the mean, variance, and distribution of the least squares estimator from part (a). (c) Find the estimator of $\sigma^2$. (d) Give the formula for the $100(1-\alpha)\%$ confidence interval for $\beta_1$.


2. For the linear model $Y = X\beta + \varepsilon$, define $Y$, $X$, $\beta$, and $\varepsilon$. (a) Find the least squares estimator of $\beta$. (b) Find the mean, variance, and distribution of your least squares estimator from part (a). (c) Find the estimator of $\sigma^2$. (d) Find the formulas for SST, SSM, and SSE for this particular model. (e) Give the ANOVA table for this particular model. (f) Determine the appropriate null and alternative hypotheses for this particular model using the F-test statistic from the ANOVA table.


Question 1 (5 marks) A boutique beer brewery produces two types of beer daily, Dark-ale and Light-ale, with a total cost function TC = 3QD + QD × QL + 4QL, where QD is the quantity of the Dark-ale beer (in kegs) and QL is the quantity of the Light-ale beer (in kegs). The prices that can be charged are determined by supply and demand forces and are influenced by the quantities of each type of beer according to the following equations: PD = 32 QD + QL for the price (in dollars per keg) of the Dark-ale beer and PL = 42+2QD - -QL for the price (in dollars per keg) of the Light-ale beer. The total revenue is given by the equation TR = PD × QD + PL × QL and the profit by the equation Profit = TR - TC. First, use a substitution of the price variables to express the profit in terms of QD and QL only. Using the method of Lagrange multipliers, find the maximum profit when total production (quantity) is restricted to 192 kegs. Note that QD or QL need not be whole numbers.

Question 2 (5 marks) A farmer discovers that his land has been targeted as a chemical dumping ground with a chemical that is dangerous for growing any crops. It is known that the chemical concentration decays according to the exponential decay process. At the time of discovery, the concentration of the chemical was 15% of the original. One week later, the chemical content had reduced to 14%. The police have two suspects, who were both in prison for 15 weeks each at different times for other offences, providing them with alibis (proof of innocence). Suspect A served his sentence ending 35 weeks before the time of the discovery and Suspect B was released from prison 40 weeks before the time of the discovery. Use the exponential decay model to determine whether any of the suspects are innocent.
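For the exponential decay question, the key calculation is the time elapsed between the dumping and the discovery. The sketch below assumes the standard model C(t) = C0 · exp(−k·t) with t measured in weeks; the function name `weeks_since_dumping` is hypothetical. It shows only how the decay rate and the elapsed time follow from the two measurements; comparing that time against each suspect's prison dates is left to the reader.

```python
import math


def weeks_since_dumping(share_at_discovery, share_one_week_later):
    """Weeks between dumping and discovery under C(t) = C0 * exp(-k * t).

    Both arguments are fractions of the original concentration,
    e.g. 0.15 for 15% and 0.14 for 14%.
    """
    # one week of decay: share_one_week_later = share_at_discovery * exp(-k)
    k = math.log(share_at_discovery / share_one_week_later)
    # at discovery: share_at_discovery = exp(-k * t)  =>  t = ln(1 / share) / k
    return math.log(1.0 / share_at_discovery) / k
```

Calling `weeks_since_dumping(0.15, 0.14)` gives the estimated number of weeks before discovery at which the chemical was dumped, which is then compared with the 15-week windows during which each suspect was in prison.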


2. In Table 2 you find estimation results of wage regressions using data from Austria from the years 2004 until 2006. Five specifications are estimated. The dependent variable is the logarithm of the hourly wage. The first specification includes a dummy that is one for females and zero for males as well as year dummies; then years of education, years of experience and years of experience squared, dummy variables for occupations, and dummy variables for industries are sequentially added. i) Write down the regression model for the fifth specification. Explain why we add the various variables to the regression model. Describe and explain the estimated coefficients of female in all specifications. Are the coefficients significantly different from zero? Why? What does the estimated coefficient of female mean? Why does the estimated coefficient of female change when adding additional explanatory variables? What are the reasons why the coefficient becomes smaller in absolute terms when adding additional explanatory variables? Comment on the estimated coefficients for education and experience in specification (5). Are the coefficients significantly different from zero? Why? What do the estimated coefficients of education and experience mean? Comment on the R-squared in all specifications.


2. Consider the panel regression equation $Y_{it} = \beta X_{it} + \alpha_i + \lambda_t + u_{it}$. [7 marks] a) Describe in words what $\alpha_i$ represents. b) Describe in words what $\lambda_t$ represents. c) Suppose there are only two time periods, i.e., T=2. Describe three ways to estimate this regression in Stata that will provide the same estimate of $\beta$. Please provide your code for each way. Assume the following variable names: y for the dependent variable, x for the explanatory variable, id for the entity, and time for the time. [3 marks] d) Continue to assume that T=2. Would you expect the R² to be the same across the three methods you described in 2c)? Does this imply anything about which method is preferred? Explain your answer. [3 marks]


The reference mode of the US GDP is shown in Figure 1. 1. Identify the variables that might be responsible for the growth of the US GDP (2 points) 2. Sketch a causal loop diagram to capture the behavior you identify (2.5 points) 3. Identify potential negative feedback that might halt growth in the system (0.5 point)


5. In the following table, enter the values for the entity demeaned values of the variables. [2 marks]


7. Suppose you are interested in estimating the causal effects of community service on earnings. You want to estimate the following regression: $\ln(earnings_i) = \beta_0 + \beta_1 CommSer_i + u_i$, where $CommSer_i$ is an indicator variable that takes the value 1 if the individual does community service and 0 otherwise. [5 marks] a) Consider each of the five threats to internal validity listed in section 9.2 of the textbook. Critically discuss each of those five threats in this context. [2.5 marks] b) Suppose there is a lottery where the numbers 1 through 100 are randomly drawn for each individual in the population. Individuals with low numbers are told they will have to do community service unless they either make a charitable donation or pursue a graduate degree, while individuals with high numbers are told they will not have to do community service. Is this randomly assigned lottery number a valid instrument for CommSer? [2.5 marks]


9. This question is about the paper "The long-run impact of bombing Vietnam" by Edward Miguel and Gerard Roland. You can find a copy of the paper on MyLS under Content -> Final Exam. [4 marks] a) What evidence and/or discussion do the authors provide about the relevance of their instrumental variable? [1 mark] b) What evidence and/or discussion do the authors provide about the exogeneity of their instrumental variable? [1 mark] c) Compare columns (2) and (6) in Table 4. How does the coefficient on the variable of interest change across the two specifications? Is this what you would expect if the instrument was valid? [2 marks] d) Replicate the regression in column (6) of Table 4. Note that I could not perfectly replicate their reported standard error. I don't know why, but what I generated was extremely close with the same coefficients. Use the dataset "war_data_district.dta" available on MyLS under Content -> Final Exam. [1 mark]

