Question

Instructions

You will be using the Real Estate data set to build a model to predict what a house should sell for. This model will be used by a real estate agency to help their clients understand what their house should sell for so they can make an educated decision about listing price. Secondarily, the model will be used by a home contractor. S/he would like to be able to tell clients the selling value of adding an additional bathroom.

Part 1 of the project involves the first three steps in the data mining process: sample, explore and modify. You will be preparing the data for model building, which will be done in Part 2 of the project next week. You will need to make decisions regarding data that is in text form, missing data, potentially incorrect data, the inclusion of potential outliers, binning strategy and variable transformation. Please make sure your decisions are justified. Note that the specific requirements and relative weights are outlined in the rubric.

Compute descriptive statistics on all of the continuous variables. Briefly discuss. Compute the frequencies of ALL of the categorical variables. Briefly discuss. Run a correlation table. Discuss at least 3 correlations. The descriptive statistics, frequencies and correlation table should be professionally presented on separate, labeled tabs/worksheets.

There is quite a bit of discussion in this assignment. Rather than putting the discussion in Excel, it is preferred that you prepare this assignment as a Word (or PDF) document and include the relevant Excel output as figures. You are required to submit both the Word/PDF file and your Excel file. Please note that only the Word/PDF file will be graded, so it should contain all the requirements. I am asking you to submit the Excel file in case I have a question about your work; it will only be opened if necessary.

Different people will make different decisions which may ultimately impact the model they develop. While there are wrong things you could do (like using a 0 for all missing values), there is not one "right" answer. Make sure you document and justify the decisions you make. It is fine (perhaps even ideal) to note decisions that were made because this is an academic project. For example: "Given the appropriate resources, I would have ___ to get the missing values for ___. Lacking the resources for this option, I elected to ___, recognizing that this decision ___."

I suspect there will be lots of questions this week as you start working independently with the data. Just a reminder that questions can be posted in the "Ask the Instructor" Discussion Forum so that other classmates can participate and will also have the benefit of our discussion.
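The assignment expects the exploration to be done in Excel, but the three required outputs (descriptive statistics, frequencies, and a correlation) can also be sketched in a few lines of Python for cross-checking. The example below is only an illustration: the rows and the column names `price`, `sqft`, `bathrooms`, and `style` are made up, and the real data set will differ. Missing values are skipped rather than replaced with 0.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, stdev

# Hypothetical toy rows; the real data set's columns and values will differ.
rows = [
    {"price": 200000, "sqft": 1000, "bathrooms": 1, "style": "Ranch"},
    {"price": 290000, "sqft": 1500, "bathrooms": 2, "style": "Colonial"},
    {"price": 410000, "sqft": 2000, "bathrooms": 2, "style": "Colonial"},
    {"price": 500000, "sqft": 2500, "bathrooms": 3, "style": "Ranch"},
    {"price": 310000, "sqft": None, "bathrooms": 2, "style": "Colonial"},
]

def describe(values):
    """Descriptive statistics for one continuous column, ignoring missing values."""
    present = [v for v in values if v is not None]
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "mean": mean(present),
        "median": median(present),
        "stdev": stdev(present),
        "min": min(present),
        "max": max(present),
    }

def pearson(xs, ys):
    """Pearson correlation using only the rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

sqft = [r["sqft"] for r in rows]
price = [r["price"] for r in rows]

sqft_stats = describe(sqft)                       # continuous variable
style_counts = Counter(r["style"] for r in rows)  # categorical variable
r_sqft_price = pearson(sqft, price)               # one cell of a correlation table

print(sqft_stats)
print(style_counts)
print(round(r_sqft_price, 3))
```

Reporting the count of missing values alongside each statistic makes it easier to justify whatever imputation or exclusion decision is made later.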


Most Viewed Questions Of Real Estate Planning

Consider the following data: Step 1. Find the expected value E(X). Round your answer to one decimal place. Step 2. Find the variance. Round your answer to one decimal place. Step 3. Find the standard deviation. Round your answer to one decimal place. Step 4. Find the value of P(X ≥ 9). Round your answer to one decimal place. Step 5. Find the value of P(X ≤ 7). Round your answer to one decimal place.


Suppose the mean income of firms in the industry for a year is 80 million dollars with a standard deviation of 3 million dollars. If incomes for the industry are distributed normally, what is the probability that a randomly selected firm will earn between 83 and 85 million dollars? (Round your answer to 4 decimal places)
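One way to approach this is to write the probability as a difference of two normal CDF values, P(83 < X < 85) = Φ((85 − 80)/3) − Φ((83 − 80)/3). The sketch below evaluates that with the standard library's error function; the helper name `normal_cdf` is just an illustrative choice.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma**2), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

mu, sigma = 80.0, 3.0  # mean and standard deviation in millions of dollars

# Probability that income falls between 83 and 85 million dollars.
prob = normal_cdf(85.0, mu, sigma) - normal_cdf(83.0, mu, sigma)
print(round(prob, 4))
```

The result is roughly 0.11, which matches the z-table approach with z-scores of 1 and about 1.67.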


A zero coupon bond with a face value of $1,000 is issued with an initial price of $430.84 based on semiannual compounding. The bond matures in 20 years. What is the implicit interest, in dollars, for the first year of the bond's life? a. $19.08 b. $25.25 c. $21.47 d. $18.53 e. $22.56
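A possible line of attack: back out the semiannual yield from the issue price, grow the price for two half-year periods, and take the difference as the first year's implicit interest. The sketch below follows that reasoning; the variable names are illustrative.

```python
face_value = 1000.00
issue_price = 430.84
years_to_maturity = 20
periods = 2 * years_to_maturity  # semiannual compounding gives 40 periods

# Semiannual yield r solves issue_price * (1 + r) ** periods == face_value.
semiannual_yield = (face_value / issue_price) ** (1 / periods) - 1

# Bond value after one year, i.e. two semiannual periods of accrual.
value_after_one_year = issue_price * (1 + semiannual_yield) ** 2

# Implicit (accrued) interest for the first year.
first_year_interest = value_after_one_year - issue_price
print(round(semiannual_yield, 5), round(first_year_interest, 2))
```

Under these assumptions the first-year implicit interest comes out near $18.53.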


2. Consider the following gambler's ruin problem. A gambler bets $1 on each play of a game. Each time, he has a probability p of winning and probability q = 1 - p of losing the dollar bet. He will continue to play until he goes broke or nets a fortune of T dollars. Let X_n denote the number of dollars possessed by the gambler after the n-th play of the game. Then

\[
X_{n+1}=\begin{cases} X_{n}+1 & \text{with probability } p \\ X_{n}-1 & \text{with probability } 1-p \end{cases} \quad \text{for } 0<X_{n}<T,
\]

\[
X_{n+1}=X_{n} \quad \text{for } X_{n}=0 \text{ or } X_{n}=T.
\]

Defined in such a way, X_n is a Markov chain. The gambler starts with X_0 dollars, where 0 < X_0 < T. (a) Construct the (one-step) transition matrix. (b) Let T = 3 and p = 0.55. Find the probabilities of winning T dollars when the initial capital of the gambler is 1, ..., T - 1 dollars.
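As a rough sketch of both parts, the code below builds the one-step transition matrix with absorbing states 0 and T, and evaluates the standard closed-form probability of reaching T before 0 from a starting capital i, which for p ≠ q is (1 − (q/p)^i) / (1 − (q/p)^T). The function names are illustrative.

```python
def transition_matrix(T, p):
    """One-step transition matrix on states 0..T; states 0 and T are absorbing."""
    q = 1 - p
    P = [[0.0] * (T + 1) for _ in range(T + 1)]
    P[0][0] = 1.0  # broke: stay broke
    P[T][T] = 1.0  # reached the target: stop playing
    for i in range(1, T):
        P[i][i + 1] = p  # win one dollar
        P[i][i - 1] = q  # lose one dollar
    return P

def win_probability(i, T, p):
    """Probability of reaching T before 0 when starting with i dollars."""
    q = 1 - p
    if p == q:
        return i / T
    r = q / p
    return (1 - r ** i) / (1 - r ** T)

T, p = 3, 0.55
P = transition_matrix(T, p)
for row in P:
    print(row)
for start in range(1, T):
    print(start, round(win_probability(start, T, p), 4))
```

A quick consistency check is the first-step equation h(1) = p·h(2) + q·h(0) with h(0) = 0, which the closed form satisfies.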


1. In any one-minute interval, the number of requests for a popular Web page is a Poisson random variable with expected value 300 requests. If the number of requests in a one minute interval is greater than n, where n is the capacity of the Web server, the server is overloaded. Use the central limit theorem to estimate the smallest value of n for which the probability of overload is less than 0.05.
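A minimal sketch of the normal approximation, assuming no continuity correction: a Poisson variable with mean 300 also has variance 300, so P(X > n) is approximated by 1 − Φ((n − 300)/√300), and n must exceed 300 + z·√300 where z is the 95th percentile of the standard normal. Applying a continuity correction can shift the answer by one, so check which convention the course expects.

```python
from math import floor, sqrt
from statistics import NormalDist

mean_requests = 300.0
std_requests = sqrt(mean_requests)  # Poisson: variance equals the mean

# z such that P(Z > z) = 0.05 for a standard normal Z.
z = NormalDist().inv_cdf(0.95)

# P(X > n) < 0.05 requires n > threshold under the plain normal approximation.
threshold = mean_requests + z * std_requests
smallest_n = floor(threshold) + 1
print(round(threshold, 2), smallest_n)
```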


The amount of time a bank teller spends with each customer has a population mean of 3.10 minutes and a standard deviation of 0.40 minute. If you select a random sample of 16 customers, 1. What is the probability that the mean time spent per customer is at most 3 minutes? 2. There is an 80% chance that the sample mean is less than how many minutes? 3. What assumption must you make in order to solve (Task 1) and (Task 2)? 4. If you select a random sample of 60 customers, there is a 95% chance that the sample mean is less than how many minutes?
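The numeric parts can be sketched with the sampling distribution of the mean, which has standard error σ/√n. The example below assumes the sample mean is (approximately) normal, which is the assumption Task 3 asks about for n = 16; the helper name `sample_mean_dist` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 3.10, 0.40  # population mean and standard deviation, in minutes

def sample_mean_dist(n):
    """Approximate sampling distribution of the mean for a sample of size n."""
    return NormalDist(mu, sigma / sqrt(n))

# Task 1: P(sample mean <= 3) with n = 16.
p_at_most_3 = sample_mean_dist(16).cdf(3.0)

# Task 2: value below which the sample mean falls 80% of the time, n = 16.
q80_n16 = sample_mean_dist(16).inv_cdf(0.80)

# Task 4: value below which the sample mean falls 95% of the time, n = 60.
q95_n60 = sample_mean_dist(60).inv_cdf(0.95)

print(round(p_at_most_3, 4), round(q80_n16, 3), round(q95_n60, 3))
```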


The state education commission wants to estimate the fraction of tenth grade students that have reading skills at or below the eighth grade level. Step 1. Suppose a sample of 2089 tenth graders is drawn. Of the students sampled, 1734 read above the eighth grade level. Using the data, estimate the proportion of tenth graders reading at or below the eighth grade level. (Write your answer as a fraction or a decimal number rounded to 3 decimal places) Step 2. Suppose a sample of 2089 tenth graders is drawn. Of the students sampled, 1734 read above the eighth grade level. Using the data, construct the 98% confidence interval for the population proportion of tenth graders reading at or below the eighth grade level. (Round your answers to 3 decimal places)
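A sketch of both steps, assuming the usual large-sample (Wald) interval p̂ ± z·√(p̂(1 − p̂)/n) is what the exercise intends. Note that the count given is for students reading above the level, so the complement is needed first.

```python
from math import sqrt
from statistics import NormalDist

n = 2089
above_level = 1734
at_or_below = n - above_level  # students at or below the eighth grade level

# Step 1: point estimate of the proportion at or below the level.
p_hat = at_or_below / n

# Step 2: 98% Wald confidence interval (an assumption about the intended method).
confidence = 0.98
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(round(p_hat, 3), round(lower, 3), round(upper, 3))
```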


Abdullah, the manager of a computer shop, wants to maximize the number of computers sold per month. He can hire an unskilled worker for 4,000 SAR a month and a skilled worker for 6,000 SAR a month, without exceeding his monthly budget of 32,000 SAR. The following table shows how the total number of computers sold varies with the number of workers employed (unskilled and skilled): MBu = marginal benefit of unskilled labor; MBs = marginal benefit of skilled labor; Su = salary of unskilled labor; Ss = salary of skilled labor. 1. Fill in the blanks in the above table. 2. How many unskilled and skilled workers would you hire to maximize computer sales? 3. What is the maximum number of computers that would be sold per month? 4. State the equation that holds where computer sales are maximized and the level of each activity is equal with the other.


17. Abandonment Value We are examining a new project. We expect to sell 6,500 units per year at $43 net cash flow apiece for the next 10 years. In other words, the annual operating cash flow is projected to be $43 × 6,500 = $279,500. The relevant discount rate is 16 percent and the initial investment required is $980,000. a. What is the base-case NPV? b. After the first year, the project can be dismantled and sold for $810,000. If expected sales are revised based on the first year's performance, when would it make sense to abandon the investment? In other words, at what level of expected sales would it make sense to abandon the project? c. Explain how the $810,000 abandonment value can be viewed as the opportunity cost of keeping the project in one year.
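Parts a and b can be sketched with an annuity factor. The code below assumes cash flows arrive at year end, and that at the end of year 1 the choice is between taking $810,000 now or keeping the remaining nine years of cash flows; the function name `annuity_factor` is illustrative.

```python
def annuity_factor(rate, periods):
    """Present value of 1 per period received for `periods` periods."""
    return (1 - (1 + rate) ** -periods) / rate

cash_flow_per_unit = 43.0
units_per_year = 6500
rate = 0.16
initial_investment = 980_000.0
abandonment_value = 810_000.0

# Part a: base-case NPV over 10 years.
annual_cash_flow = cash_flow_per_unit * units_per_year
base_npv = annual_cash_flow * annuity_factor(rate, 10) - initial_investment

# Part b: at the end of year 1, nine years of cash flows remain. Abandon when
# their present value falls below the abandonment value.
breakeven_cash_flow = abandonment_value / annuity_factor(rate, 9)
breakeven_units = breakeven_cash_flow / cash_flow_per_unit

print(round(base_npv, 2), round(breakeven_units, 1))
```

Under these assumptions, abandoning makes sense if revised expected sales fall below roughly 4,089 units per year.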


Researchers often mark wildlife in order to identify particular individuals across time or space. A study of butterfly migration is designed to determine which location on the butterflies' wings is best for marking. The six possible locations are those shown as A through F in the figure below. The butterfly in the figure is a monarch (Danaus plexippus). Because marks in certain locations may be more likely to attract predators or cause problems than marks in other locations, the goal is to determine whether the six marking locations result in equivalent chances of successful migration. To test this, researchers plan to mark 3,600 butterflies and release them, then count how many arrive displaying each marking location at the end of the migratory path. 20. What type of butterfly is represented in the figure? 21. How many butterflies does the researcher plan to mark and release? 22. Why do the researchers need to mark butterflies in different locations? B. Describe location D on the butterfly. 4. How is location A different from location D? Why do researchers mark wildlife? 5. What is the goal of the study?