You need Excel, Analytic Solver, and R up and running on your computers. I uploaded a practice problem (see Quizzes on the left panel) to give you an idea of what kind of problems to expect. The questions will be open-ended to test your ability to synthesize different concepts. There will be three questions: one covers data visualization and descriptive multiple linear regression; one covers predictive multiple linear regression; and the last one covers predictive logistic regression and k-NN. For the first two questions, you may use either R or Analytic Solver. For the last question, you must use Analytic Solver.
For each problem, you must show your work; otherwise no credit will be given. For each problem, you must upload the following files (all files for each problem are mandatory):
If you used Analytic Solver, an Excel spreadsheet for each problem (call them ‘Problem1.xlsx’, ‘Problem2.xlsx’ and ‘Problem3.xlsx’) composed of as many sheets as needed to answer all the questions in that problem. Name each sheet after the question it answers (for example, Q1 for question 1 and Q2 for question 2 in Problem 1).
If you used R, compile the R script to an HTML file and submit both the script and the HTML file. Use descriptive names as explained above.
Question 1: Descriptive Multiple Linear Regression
You work at a consulting firm specializing in improving productivity at manufacturing facilities. Your director provided you with a data set that includes information on several manufacturing facilities. She asked you to determine what factors affect scrap rate at these facilities. Scrap rate is a proxy for production quality and output. A lower scrap rate is desirable. Analyze this data set by answering the following questions. You can use Excel, Analytic Solver or R.
a) Create appropriate visualizations to investigate how scrap rate is related to the number of employees at a plant, sales, whether the facility is unionized or not, and the number of different products manufactured at the facility. Explain the relationships according to your plots.
b) Find the mean, the standard deviation and the skewness of the scrap rate for the unionized and nonunionized facilities. Interpret these summary statistics by comparing them.
c) Find the pairwise correlations of the variables in your data set with scrap rate. Show all your work and interpret the correlations.
d) Build a regression model to explain the relationship of scrap rate with the various predictors in the data set. Interpret the regression coefficients and model fit in detail. Consider including quadratic terms as needed. Be as descriptive as possible. Show all your work.
e) You think that facilities with between 20 and 30 workers have a lower scrap rate than facilities with fewer than 20 or more than 30 workers. Can you test this? (Hint: Read the “Categorical Variables with Multiple Categories” section of your textbook on page 218.)
f) Using a boxplot of scrap rate, identify and remove all the outliers. Compare the average scrap rate before and after removing the outliers.
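As a rough illustration of the calculations behind parts b) and f), here is a minimal Python sketch. Python is used only for illustration; the exam itself calls for Excel, Analytic Solver or R. The scrap rates and the helper names `skewness` and `remove_boxplot_outliers` are hypothetical and are not part of the exam data. The skewness formula is the adjusted sample skewness (the one Excel's SKEW function uses), and the outlier rule is the common 1.5 × IQR boxplot convention. Different tools compute quartiles slightly differently, so the flagged points may not match your software exactly.

```python
import statistics


def skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)


def remove_boxplot_outliers(values):
    """Keep only points within 1.5 * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if low <= x <= high]


# Hypothetical scrap rates, invented for illustration only.
scrap = {
    "union": [2.0, 3.0, 4.0, 5.0, 6.0],
    "nonunion": [1.0, 2.0, 2.0, 3.0, 12.0],
}

# Part b): summary statistics by group.
for group, rates in scrap.items():
    print(group, statistics.fmean(rates), statistics.stdev(rates), skewness(rates))

# Part f): mean scrap rate before and after dropping boxplot outliers.
kept = remove_boxplot_outliers(scrap["nonunion"])
print(statistics.fmean(scrap["nonunion"]), statistics.fmean(kept))
```

A positive skewness means a long right tail (a few facilities with unusually high scrap rates), which is also where boxplot outliers tend to appear.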
Question 2:
You set out to build a predictive multiple linear regression model to predict the prices of houses listed on AirBnB. You have a data set that includes information on price per day and various predictors. You decided to use the holdout method. The partition column in the data set indicates which observations belong to the training sample (t) and which ones belong to the validation sample (v). You can use either Analytic Solver or R.
a) Using exhaustive search and adjusted R^2 to evaluate model fit, find a candidate model to use for prediction.
b) Using backward search and Mallows's Cp to evaluate model fit, find a candidate model to use for prediction.
c) Which of these two models is the best for predicting prices? Justify your answer. Report all your work clearly.
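The selection criteria named in this question reduce to short formulas. The Python sketch below is illustrative only (the exam itself calls for Analytic Solver or R), and every number and function name in it is a hypothetical assumption, not exam data. It shows adjusted R^2, Mallows's Cp, and a validation-sample RMSE, which is one common way to compare two candidate models under the holdout method.

```python
import math


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fit on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def mallows_cp(sse_subset, mse_full, n, p):
    """Mallows's Cp for a subset model with p predictors plus an intercept.

    sse_subset: error sum of squares of the subset model.
    mse_full:   mean squared error of the full model (all predictors).
    """
    return sse_subset / mse_full - n + 2 * (p + 1)


def rmse(actual, predicted):
    """Root mean squared error, e.g. on the validation sample."""
    squared_errors = [(a - b) ** 2 for a, b in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical numbers for illustration only.
print(adjusted_r2(r2=0.80, n=101, p=4))

# Sanity check: for the full model itself, Cp equals p + 1.
# Here n = 101 and p = 10, so MSE_full = SSE_full / (n - p - 1) = 900 / 90.
print(mallows_cp(sse_subset=900.0, mse_full=900.0 / 90, n=101, p=10))

# Validation RMSE for a hypothetical set of actual and predicted prices.
print(rmse([100.0, 150.0, 200.0], [110.0, 140.0, 200.0]))
```

Adjusted R^2 and Cp are computed on the training sample and guide the search. A comparison of predictive performance is usually made on the validation sample, where a lower RMSE indicates better out-of-sample predictions.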
Question 3:
You are working as a loan analyst at Goods-Fargo Bank. You decided to create a tool to determine whether a loan applicant would repay a loan that he or she received. You have a data set with several customer variables and a variable (Repay) indicating whether the customer repaid the loan.
Use the holdout method and Partition variable to create training and validation samples. Use Analytic Solver for this question.
a) Use logistic regression to predict who is likely to repay the loan. Compare the following two models in terms of their prediction accuracy. In the first model, use all the predictors, i.e., the full model. In the second model, use 8 predictors; you are free to choose which predictors to leave out. Evaluate these two candidate models based on their overall classification accuracy. Recommend a final model and express it as a mathematical equation relating the output variable to the input variables.
b) Use a k-NN model to predict who would repay the loan. Recommend a final model by carefully stating model parameters such as k and the cutoff probability. How does this model perform compared to the final model in part a)?
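To make the quantities in parts a) and b) concrete, here is a minimal Python sketch. It is illustrative only: the question requires Analytic Solver, and the coefficients, data points and function names below are hypothetical assumptions. A logistic regression model is written as P(Repay = 1) = 1 / (1 + exp(-(b0 + b1*x1 + ... + bp*xp))). A k-NN model estimates the same probability as the share of the k nearest training records that repaid. In both cases a record is classified as "repay" when the probability reaches the cutoff.

```python
import math


def logistic_probability(intercept, coefs, x):
    """P(y = 1) from a fitted logistic regression equation."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1 / (1 + math.exp(-z))


def knn_probability(train_x, train_y, query, k):
    """Share of the k nearest training records (Euclidean distance) with y = 1."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    return sum(label for _, label in by_distance[:k]) / k


def classify(probability, cutoff=0.5):
    """Assign class 1 when the probability reaches the cutoff."""
    return 1 if probability >= cutoff else 0


def accuracy(actual, predicted):
    """Overall classification accuracy: share of correct predictions."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical training records with two predictors already on a 0-1 scale.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.5, 0.5)]
train_y = [0, 0, 1, 1, 1]

p_knn = knn_probability(train_x, train_y, query=(0.4, 0.4), k=3)
print(p_knn, classify(p_knn, cutoff=0.5))

# Hypothetical coefficients, not estimates from the exam data.
p_logit = logistic_probability(intercept=0.0, coefs=[1.0, -1.0], x=[2.0, 2.0])
print(p_logit, classify(p_logit, cutoff=0.5))

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Because k-NN relies on distances, predictors measured on very different scales are usually rescaled before the model is fit. Both k and the cutoff are typically chosen by comparing accuracy on the validation sample.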
