Consider scoring new observations with the SCORE procedure versus the SCORE statement in the LOGISTIC procedure.
Which statement is true?
Identify the correct SAS program for fitting a multiple linear regression model with dependent variable (y) and four predictor variables (x1-x4).
An analyst investigates Region (A, B, or C) as an input variable in a logistic regression model.
The analyst discovers that the probability of purchasing a certain item when Region = A is 1.
What problem does this illustrate?
Refer to the exhibit:
The plots represent two models, A and B, being fit to the same two data sets, training and validation.
Model A is 90.5% accurate at distinguishing blue from red on the training data and 75.5% accurate on the validation data. Model B is 83% accurate on the training data and 78.3% accurate on the validation data.
Which of the two models should be selected and why?
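The comparison in this question comes down to simple arithmetic on the four accuracies quoted above. A minimal Python sketch follows; the dictionary layout and the function name `generalization_gap` are illustrative assumptions, not part of the exam material. It computes the drop from training to validation accuracy for each model:

```python
# Accuracies quoted in the question, as percentages.
accuracies = {
    "A": {"train": 90.5, "valid": 75.5},
    "B": {"train": 83.0, "valid": 78.3},
}

def generalization_gap(acc):
    """Drop in accuracy from training to validation, in percentage points."""
    return round(acc["train"] - acc["valid"], 1)

gaps = {name: generalization_gap(acc) for name, acc in accuracies.items()}

# The model with the higher validation accuracy (hypothetical helper variable).
best_on_validation = max(accuracies, key=lambda name: accuracies[name]["valid"])

for name in accuracies:
    print(f"Model {name}: validation {accuracies[name]['valid']}%, gap {gaps[name]} points")
print("Higher validation accuracy:", best_on_validation)
```

A large training-to-validation gap combined with lower validation accuracy is the usual sign of overfitting. Validation accuracy, not training accuracy, is the figure that estimates performance on new data.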
Which method is NOT an appropriate way to score new observations with a known target in a logistic regression model?
A company has branch offices in eight regions. Customers within each region are classified as either "High Value" or "Medium Value" and are coded using the variable name VALUE. The total amount of purchases per customer over the last year is used as the response variable.
Suppose there is a significant interaction between REGION and VALUE. What can you conclude?
Refer to the REG procedure output:
Calculate the coefficient of determination, R-Square.
Enter your numeric answer in the space below. Round to 4 decimal places (example: n.nnnn).
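The REG output is not reproduced here, so no number can be computed, but the calculation uses only the Analysis of Variance table. Assuming the usual layout, with Model, Error, and Corrected Total rows in the Sum of Squares column, the relationship is:

```latex
\[
R^2 \;=\; \frac{SS_{\text{Model}}}{SS_{\text{Corrected Total}}}
    \;=\; 1 - \frac{SS_{\text{Error}}}{SS_{\text{Corrected Total}}}
\]
```

Both forms give the same value because the model and error sums of squares add up to the corrected total.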
An analyst fits a logistic regression model to predict whether or not a client will default on a loan. One of the predictors in the model is agent, and each agent serves 15-20 clients. The model fails to converge. The analyst prints the summarized data, showing the number of defaulted loans per agent. See the partial output below:
What is the most likely reason that the model fails to converge?
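One common cause of nonconvergence with a categorical predictor is a level whose outcomes are all events or all non-events (quasi-complete separation), because the maximum likelihood estimate for that level is then not finite. The sketch below shows how a summary of defaults per agent can be screened for such levels. The agent names, the counts, and the function `levels_without_variation` are all invented for illustration, since the exam's partial output is not shown here.

```python
# Hypothetical summarized data: agent -> (defaulted loans, total clients).
# These numbers are invented for illustration; they are NOT the exam's output.
summary = {
    "agent_01": (3, 18),
    "agent_02": (0, 15),
    "agent_03": (5, 20),
    "agent_04": (17, 17),
}

def levels_without_variation(counts):
    """Return the levels whose outcome is all events or all non-events."""
    flagged = {}
    for level, (events, total) in counts.items():
        if events == 0:
            flagged[level] = "no events"
        elif events == total:
            flagged[level] = "all events"
    return flagged

flagged = levels_without_variation(summary)
print(flagged)
```

With only 15-20 clients per agent, a level with zero defaults is plausible, which is why a per-level count of events is a reasonable first check when a model like this fails to converge.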
Refer to the exhibit.
Output from a multiple linear regression analysis is shown.
What is the most appropriate statement concerning collinearity between the input variables?