Regression models for data mining (Individual project). In this project, you will apply logistic regression to build a classification model based on the "eBay Auction" data.

The file eBayAuctions.xls contains information on 1972 auctions transacted on eBay.com during May-June 2004. The goal is to use these data to build a model that will distinguish competitive auctions from non-competitive ones. A competitive auction is defined as an auction with at least two bids placed on the item being auctioned. The data include variables that describe the item (auction category), the seller (his or her eBay rating), and the auction terms that the seller selected (auction duration, opening price, currency, day of week of auction close). In addition, we have the price at which the auction closed. The goal is to predict whether or not the auction will be competitive.
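For reference, the logistic regression model named in the project title is usually written in the standard form below. The notation is an assumption introduced here and does not come from the brief: p is the probability that an auction is competitive, x_1, ..., x_k are the predictors (including any dummy variables), and the betas are the coefficients to be estimated.

```latex
\[
p = \Pr(\text{competitive} = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}},
\qquad
\log\frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k .
\]
```

Under this form, a one-unit increase in a predictor x_j multiplies the odds of a competitive auction by e^{\beta_j}, holding the other predictors fixed. SPSS reports this quantity as Exp(B) in its logistic regression output.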
Data pre-processing. Create dummy variables for the categorical predictors. These include Category (18 categories), Currency (USD, GBP, euro), EndDay (Monday-Sunday), and Duration (1,3,5,7 or 10 days). Note that SPSS can do this for you automatically.
Split the data into training and validation datasets using a 60%:40% ratio.
Why should the data be partitioned into training and validation sets? For what will the training set be used? For what will the validation set be used?
Explore the data set by running descriptive analysis, boxplots and histograms. Based on these methods describe the data and make relevant conclusions. Do the data need cleaning? Why?
Identify any highly correlated variables that create multicollinearity and remove them, if there are any.
Run a logistic model. Interpret the meaning of the coefficient for closing price. Does closing price have a practical significance? Is it statistically significant for predicting competitiveness of auctions?
Use stepwise selection to find the model with the best fit to the training data. Which predictors are used?
Use stepwise selection to find the model with the lowest predictive error rate (use the validation data). Which predictors are used?
What is the danger in the best predictive model that you found?
Report on the final model, which should exclude any variables causing collinearity and should be obtained by the forward stepwise selection method.
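The brief asks for SPSS, which can create the dummy variables automatically. The sketch below is only an illustration, in Python, of the logic behind three of the steps above: dummy coding with a dropped reference level, a reproducible 60%:40% partition, and reading a logistic coefficient as an odds ratio. The function names, the seed, and the toy records are all hypothetical; none of the values come from eBayAuctions.xls.

```python
import math
import random


def make_dummies(rows, column):
    """Replace a categorical column with 0/1 indicator columns.

    The first level (in sorted order) is dropped as the reference category,
    so the indicators are not perfectly collinear with the intercept.
    """
    levels = sorted({row[column] for row in rows}, key=str)
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for level in levels[1:]:
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


def partition(rows, train_fraction=0.6, seed=1):
    """Shuffle the rows reproducibly, then split into training and validation sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def odds_ratio(coefficient, change=1.0):
    """Multiplicative change in the odds for a given change in one predictor."""
    return math.exp(coefficient * change)


# Toy records, invented for illustration only (not taken from eBayAuctions.xls).
auctions = [
    {"Currency": "USD", "Duration": 7, "Competitive": 1},
    {"Currency": "GBP", "Duration": 5, "Competitive": 0},
    {"Currency": "euro", "Duration": 7, "Competitive": 1},
    {"Currency": "USD", "Duration": 10, "Competitive": 0},
    {"Currency": "USD", "Duration": 3, "Competitive": 1},
]

encoded = make_dummies(make_dummies(auctions, "Currency"), "Duration")
train, valid = partition(encoded, train_fraction=0.6, seed=1)
print(len(train), len(valid))
```

The model is fitted on the training rows only, and the validation rows are held back to estimate the error rate on data the model has not seen. Dropping one level per categorical variable is one common convention for avoiding perfect collinearity among the dummies. SPSS picks its own reference category, so the levels it drops may differ from the ones dropped here.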

Maximum length: 500 words. Please use only SPSS software.
