Chapter 2: Segmenting Customer Transactions Using a Pattern-Based Clustering Approach
approach. As mentioned earlier, our primary goal in developing an effective clustering approach is to better understand the data and to use it for more effective predictions. Consistent with these principles, we pay special attention to the evaluation of our overall modeling approach. In the first set of experiments (Experiment I), we take customer-level browsing and purchasing data from 10 different Internet retailers and use 5 different approaches to build models that predict customers’ future value for each retailer. In the second set of experiments (Experiment II), we further evaluate the clustering approach independently, providing further evidence about why pattern-based clustering works. The results of the experiments suggest that pattern-based clustering can be an effective segmentation technique and can be used to build better predictive models.
2.5.1 Experiment I: Building Segmentation-Based Predictive Models for Online Retailers
In this set of experiments, we assume that an Internet retailer can collect a user’s demographics and past browsing and purchase information on its Web site (site-centric data), and that the retailer can use this information to predict future customer value. The dependent variable is the total amount of money spent on the Web site within a future period of time (see Appendix 4 for the description of the variables used in this experiment). The 10 Internet retailers are amazon.com, bestbuy.com, bmgmusic.com, expedia.com, hotwire.com, landsend.com, orbitz.com, qvc.com, sears.com, and ticketmaster.com. We have 6 months of data for each retailer. To train the models, we use the independent variables from the first two months and the dependent variable from the middle two months. We then apply the calibrated models to the independent variables from the middle two months to predict the dependent variable in the last two months. M5 regression trees (Quinlan 1992) were used to build all the predictive models.
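The two-month windowing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(customer_id, month, visits, spend)`, the two features, and the helper names `window_totals` and `build_dataset` are hypothetical and are not the variables listed in Appendix 4. The sketch builds the training and prediction sets only; fitting the M5 regression tree itself is left out.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption for illustration only):
# (customer_id, month, visits, spend), with month numbered 1..6.

def window_totals(records, months):
    """Sum visits and spend per customer over the given set of months."""
    totals = defaultdict(lambda: {"visits": 0, "spend": 0.0})
    for customer_id, month, visits, spend in records:
        if month in months:
            totals[customer_id]["visits"] += visits
            totals[customer_id]["spend"] += spend
    return totals

def build_dataset(records, feature_months, target_months):
    """Pair features from one window with total spend in a later window.

    Assumption: customers with no activity in the feature window are
    skipped, and customers with no activity in the target window get a
    target of 0.0.
    """
    features = window_totals(records, feature_months)
    targets = window_totals(records, target_months)
    rows = []
    for customer_id in sorted(features):
        x = [features[customer_id]["visits"], features[customer_id]["spend"]]
        y = targets[customer_id]["spend"] if customer_id in targets else 0.0
        rows.append((customer_id, x, y))
    return rows

# Made-up toy records, not data from the experiments.
records = [
    ("a", 1, 3, 20.0), ("a", 3, 1, 0.0), ("a", 4, 2, 35.0), ("a", 6, 1, 10.0),
    ("b", 2, 5, 0.0), ("b", 5, 2, 50.0),
]

# Calibration: features from months 1-2, target spend from months 3-4.
train = build_dataset(records, {1, 2}, {3, 4})
# Prediction: features from months 3-4, target spend from months 5-6.
test = build_dataset(records, {3, 4}, {5, 6})
```

Because the prediction window is shifted forward by two months, the model is always evaluated on spending that occurs strictly after the data it was calibrated on.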
We use our segmentation-based modeling approach and four other approaches to build predictive models. The four approaches are