Chapter 2: Segmenting Customer Transactions Using a Pattern-Based Clustering Approach
models need to be built for individual segments and future customers need to be assigned to existing segments in order to identify strategies and models to implement.
In the previous two sections, we focused primarily on developing the pattern-based clustering approach for segmentation. In this section, we show how pattern-based clustering can be used to build better models of consumer behavior. In particular, we present a framework for model building that combines pattern-based clustering with signature discovery techniques.
We assume that there is an overall goal of modeling a specific outcome variable (such as whether or not a customer will make a purchase). The framework (Figure 2) consists of three stages: a clustering stage, a signature and model building stage, and a prediction stage.
In the clustering stage, a set of transactions is grouped into clusters using an appropriate pattern-based clustering algorithm. The second stage has two parts. First, for each cluster we extract a “signature” that describes the cluster in terms of its salient behavioral patterns. There are several ways to represent signatures, and this is an active area of research. One approach is to use a subset of the frequent patterns discovered from each cluster as its signature. The second part of this stage is to build a predictive model (e.g., a decision tree (Quinlan 1993) or a regression model) for the outcome variable separately for each cluster. Hence, each cluster c_i has its own signature sig_i and corresponding model mod_i. The third stage is to make predictions for new transactions whose outcome is not known. In this stage, the new transaction is compared to each signature to determine which signature it most closely matches. Based on this comparison, the appropriate model is used (or the models are combined using a weighting scheme) to generate the final prediction.
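The second and third stages can be illustrated with a minimal sketch. It is not the implementation used in this chapter: it assumes the clustering stage has already produced the clusters, treats a signature as the set of items whose within-cluster support meets a hypothetical `min_support` threshold, stands in a majority-class predictor for the per-cluster model, and scores a match as the fraction of signature items present in the new transaction. All function names and the toy data are illustrative assumptions.

```python
from collections import Counter


def extract_signature(transactions, min_support=0.6):
    """Assumed signature: items present in at least min_support of the cluster."""
    counts = Counter(item for t in transactions for item in set(t))
    n = len(transactions)
    return {item for item, c in counts.items() if c / n >= min_support}


def fit_majority_model(outcomes):
    """Stand-in for a decision tree or regression: the most common outcome."""
    return Counter(outcomes).most_common(1)[0][0]


def match_score(transaction, signature):
    """Assumed similarity: fraction of signature items found in the transaction."""
    if not signature:
        return 0.0
    return len(set(transaction) & signature) / len(signature)


def build_segments(clusters, min_support=0.6):
    """Stage 2: map each cluster id to its (signature, model) pair."""
    segments = {}
    for cid, rows in clusters.items():
        transactions = [t for t, _ in rows]
        outcomes = [y for _, y in rows]
        segments[cid] = (
            extract_signature(transactions, min_support),
            fit_majority_model(outcomes),
        )
    return segments


def predict(transaction, segments):
    """Stage 3: use the model of the best-matching signature."""
    best = max(segments, key=lambda cid: match_score(transaction, segments[cid][0]))
    return best, segments[best][1]


def predict_weighted(transaction, segments):
    """Stage 3 variant: combine model outputs, weighted by match score."""
    scores = {cid: match_score(transaction, sig) for cid, (sig, _) in segments.items()}
    total = sum(scores.values())
    if total == 0:
        # No signature matches at all: fall back to an unweighted average.
        return sum(model for _, model in segments.values()) / len(segments)
    return sum(scores[cid] * segments[cid][1] for cid in segments) / total


# Toy clusters (assumed output of stage 1): (transaction, purchase outcome) pairs.
clusters = {
    "browsers": [
        (["home", "search", "search"], 0),
        (["home", "search", "product"], 0),
        (["search", "product"], 0),
    ],
    "buyers": [
        (["home", "cart", "checkout"], 1),
        (["product", "cart", "checkout"], 1),
        (["cart", "checkout"], 1),
        (["home", "cart"], 0),
    ],
}

segments = build_segments(clusters)
new_transaction = ["product", "cart", "checkout"]
print(predict(new_transaction, segments))
print(predict_weighted(new_transaction, segments))
```

In this toy setting the new transaction contains every item in the "buyers" signature and only one item from the "browsers" signature, so the best-match rule selects the "buyers" model, while the weighted variant blends both models in proportion to their match scores. A richer signature representation or a different similarity measure would slot into `extract_signature` and `match_score` without changing the overall structure.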
2.5 Experiments
We conducted two different sets of experiments to evaluate our pattern-based clustering approach and the segmentation-based modeling