The Online Customer:  New Data Mining and Marketing Approaches
Powered By Xquantum

The Online Customer: New Data Mining and Marketing Approaches By ...

Read
image Next

enable statistical inference. In general model-based clustering, the data is viewed as coming from a mixture of probability distributions, each representing a different cluster. If we view the patterns representing a cluster as a type of model generating the observable data (with some noise) within that cluster, the spirit of our approach is similar to that of model-based clustering. Although our approach does not allow statistical inference, it can capture more interesting behavioral patterns.

Pattern-Based Clustering

With pattern-based clustering, data in the same cluster normally should share common patterns. There are numerous pattern-based clustering models due to various definitions of patterns and distance measures. The definition of a pattern could be the correlation between attributes of objects to be clustered. It can also be other commonly used pattern representations in data mining such as itemsets, association rules, sequential patterns, etc. The distance measures used in pattern-based clustering are normally different from the traditional distance measures (e.g. Euclidean distance, Manhattan distance and cosine distance, which are not always adequate in capturing patterns among the objects). Most pattern-based clustering methods only utilize pattern similarity. We incorporate pattern difference and similarity at the same time. We define an objective function that we maximize in order to achieve a good clustering of customer transactions and present a method that groups customer transactions such that patterns represented in itemsets generated from each cluster demonstrate homogeneity within but heterogeneity between representations.

Segmentation-Based Modeling

In statistics and econometrics, there are various models that split the input space (instead of objects to be clustered) according to the observed input variables, and a regression model is fit in each subspace. Typically, a cut in one of the input variables is introduced, and in each