Chapter 2: Segmenting Customer Transactions Using a Pattern-Based Clustering Approach
Table 2. Features of a Session/Transaction
Category | Metric | Definition |
Time-related | Total time | |
Time-related | Average time per page | Total time/# of pages |
Time-related | Average time per site | Total time/# of sites |
Time-related | Average time per category | Total time/# of categories |
Time-related | Starting time | |
Time-related | Starting day | |
Time-related | Most visited site | |
Time-related | Most visited category | |
Quantity-related | Number of pages | |
Quantity-related | Number of sites | |
Quantity-related | Number of categories | |
Quantity-related | Average # of pages per site | # of pages/# of sites |
Quantity-related | Average # of sites per category | # of sites/# of categories |
Order-related | First site | |
Order-related | Second site | |
Order-related | Last site | |
Order-related | First category | |
Order-related | Second category | |
Order-related | Last category | |
Others | Whether or not a certain category was visited (27 categories in total) | 0 = no visit; 1 = at least one visit |
Page: an individual Web page (each hit is a page). Site: a domain name, such as www.yahoo.com. Category: a type of site, such as “travel site” or “news site”.
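As a rough illustration of how several of the metrics in Table 2 could be derived from raw page hits, the following Python sketch computes them for a single session. It is only a sketch under stated assumptions: the `Hit` record, its field names, and the function `session_features` are hypothetical and do not come from the chapter. It also assumes that "most visited" means most time spent (since those metrics are listed as time-related), that "second site" means the second distinct site in visit order, and that starting time and starting day are taken from the first hit.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Hit:
    """One page view. Field names are illustrative, not from the source."""
    timestamp: datetime
    site: str        # domain name, e.g. "www.yahoo.com"
    category: str    # e.g. "news", "travel"
    seconds: float   # time spent on this page


def distinct_in_order(values):
    """Return the distinct values in order of first appearance."""
    return list(dict.fromkeys(values))


def session_features(hits):
    """Compute a subset of the Table 2 metrics for one non-empty session."""
    sites = distinct_in_order(h.site for h in hits)
    categories = distinct_in_order(h.category for h in hits)
    total_time = sum(h.seconds for h in hits)

    # Accumulate time per site and per category.
    time_by_site = defaultdict(float)
    time_by_category = defaultdict(float)
    for h in hits:
        time_by_site[h.site] += h.seconds
        time_by_category[h.category] += h.seconds

    return {
        # Time-related
        "total_time": total_time,
        "avg_time_per_page": total_time / len(hits),
        "avg_time_per_site": total_time / len(sites),
        "avg_time_per_category": total_time / len(categories),
        "starting_hour": hits[0].timestamp.hour,
        "starting_weekday": hits[0].timestamp.weekday(),  # 0 = Monday
        # Assumption: "most visited" is measured by time spent.
        "most_visited_site": max(time_by_site, key=time_by_site.get),
        "most_visited_category": max(time_by_category, key=time_by_category.get),
        # Quantity-related
        "num_pages": len(hits),
        "num_sites": len(sites),
        "num_categories": len(categories),
        "avg_pages_per_site": len(hits) / len(sites),
        "avg_sites_per_category": len(sites) / len(categories),
        # Order-related (assumption: "second" = second distinct value)
        "first_site": sites[0],
        "second_site": sites[1] if len(sites) > 1 else None,
        "last_site": hits[-1].site,
        "first_category": categories[0],
        "second_category": categories[1] if len(categories) > 1 else None,
        "last_category": hits[-1].category,
    }
```

The per-category visit indicators in the "Others" row could be added in the same way, as one 0/1 entry per category, once the list of 27 categories is known.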
news, finance) in a focused manner such that the total time spent is low. Another common pattern for the same user may be {starting_time = night, most_visited_category = games}, reflecting the user’s typical behavior at the end of the day. Here, we treat each attribute-value pair (e.g., starting_time = night) as an item, and a set of attribute-value pairs (e.g., {starting_time = night, most_visited_category = games}) as an itemset. A frequent itemset is an itemset that occurs in a large number of transactions. To capture the typical behavioral patterns in Web transactions, we use itemsets as the representation for patterns. In general, we assume that the items in the itemsets can involve both categorical and numeric attributes, as described in the examples above,