Technical Discussion of Segmentation and Clustering Methods
This article discusses segmentation and clustering methods. Finding the best segmentation requires careful attention to strategic goals and is a process of exploring numerous alternatives until the best one emerges.
What Makes a Good Segmentation Solution?
Segmentation can be the keystone of an efficient marketing strategy, defining audiences and establishing the elements of successful appeals. Finding the best segmentation requires careful attention to strategic goals and is a process of exploring numerous alternatives until the best one emerges.
In addition, segments also should be:
- Stable over a reasonable amount of time
- Recognizable in meaningful ways
- Large enough to be meaningful, as may be obvious.
Segmentation follows a multi-step procedure to ensure reaching the best solutions. Each study may vary somewhat, but these basic elements are essential in reaching a successful solution.
Select candidate variables for the basis of segmentation
- Basis variables are the set that defines the nature of the segments and include the areas in which key behavioral and attitudinal differences are expected to be found.
- Selecting basis variables is the key factor in the study. In extensive reviews of the literature and of many efforts undertaken by different organizations, choosing inadequate basis variables emerges as a leading source of study failure.
- The strategic goals of the study ultimately will inform the sets of basis variables. Setting the frame for a study is critical, as too many goals will cause the study to collapse under its own weight.
- Many ways of categorizing segmentation studies have been proposed. None has improved substantially on the typology outlined by Wind and Claycamp (1976), adapted below:
- For studies providing a general understanding of a market:
- Benefits sought
- Needs the product will fill (needs and perceived benefits may not be synonymous)
- Product purchase and usage patterns
- Brand loyalty and switching patterns
- For studies focusing on product/service positioning:
- Product usage
- Product preferences
- Benefits sought
- Unmet needs
- Product-, user-, and self-perception
- For studies of new product concepts (and introduction):
- Reaction to new concepts (measures of incentives to buy, preference over current brand, etc.);
- Benefits sought;
- Product usage patterns;
- Price sensitivity
- For studies of pricing decisions:
- Price sensitivity, by purchase and usage patterns;
- Product, user and self-images associated with products at different prices;
- Product usage patterns;
- Sensitivity to “deals,” coupons, etc.
- For advertising decisions:
- Benefits sought;
- Psychographics/“life styles”;
- Product-, user-, and self-perceptions;
- Responses to creative elements
- For distribution decisions:
- Store loyalty and patronage;
- Benefits sought in store selection;
- Sensitivity to “deals”
Define the descriptor variables
- These are the variables that will describe and help locate the segments once they are defined, and typically include demographics, media usage, and broad based interests or favorite activities.
Take extreme care in data preparation
- Examine data for any anomalies and clean thoroughly. This may seem self-evident, but segmentation schemes have proven to be highly sensitive to anomalies of any type in the data.
- Screen variables as needed using methods like factor analysis, categorical principal component analysis, discriminant analysis, or specialized measures of variable importance and similarity.
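One simple way to start the anomaly check described above is to standardize each variable and flag extreme values. The sketch below is only an illustration under stated assumptions: the z-score rule, the threshold of 3.0, and the function names `standardize` and `flag_anomalies` are hypothetical choices, not the procedure used in the studies discussed here.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores for one variable, using the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant variable cannot separate respondents, so every z-score is 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def flag_anomalies(values, threshold=3.0):
    """Return the positions of values whose z-score magnitude exceeds the threshold."""
    return [i for i, z in enumerate(standardize(values)) if abs(z) > threshold]


# Hypothetical 1-to-5 ratings with one miskeyed entry (99) in the last position.
ratings = [3, 4, 3, 5, 4, 3, 4, 5, 3, 4, 4, 99]
suspect_rows = flag_anomalies(ratings)
```

Flagged rows still need a human decision (correct, recode, or drop); the screen only points to where to look.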
Compare many solutions
- Examine many solutions to ensure that you are reaching the best one. The charts below come from a special program that compares 13 different methods of clustering study participants into groups. We suggest developing an analysis like this for each solution with different numbers of groups (from 3 to 10 groups), and ultimately selecting the best of all.
- These charts show mathematical comparisons, a crucial first step, but not the only one in selecting a best solution.
- The top chart compares the 13 methods for how well they match all the other methods tried. This is an important measure because a solution that does not match most of the others well is likely influenced by anomalies in the data. On the other hand, the solution that replicates the others best tends to reflect salient patterns in the data.
- The bottom chart shows how well the solutions perform on this measure and seven others, along with the one criterion given most weight in the solution. (This is the “mean F ratio,” an overall measure that summarizes mainly how well groups are separated and how dense or internally coherent they are.)
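The two kinds of comparison described above can be sketched in a few lines. This is an illustration under assumptions, not the special program mentioned in the text: agreement between solutions is measured here with the Rand index (the source does not name its agreement measure), and the "mean F ratio" is read as the average one-way ANOVA F statistic across the basis variables. All function names are hypothetical. The F calculation assumes at least two groups and more respondents than groups.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Share of respondent pairs that two solutions treat the same way
    (both put the pair in one group, or both split the pair)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def mean_agreement(solutions):
    """For each named solution, average its Rand index against all the others."""
    result = {}
    for name, labels in solutions.items():
        others = [lab for other, lab in solutions.items() if other != name]
        result[name] = sum(rand_index(labels, o) for o in others) / len(others)
    return result


def f_ratio(values, labels):
    """One-way ANOVA F statistic for a single variable across the groups."""
    groups = {}
    for value, group in zip(values, labels):
        groups.setdefault(group, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    between = sum(len(m) * (sum(m) / len(m) - grand_mean) ** 2 for m in groups.values())
    within = sum((v - sum(m) / len(m)) ** 2 for m in groups.values() for v in m)
    if within == 0:
        return float("inf")  # groups are perfectly tight on this variable
    return (between / (k - 1)) / (within / (n - k))


def mean_f_ratio(rows, labels):
    """Average F statistic over all variables (the columns of rows)."""
    columns = [list(col) for col in zip(*rows)]
    return sum(f_ratio(col, labels) for col in columns) / len(columns)
```

A solution with a low average agreement score is the kind that the top chart would flag as an outlier; the mean F ratio then helps rank the solutions that remain.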
Find groups that are reachable selectively
- Solutions are good only if they lead to groups that are reachable selectively. If a solution cannot find ways to reach groups efficiently, it is like trying to market to everybody, rather than reaching segments effectively.
- Several strong methods for locating segment members have arisen based on work in direct marketing. These methods typically use classification tree analysis (e.g., CHAID or CART), either as a first step or as the final analytical method. These methods are unparalleled in delineating the combinations of characteristics that make a given person more or less likely to belong in a segment.
- These methods take characteristics that can be used to reach the segments, and show how to combine them to define groups that are rich in target segment members.
- These combinations appear in a set of simple “If-then” rules, which require no equations and which are easy to program into databases to “score” people into a segment if they did not participate in the study.
- A “gains analysis” makes targeting simple by arraying the groups from high to low in terms of how strongly the segment is represented in each group. It shows precisely how groups are characterized, and how the groups compare in incidence of the target segment to the overall population.
- For instance, in the first group below, where the target segment has an incidence of 61%, the target segment is 3.62 times more prevalent than overall.
- The cumulative data to the right shows what happens if two or more groups found in the classification tree analysis are selected for targeting. For instance, choosing the first and second group together, the cumulative incidence of the target segment is 3.59 times the average.
- Taking all five groups shown together would lead to a vast boost in efficiency compared with not having the segmentation model. Efficiency is calculated as:
- 2.35 (the boost in incidence) / 0.28 (because efforts will be expended only among 28% of the population).
- This is 8.4 times (840%) the efficiency of not having the segmentation model, a tremendous gain.
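The arithmetic behind a gains analysis can be sketched as follows. Only the final efficiency figures (2.35 and 0.28) come from the text; the three groups "A", "B" and "C", their sizes and incidences, and the function names are hypothetical, chosen so that the overall incidence works out to 20%.

```python
def gains_table(groups, overall_incidence):
    """Rank groups by target-segment incidence and add cumulative figures.

    groups: list of (name, share_of_population, segment_incidence) tuples.
    """
    ranked = sorted(groups, key=lambda g: g[2], reverse=True)
    rows = []
    cum_share = 0.0    # share of the population covered so far
    cum_members = 0.0  # share of the population that is covered AND in the segment
    for name, share, incidence in ranked:
        cum_share += share
        cum_members += share * incidence
        rows.append({
            "group": name,
            "lift": incidence / overall_incidence,
            "cum_share": cum_share,
            "cum_lift": (cum_members / cum_share) / overall_incidence,
        })
    return rows


def targeting_efficiency(cum_lift, cum_share):
    """Boost in incidence divided by the share of the population contacted."""
    return cum_lift / cum_share


# Hypothetical tree groups: (name, share of population, segment incidence).
example_groups = [("C", 0.80, 0.125), ("A", 0.10, 0.60), ("B", 0.10, 0.40)]
table = gains_table(example_groups, overall_incidence=0.20)

# The figures quoted in the text: a 2.35 boost reached within 28% of the population.
efficiency = targeting_efficiency(2.35, 0.28)  # about 8.4
```

In the hypothetical table, group "A" alone has a lift of 3.0, and targeting "A" and "B" together gives a cumulative lift of 2.5 while contacting only 20% of the population.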
Characterize group differences by analyses, such as discriminant analysis, logistic regression, mapping and AIM analysis
- Mapping includes correspondence analysis, bi-plots and MDPREF, as well as maps from discriminant analysis itself, all of which can provide very valuable visual insights on how groups are different and similar.
- Discriminant analysis and logistic regression home in on the key differences among the segments, showing what differentiates each group the most strongly from the others, and the strengths of differences among groups on the variables that discriminate among them.
- Both methods also generate classification results, showing how well segments are separated and how clearly individuals fall into each cluster.
- AIM is a newer method that shows graphically how groups differ and which variables are most important in differentiating each group from all the others.
Sample of a correspondence map showing key segment differences
- In this map, groups are closest to the statements they agree with most strongly. The map also shows the relative sizes of the groups based on the sizes of the bubbles representing them.
- Groups that are most similar are closest and those that are least similar are furthest apart.
- Statements that apply to two groups about equally fall between them.
- Statements that do not differentiate among groups (that all groups tended to endorse at about the same level) fall into the middle of the map.
- The group near the middle has no distinctive opinions. The central region is shown by the dotted circle.
Looking for More Detail?
This is a brief overview of a truly complex subject. It shows the basics of one approach. Other analysts may have different approaches, and they may work well with certain data sets. Some analysts will endorse a certain method of clustering as always superior, for instance, but our experience—across hundreds of studies—shows that no one method invariably works best. There are many specific methods we could showcase or discuss, and depending on outcomes with the first methods tried, we might need to explore other methods, or add to these.
We can discuss the details in much greater length along with the pros and cons of alternative approaches. The basic aim, though, always remains this:
- To develop groups with different responses to products or services
- To identify these groups in meaningful ways
- And to reach these groups selectively.
- One can reach or approach this goal by many different paths.
Some Basic Definitions
Cluster Analysis and Latent Class Analysis
- These techniques group individuals into clusters based upon their similarity of responses.
- Clustering analysis now includes complex algorithms that can analyze both categorical and continuous variables.
- With any method, additional analyses (e.g., discriminant analysis, mapping, etc.) can then be used to determine what makes these groups similar and different from one another, and so give a better understanding of what defines these groups.
- Some newer forms of grouping respondents, in addition to latent class analysis, include EM clustering, 2-step clustering, fuzzy clustering, Bayesian clustering, clustering from various modeling algorithms and many others.
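To make "grouping by similarity of responses" concrete, here is a minimal sketch of k-means (Lloyd's algorithm), one of the simplest clustering methods. It is not one of the methods named above and is shown only because it fits in a few lines. Seeding the centers with the first k rows is a simplification; practical work would try many starting points and, as noted earlier, many methods.

```python
def kmeans(rows, k, iterations=20):
    """Group rows (lists of numeric responses) into k clusters.

    Returns (labels, centers). The first k rows seed the centers.
    """
    centers = [list(row) for row in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign each respondent to the nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centers[c])))
            for row in rows
        ]
        # Move each center to the mean of the respondents assigned to it.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical responses on two rating items for six respondents.
responses = [[1, 1], [9, 9], [1, 2], [8, 9], [2, 1], [9, 8]]
labels, centers = kmeans(responses, k=2)
```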
CHAID/CART (Classification Tree Analysis)
- These techniques are perhaps the most popular among the “tree analysis methods” that split the population into increasingly distinct groups.
- The total population is like the trunk of a tree.
- Based upon some characteristic (such as product use, brand used, etc.), the population is split into a few subgroups that most differentiate the groups, in terms of a dependent variable, like cluster membership. (In the case of 4 clusters, for instance, there would be a variable with 4 categories, one for each cluster. Splitting the sample would lead to some groups in which some segments are more prevalent and others are less prevalent.)
- These few subgroups are like the main branches of a tree. These groups then are further split into smaller branches, producing segments that are increasingly more refined.
- CHAID/CART has tremendously powerful and refined methods for analyzing categorical data.
- It works extremely well in analyzing many demographic variables (such as occupation, ethnicity, region of the country, state and even ZIP codes), and so can show unparalleled power in finding and describing segments in the population.
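The idea of splitting the sample on the characteristic that best differentiates cluster membership can be sketched as below. This is a heavily simplified illustration, not CHAID or CART themselves: it picks the predictor with the largest Pearson chi-square against segment membership, whereas real CHAID also merges similar categories and compares adjusted p-values. The field names `region`, `gender` and `segment`, and the function names, are hypothetical.

```python
from collections import Counter


def chi_square(records, predictor, target):
    """Pearson chi-square statistic for the predictor-by-target cross-tabulation."""
    n = len(records)
    cells = Counter((r[predictor], r[target]) for r in records)
    row_totals = Counter(r[predictor] for r in records)
    col_totals = Counter(r[target] for r in records)
    stat = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            stat += (cells[(row, col)] - expected) ** 2 / expected
    return stat


def best_split(records, predictors, target):
    """Choose the predictor most strongly associated with the target."""
    return max(predictors, key=lambda p: chi_square(records, p, target))


def split(records, predictor):
    """Divide the records into branches, one per category of the predictor."""
    branches = {}
    for r in records:
        branches.setdefault(r[predictor], []).append(r)
    return branches


# Hypothetical respondents with their cluster assignment in "segment".
sample = [
    {"region": "North", "gender": "F", "segment": "A"},
    {"region": "North", "gender": "M", "segment": "A"},
    {"region": "South", "gender": "F", "segment": "B"},
    {"region": "South", "gender": "M", "segment": "B"},
]
first_split = best_split(sample, ["region", "gender"], "segment")
branches = split(sample, first_split)
```

Applying `best_split` again within each branch would grow the smaller branches of the tree.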
Correspondence Analysis
- This technique produces “perceptual maps” that describe the relationships among groups and characteristics, groups and brands, brands and characteristics, and so on.
- It shows which characteristics (and/or groups and/or brands) go together, which do not, and degrees of similarities.
- Each characteristic becomes a point on a map. The closer two points are, the more closely they go together.
- This technique reduces the correspondence between many characteristics into a manageable and visual two-dimensional relationship of the distance between points.
- Techniques that produce similar maps include bi-plots and MDPREF. Any one of these may work best in a given circumstance.
Discriminant and Logistic “Scoring Models”
- These techniques can develop a “scoring model” or a “predictive model” by using measured attributes (such as demographic information or attitudes as measured by survey questions) to predict some characteristic of interest (such as segment membership).
- The model also shows the relative importance of each attribute and the degree to which it contributes to predicting the characteristic of interest, such as belonging to a segment.
- The technique is valuable in enabling “new” customers to be scored into segments where they most likely belong.
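Once a logistic scoring model has been fitted, scoring a new customer is a small calculation. The sketch below assumes a multinomial logistic form; the segment names, attribute names, and coefficients are invented for illustration and do not come from any study. Fitting the model itself is not shown.

```python
import math


def segment_probabilities(attributes, model):
    """Turn a customer's attributes into a probability for each segment.

    model maps segment name -> (intercept, {attribute name: weight}).
    """
    scores = {
        segment: intercept + sum(w * attributes.get(name, 0.0) for name, w in weights.items())
        for segment, (intercept, weights) in model.items()
    }
    top = max(scores.values())  # subtracting the maximum keeps exp() numerically stable
    exp_scores = {segment: math.exp(s - top) for segment, s in scores.items()}
    total = sum(exp_scores.values())
    return {segment: e / total for segment, e in exp_scores.items()}


def assign_segment(attributes, model):
    """Score the customer into the segment with the highest probability."""
    probabilities = segment_probabilities(attributes, model)
    return max(probabilities, key=probabilities.get)


# Hypothetical coefficients, as if taken from an already fitted model.
MODEL = {
    "Value seekers": (0.0, {"price_sensitivity": 1.5, "brand_loyalty": -0.5}),
    "Loyalists": (0.0, {"price_sensitivity": -1.0, "brand_loyalty": 2.0}),
}
new_customer = {"price_sensitivity": 0.2, "brand_loyalty": 0.9}
segment = assign_segment(new_customer, MODEL)
```

The size of each weight plays the role of the "relative importance" described above, provided the attributes are on comparable scales.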
Factor Analysis/Principal Components Analysis
- These closely related techniques discover underlying “factors” or “themes” that were not explicitly measured, but are assumed to underlie sets of variables.
- Each of these factors or components is a weighted combination of many characteristics which are interpreted to define an underlying “construct.”
- Although factor analysis in theory differs from principal components analysis, these two approaches tend to find patterns in variables in identical or nearly identical ways.