White Paper:
Survey of Analysis Methods Part I
by Rajan Sambandam, TRC
Practical marketing research deals with two major problems: identifying key drivers and developing segments. In this two-part series TRC looks at key driver analysis and segmentation.
Key driver analysis is a broad term used to cover a variety of analytical techniques. It always involves at least one dependent or criterion variable and one or (typically) multiple independent or predictor variables whose effect on the dependent variable needs to be understood. The dependent variable is usually a measure on which the manager is trying to improve the organization’s performance. Examples include overall satisfaction, loyalty, value and likelihood to recommend.
When conducting a key driver analysis, there is a very important question that needs to be considered: Is the objective of the analysis explanation or prediction?
Answering this question before starting the analysis is very useful because it not only helps in choosing the analytical method to be used but also, to some extent, the choice of variables. When the objective of the analysis is explanation, we try to identify a group of independent variables that can explain variations in the dependent variable and that are actionable. For example, overall satisfaction with a firm can be explained by attribute satisfaction scores. By improving the performance on those attributes identified as key drivers, overall satisfaction can be improved. If the predictors used are not actionable, then the purpose of the analysis is defeated.
In the case of prediction, we try to identify variables that can best predict an outcome. This is different from explanation because the independent variables here do not have to be actionable, since we are not trying to change the dependent variable. As long as the independent variables can be measured, predictions can be made. For example, in the financial services industry, it is important to be able to predict (rather than change) the creditworthiness of a prospective customer from the customer’s profile.
Beyond the issue of explanation versus prediction, there are two other questions that help in the choice of analytical technique to be used:
- Is there one, or more than one, dependent variable?
- Is the relationship being modeled linear or non-linear?
In the remainder of this article we will discuss the analytical methods that are appropriate for each answer to these questions.
Single dependent variable
Key driver analyses often use a single dependent variable and the most commonly used method is multiple regression analysis. A single scaled dependent variable is explained using multiple independent variables. Typically, the scale for the dependent variable ranges from five points to 10 points and is usually an overall measure such as satisfaction or likelihood to recommend.
The independent variables are some measures of attribute satisfaction usually measured on the same scale as the dependent variable, but not necessarily. There are two main parts to the output that are of interest to the manager: the overall fit of the model and the relative importance.
The overall fit of the model is often expressed as R², the proportion of the total variance in the dependent variable that can be explained by the independent variables in the model. R² values range from 0 to 1, with higher values indicating better fit. For attitudinal research, values in the range of 0.4-0.6 are often considered to be good. Relative importance of the independent variables is expressed in the form of coefficients or beta weights. A weight of 0.4 associated with a variable means that a unit change in that variable is associated with a 0.4 unit change in the dependent variable, holding the other variables constant. Thus, beta weights are used to identify the variables that have the most impact on the dependent variable.
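As a rough illustration of these two outputs, the Python sketch below fits a multiple regression by ordinary least squares and reports the coefficients and the overall fit. It is a minimal sketch under stated assumptions: the function names (`solve_linear_system`, `fit_ols`), the two attribute ratings and the simulated respondents are hypothetical and invented for this example. The coefficients it returns are unstandardized; beta weights in the strict sense would require standardizing the variables first.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return (coefficients, r_squared); coefficients[0] is the intercept."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1.0 - ss_res / ss_tot


# Hypothetical survey: overall satisfaction driven by two attribute ratings.
rng = random.Random(42)
ratings = [(rng.randint(1, 10), rng.randint(1, 10)) for _ in range(200)]
overall = [1.0 + 0.4 * service + 0.2 * price + rng.gauss(0, 1)
           for service, price in ratings]

coefficients, r_squared = fit_ols(ratings, overall)
print("intercept, service, price:", [round(c, 2) for c in coefficients])
print("R-squared:", round(r_squared, 2))
```

In this simulated example the estimated weight on the first attribute should come out larger than the weight on the second, which is how the attribute with the greater impact would be identified.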
While regression models are quite robust and have been used for many years, they do have some drawbacks. The biggest (and perhaps most common) is the problem of multicollinearity. This is a condition where the independent variables are very highly correlated with one another, so that their individual impact on the dependent variable is distorted. Different approaches can be taken to address this problem.
A data reduction technique such as factor analysis can be used to create factors out of the variables that are highly correlated. Then the factor scores (which are uncorrelated with each other) can be used as independent variables in the regression analysis. Of course, this would make interpretation of the coefficients harder than when individual variables are used. Another method of combating multicollinearity is to identify and eliminate redundant variables before running the regression. But this can be an arbitrary solution that may lead to the elimination of important variables. Other solutions such as ridge regression have also been used. But, if in fact the independent variables truly are related to each other, then suppressing the relationship would be a distortion of reality. In this situation other methods, such as structural equation modeling, that use multiple dependent variables may be more helpful and will be discussed later in this article.
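One simple way to spot candidates for redundancy before running the regression is to screen the pairwise correlations among the independent variables. The sketch below does this in Python. The function names (`pearson`, `flag_collinear_pairs`), the attribute names, the ratings and the 0.8 cut-off are all assumptions made for illustration; the cut-off in particular is a judgment call rather than a standard.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def flag_collinear_pairs(columns, threshold=0.8):
    """Return (name_1, name_2, r) for every pair with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(columns[first], columns[second])
            if abs(r) >= threshold:
                flagged.append((first, second, r))
    return flagged


# Hypothetical attribute ratings from six respondents.
attribute_ratings = {
    "courtesy": [1, 2, 3, 4, 5, 6],
    "friendliness": [1, 2, 3, 4, 5, 7],
    "price": [3, 1, 4, 1, 5, 2],
}

for first, second, r in flag_collinear_pairs(attribute_ratings):
    print(f"{first} and {second} are highly correlated (r = {r:.2f})")
```

A flagged pair is only a prompt for closer inspection. As noted above, dropping one of the two variables can be arbitrary, so the screen does not by itself say which remedy to use.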
Categorical values
What if the dependent variable to be used is not scaled, but categorical? This situation arises frequently in loyalty research and examples include classifications such as customer/non-customer and active/inactive/non-customer. Using regression analysis would not be appropriate because of the scaling of the dependent variable. Instead, a classification method such as linear discriminant analysis (or a closely related alternative, logistic regression) is required. This method can identify the key drivers and also provide the means to classify data not used in the analysis into the appropriate categories.
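To make the idea concrete, here is a minimal Python sketch of a logistic regression with a two-category dependent variable, fitted by plain gradient descent. Everything specific in it is an assumption for illustration: the function names, the single satisfaction score used as the predictor, the ten hypothetical customers, and the learning rate and number of passes. Statistical packages typically use faster fitting algorithms.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(row, weights, bias):
    """Estimated probability that the observation belongs to category 1."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))


def fit_logistic(rows, labels, learning_rate=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, label in zip(rows, labels):
            error = predict_probability(row, weights, bias) - label
            grad_b += error
            for j, value in enumerate(row):
                grad_w[j] += error * value
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Hypothetical data: satisfaction score (1-10) and whether the person
# is still an active customer (1) or not (0).
scores = [[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]
active = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

weights, bias = fit_logistic(scores, active)

# Classify a respondent who was not part of the analysis.
new_score = [8]
probability = predict_probability(new_score, weights, bias)
print("Estimated probability of being active:", round(probability, 2))
```

The sign and size of each weight indicate the direction and strength of a driver, while `predict_probability` is the part that classifies observations not used in the analysis.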
Key driver analyses with categorical dependent variables are often used for both explanation and prediction. An example of the former is when a health care organization is trying to determine the reasons behind its customers dis-enrolling from the health plan. Once these reasons are identified, the company can take steps to address the problems and reduce dis-enrollment.
An example of the latter is when a bank is trying to predict to whom it should offer the new type of account it is introducing. Rather than trying to change the characteristics of the consumers, it seeks to identify consumers with the right combination of characteristics that would indicate profitability.
Multiple dependent variables
As mentioned above, one problem with multiple regression models is that relationships between independent variables cannot be incorporated. It is possible to overcome this by running a series of regression models. For example, if respondents answer multiple modules in a questionnaire relating to customer service, pricing, etc., individual models can be run for each module. Following this, an overall model that uses the dependent variables from each module-level model as independent variables can be run. However, this process can be both cumbersome and statistically inefficient.
A better approach would be to use structural equation modeling techniques such as LISREL or EQS. In these methods, a single model can be specified with as many variables and relationships as desired and all the importance weights can be calculated at once. This can be done for both scaled and binary variables.
By specifying the links between the independent variables, their inherent relationships are acknowledged and thus the problem of multicollinearity is eliminated. But the drawback in this case is that the nature of the relationships needs to be known up front. If this theoretical knowledge is absent, then these methods are not capable of identifying the relationships between the variables.
Non-linearity
All of the methods discussed so far have been traditionally used as linear methods. Linearity implies that each independent variable has a linear (or straight-line) relationship with the dependent variable. But what if the relationship between the independent and dependent variables is non-linear? Research has shown that in many situations, linear models provide reasonable approximations of non-linear relationships and thus tend to be used since they are easier to understand. There are situations, however, where the level of non-linearity or the predictive accuracy required is so high that non-linear models may need to be used.
The simplest extensions to linear models use products (or interactions) of independent variables. When two independent variables are multiplied and the product is used as an independent variable in the model, its relationship with the dependent variable is no longer linear. Similarly, other non-linear effects can be obtained by squaring a variable (multiplying it with itself), cubing it or raising it to higher powers. Such models are referred to as polynomial regression models and they have useful properties. For example, squaring a variable can help model a U-shaped relationship such as the one between a fruit juice’s tartness rating and the overall taste rating. Other variations such as logarithmic (or exponential) transformations can also be used if there is a curved relationship between the dependent and independent variables.
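The effect of adding a squared term can be sketched in a few lines of Python. The example below fits y = a + b·x + c·x² by least squares, using Cramer's rule on the three normal equations. The function names and the tartness and taste ratings are hypothetical, and the made-up data follow an inverted U (taste is best at moderate tartness), which is an assumption about the shape of the curve rather than something taken from a real study.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(x, y):
    """Least-squares fit of y = a + b*x + c*x**2; returns [a, b, c]."""
    # Design columns are 1, x and x squared; the squared column is what
    # lets an otherwise linear model follow a curve.
    design = [[1.0, xi, xi * xi] for xi in x]
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(3)] for i in range(3)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(3)]
    denominator = det3(xtx)
    coefs = []
    for k in range(3):
        # Cramer's rule: replace column k with the right-hand side.
        replaced = [row[:] for row in xtx]
        for r in range(3):
            replaced[r][k] = xty[r]
        coefs.append(det3(replaced) / denominator)
    return coefs


# Hypothetical ratings: tartness on a 1-7 scale and overall taste.
tartness = [1, 2, 3, 4, 5, 6, 7]
taste = [0, 5, 8, 9, 8, 5, 0]

a, b, c = fit_quadratic(tartness, taste)
turning_point = -b / (2 * c)  # tartness level where the fitted curve peaks
print("coefficients:", a, b, c)
print("taste peaks at tartness:", turning_point)
```

A straight line fitted to the same data would be flat, because the rise and the fall cancel out; the squared term is what recovers the relationship.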
The methods described above are not strictly considered to be non-linear methods. In real non-linear models the relationship between the dependent and independent variables is much more complex. It is usually in a product form and linearity cannot be achieved by transforming the variables. Further, the user needs to specify the nature of the non-linear relationship to be modeled. This can be a very important drawback, especially when there are many independent variables. The relationship between the dependent and independent variables can be very complicated, making it extremely hard to specify the type of non-linear model required. A recent development in non-linear models that can help in this regard is the multivariate adaptive regression splines (MARS) approach that can model non-linear relationships automatically with minimal input from the user.
Non-linear models are particularly useful if prediction rather than explanation is the objective. The reason for this is that the coefficients from a non-linear regression are much harder to interpret than those from a linear regression. The more complicated the model, the harder the coefficients can be to interpret. This is not really a problem for prediction because the issue is only whether an observation’s value can be predicted, not so much how the prediction can be accomplished. Hence, if explanation is the objective, it is better to use linear models as much as possible.
Artificial intelligence
The term artificial intelligence covers several topic areas including artificial neural networks, genetic algorithms, fuzzy logic and expert systems. In this article we will discuss artificial neural networks as they have recently emerged as useful tools in the area of marketing research. Although they have been used for many years in other disciplines, marketing research is only now beginning to realize the potential of these tools. Artificial neural networks were originally conceived as tools that could mathematically emulate the decision-making processes of the human brain. Their algorithm is set up in such a way that they “learn” the relationships in the data by looking at one (or a group of) observation(s) at a time.
Neural networks can model arbitrarily complex relationships in the data. This means that the user really doesn’t need to know the precise nature of the relationships in the data. If a network of a reasonable size is used as a starting point, it can learn the relationships on its own. Often, the challenge is to stop the network from learning the data too well as this could lead to a problem known as overfitting. If this happens, then the model would fit the data on which it is trained extremely well, but would fit new (or test) data poorly.
While complex relationships can be modeled with neural networks, obtaining coefficients or importance weights from them is not straightforward. For this reason, neural networks are much more useful for prediction rather than explanation.
There are many types of neural networks, but the most commonly used distinction is between supervised and unsupervised networks. We will look at supervised networks here and at unsupervised networks in the next article. Supervised neural networks are similar to regression/classification type models in that they have dependent and independent variables. Back-propagating networks are probably the most common supervised learning networks. Typically they contain an input layer, output layer and hidden layer.
The input and output layers correspond to the independent and dependent variables in traditional analysis. The hidden layer allows us to model non-linearities. In a back-propagating network the input observations are multiplied by weights that are initially random, and the resulting prediction is compared to the observed output. The error, or difference between the two, is sent back over the network to adjust the weights appropriately. Repeating this process many times gradually reduces the error and moves the weights toward a good solution. A holdout (or test) dataset is used to see how well the network can predict observations it has not seen before.
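The training loop just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the network size (two inputs, three hidden units, one output), the sigmoid activation, the learning rate, the function names and the four hypothetical observations are all choices made for this example. The made-up outcome is 1 only when both attributes are rated high, an interaction that a purely additive model cannot represent directly. For brevity the sketch omits the holdout step.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(n_inputs, n_hidden, seed=0):
    """Random starting weights; position 0 of each list is a bias term."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """Return (hidden activations, network output) for one observation."""
    hidden = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
              for w in w_hidden]
    output = sigmoid(w_out[0] + sum(wi * hi for wi, hi in zip(w_out[1:], hidden)))
    return hidden, output


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, w_hidden, w_out, learning_rate=0.5, epochs=5000):
    """Adjust the weights in place, one observation at a time."""
    for _ in range(epochs):
        for x, target in data:
            hidden, output = forward(x, w_hidden, w_out)
            # Error signal at the output unit.
            delta_out = (output - target) * output * (1 - output)
            # Send the error back to each hidden unit (before w_out changes).
            delta_hidden = [delta_out * w_out[j + 1] * h * (1 - h)
                            for j, h in enumerate(hidden)]
            w_out[0] -= learning_rate * delta_out
            for j, h in enumerate(hidden):
                w_out[j + 1] -= learning_rate * delta_out * h
            for j, delta in enumerate(delta_hidden):
                w_hidden[j][0] -= learning_rate * delta
                for i, xi in enumerate(x):
                    w_hidden[j][i + 1] -= learning_rate * delta * xi


# Hypothetical observations: (attribute 1 high?, attribute 2 high?) -> outcome
observations = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w_hidden, w_out = init_weights(n_inputs=2, n_hidden=3, seed=1)
loss_before = mean_squared_error(observations, w_hidden, w_out)
train(observations, w_hidden, w_out)
loss_after = mean_squared_error(observations, w_hidden, w_out)
print("error before training:", round(loss_before, 3))
print("error after training:", round(loss_after, 3))
```

In practice the same error calculation would also be run on a holdout dataset, and training would be stopped once the holdout error stops improving.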
Recent advances
Several recent advances have been made in key driver methodology. The first of these relates to regression analysis and is called hierarchical Bayes regression. Consider an example where consumers provide attribute and overall ratings for different companies in the marketplace. Different consumers may rate different companies based on their familiarity with the companies. An overall market-level model can be obtained by combining all of the ratings and running a single regression model across everybody. But if one could run a separate model for each consumer and then combine all of that information, the resulting coefficients would be much more accurate than what we get from a regular regression analysis. This is what hierarchical Bayes regression does, and it is hence able to produce more accurate information. Of course, this type of analysis can be used only in situations where respondents provide multiple responses.
For classification problems, there has been a series of recent advances such as stacking, bagging and boosting. In stacking, a variety of different analytical techniques are used to obtain classification information and then the final results are based on the most frequent classification of data points into groups in each of those methods. Bagging is a procedure where the same technique is used on many samples drawn from the same data and the final classifications are made based on the frequencies observed in each sample. Finally, boosting is a method of giving higher weights to observations that are misclassified and repeating the analysis several times. The final classifications are based on a weighted combination of the results from the various iterations.
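Of these three, bagging is the easiest to sketch. The Python example below draws repeated samples (with replacement) from the same data, applies the same technique to each sample, and classifies a new observation by majority vote. The technique used here, a one-variable cut-off rule, is an assumption chosen only to keep the example short; the function names, the number of samples and the ten hypothetical customers are likewise invented for illustration.

```python
import random


def fit_stump(xs, labels):
    """Find the cut-off t that best fits the rule 'category 1 if x > t'."""
    candidates = [min(xs) - 1] + sorted(set(xs))

    def n_correct(t):
        return sum((x > t) == bool(y) for x, y in zip(xs, labels))

    return max(candidates, key=n_correct)


def bagged_predict(xs, labels, new_x, n_samples=25, seed=0):
    """Classify new_x by majority vote over rules fitted to resamples."""
    rng = random.Random(seed)
    n = len(xs)
    votes_for_one = 0
    for _ in range(n_samples):
        # Draw a sample of the same size, with replacement.
        picks = [rng.randrange(n) for _ in range(n)]
        cutoff = fit_stump([xs[i] for i in picks], [labels[i] for i in picks])
        votes_for_one += new_x > cutoff
    return 1 if votes_for_one > n_samples / 2 else 0


# Hypothetical data: satisfaction score (1-10) and customer status (1 = active).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
status = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

for new_score in (1, 10):
    print(new_score, "->", bagged_predict(scores, status, new_score))
```

Using an odd number of samples avoids tied votes. Boosting would differ in that each round reweights the misclassified observations instead of resampling at random.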
Variety of tools
This article has touched upon both traditional methods and recent developments in key driver methodology that may be of interest to marketing research professionals. The particular method to be used often hinges on the primary objective: explanation or prediction. Once this determination is made, a variety of tools can be used, including linear and non-linear methods as well as those that employ multiple dependent variables.
This content was provided by TRC. Visit their website at www.trchome.com.