Integrating Data into a Holistic View
Market researchers and their clients face several challenges when incorporating data from multiple sources in order to craft a usable set of action plans. Yet without first creating a plan on how best to harvest the findings from multiple data sources, action plans for the organization may end up suffering from analysis paralysis. This puts researchers in the uncomfortable position of having to justify even more research when prior research initiatives sit on the shelf without being implemented. This article will outline strategies for taking full advantage of market research initiatives from multiple sources in order to create comprehensive action plans that will have real results in real market scenarios.
Overview of Issue
As market researchers, we work in various industries and have clients in various market segments. These different industries and segments naturally have different needs; researchers need to be able to respond to these needs effectively in order to maintain and grow our businesses. The one thing we all have in common though is that we all have customers! They may be internal stakeholders in our organizations or they may be external clients who have their own internal customers. Either way, we are regularly faced with the challenge of presenting market research findings to our clients in ways that enable them to reach their business objectives.
Often, this involves integrating results from several sources into a unified package of action plans. Determining what is important among these disparate findings is critical to planning how best to integrate the data. Many pertinent findings may be evident in prior projects, but not all are necessary to the main objective, as the image below illustrates.
This actual sign, located in southern Florida, does an effective job of warning passers-by that caution is needed, that the sign has sharp edges, and that they should avoid touching the edges of the sign. These are worthwhile messages indeed; however, they pale in importance to the footnote on the sign that alerts passing motorists and walkers that the bridge is out ahead! Clearly, the message with the greatest impact is lost in the fine print. As researchers, we also encounter situations where we need to pick through the various findings available to us in order to find the most pertinent information.
This becomes particularly difficult when trying to incorporate findings from several sources. Without a cohesive and sensible plan to guide the way, we may find ourselves on the road looking out for signs with sharp edges, only to fall off the bridge.
While we create this plan, we also need to avoid the analysis paralysis “pit of despair.” Organizations still struggle with these issues:
- Harvesting and integrating multiple sources of data, including various forms of unstructured data
- Linking the Voice of the Customer (VOC) to other key business data and metrics
- Using VOC to drive action and improvement
The best analysis plans aim to link VOC input to other feedback mechanisms as part of the progress towards integration. These feedback mechanisms are many and varied.
In addition to linking to other feedback mechanisms, organizations also need to link VOC input to internal data systems such as operational performance metrics, financial data, apps like PeopleSoft, and Customer Relationship Management (CRM) data. No wonder we feel overwhelmed!
Conceptual and Practical “How To”
Finding the true North in this challenging forest of information requires defining which business objectives the research is to support. Priorities must be set from among various choices, such as sales growth, cost savings, or risk reduction. Consideration of the alternative business solutions available must also be incorporated into the action plan. The action implementation plan is not likely to be adopted if the proposed solution is not realistic in the current business environment.
Organizations tend to do fairly well with these first two tasks and then stumble when looking at the challenge of incorporating data from multiple sources. It is easy to understand why when we realize that each different data source may have any or all of the following challenges associated with it:
- How complex is the data? Is it structured or unstructured?
- How complete is the data?
- Where is the data located?
- What security, quality, and governance issues are associated with the data?
- How disparate is the data? That is, is the data kept in more than one format and are these formats inconsistent with each other?
- How does the data illustrate the organization’s performance on the key business metrics being explored?
Making sense of the different sources of data requires taking an inventory of the available sources of information. More often than not, organizations have most of their information in unstructured formats and less in structured formats. The structured information tends to be heavily oriented towards transaction-based information that focuses on what happened with that customer at a specific point in time.
This information then needs to be sorted into categories of information:
- Financial data about the organization
- Specific data about customers
- Specific data about employees (as to how they interact with customers or have perceptions about customer attitudes)
- Information about operational issues (operational excellence)
Many different sources of client data exist. The diagram below illustrates how one organization parsed its data. It formed four broad categories and then drilled down into the details from there, creating a data paradigm suited to its own needs.
The integration process begins with understanding the data, gleaning key findings and patterns from it, and formatting it into a usable structure. First, look for commonalities among the data sets, being as granular as possible. Select a unit of analysis such as company, location sites, individuals, households, customer groups, product groups, or others, just to name a few options. The units of analysis chosen will vary by business objective. For example, if your objective is to increase cross-sales, looking at aggregated purchases for the year will not be as helpful as looking at each purchase event (what was purchased and when).
When you integrate data, you may have holes among certain data elements where data may not have been captured consistently. You will need to determine what level of accuracy will be acceptable to your audience and how best to handle missing data from a statistical point of view. Be prepared to spend some time making the data sets ready for integration!
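One way to size up those holes before deciding how to handle them is to measure how complete each field is. The sketch below is only an illustration in plain Python: the records, the field names, and the 80% threshold are all assumptions, and the acceptable level is something to agree on with your audience.

```python
# Hypothetical merged records; None marks a value that was never captured.
records = [
    {"customer_id": "C001", "segment": "Retail", "nps_score": 9, "annual_spend": 1200.0},
    {"customer_id": "C002", "segment": None, "nps_score": 6, "annual_spend": None},
    {"customer_id": "C003", "segment": "Wholesale", "nps_score": None, "annual_spend": 560.0},
    {"customer_id": "C004", "segment": "Retail", "nps_score": 10, "annual_spend": 980.0},
]


def completeness(rows):
    """Return the share of rows with a non-missing value for each field."""
    fields = sorted({field for row in rows for field in row})
    return {
        field: sum(row.get(field) is not None for row in rows) / len(rows)
        for field in fields
    }


def fields_below(rows, threshold):
    """List the fields whose completeness falls below the agreed threshold."""
    return [field for field, share in completeness(rows).items() if share < threshold]


report = completeness(records)
flagged = fields_below(records, threshold=0.8)  # 0.8 is an assumed cut-off

for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
print("Needs attention:", flagged)
```

Fields that fall below the threshold are candidates for imputation, for exclusion, or for a caveat in the final report. Which of these is appropriate remains a statistical judgment.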
Each source of closed-ended data has specific identifiable variables. Catalogue the various types of data available from each information source and locate commonalities. The key to linking these sources is a method of identifying each individual customer in each one. For example, in primary research, financial information, customer service inquiries, customer profiles and demographics, information from call centers, and other sources of information, you may find unique customer IDs in each data source. Other sources can often be linked by unique client company IDs. Locating these linking variables is critical to merging the data from multiple sources into one analysis file. Be aware that data from secondary research or company culture information will not have specific customer information. These data sources are most useful in providing guidance for business-related decisions.
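The cataloguing step can be sketched as a simple comparison of variable lists. In this illustration the source names and variables are hypothetical; the point is only to show how the overlap between sources reveals candidate linking variables.

```python
from itertools import combinations

# Hypothetical catalogue: each source mapped to the variables it contains.
catalogue = {
    "survey": {"customer_id", "segment", "nps_score"},
    "finance": {"customer_id", "company_id", "annual_spend"},
    "call_center": {"customer_id", "call_reason", "call_date"},
    "secondary_research": {"segment", "market_size"},
}


def shared_variables(sources):
    """Map each pair of sources to the variables they have in common."""
    return {
        (a, b): sources[a] & sources[b]
        for a, b in combinations(sorted(sources), 2)
    }


def sources_with(sources, key):
    """List the sources that contain a candidate linking variable."""
    return sorted(name for name, columns in sources.items() if key in columns)


links = shared_variables(catalogue)
linkable = sources_with(catalogue, "customer_id")

for pair, common in links.items():
    print(pair, "->", sorted(common) or "no common variables")
print("Sources that can be joined on customer_id:", linkable)
```

In this made-up catalogue the secondary research shares only the segment variable with the survey, which mirrors the caveat above: it can inform decisions at the segment level but cannot be tied to individual customers.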
After you clean and format the data sets separately, you can use a statistical software package with merging criteria to combine the data sets. Any analyses you conduct will likely need to be iterative in order to discover and analyze patterns and trends in the data. At this point you can proceed on to integrating the data, mining it, and setting up your hypotheses, testing them, and analyzing the information.
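A statistical package will normally handle the merge for you, but the underlying idea is easy to show. The following is a minimal plain-Python sketch, not any particular package's method; the function name `merge_on_key`, the sample rows, and the choice of `customer_id` as the merging criterion are assumptions for illustration.

```python
def merge_on_key(sources, key="customer_id"):
    """Outer-join several lists of row dicts on a shared key.

    Column names are prefixed with the source name so they cannot collide.
    """
    merged = {}
    for source_name, rows in sources.items():
        for row in rows:
            record = merged.setdefault(row[key], {key: row[key]})
            for column, value in row.items():
                if column != key:
                    record[f"{source_name}.{column}"] = value
    return merged


# Hypothetical cleaned data sets, one list of rows per source.
survey = [
    {"customer_id": "C001", "nps_score": 9},
    {"customer_id": "C002", "nps_score": 4},
]
finance = [
    {"customer_id": "C001", "annual_spend": 1200.0},
    {"customer_id": "C003", "annual_spend": 560.0},
]

analysis_file = merge_on_key({"survey": survey, "finance": finance})

for customer_id in sorted(analysis_file):
    print(analysis_file[customer_id])
```

Because this is an outer join, customers who appear in only one source are kept with their other columns absent. That makes the gaps visible rather than silently dropping records.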
The challenge is finding the best way for you to identify your customers. Does each customer have a unique Customer Identification Number? This can be alpha characters, numeric, or a combination of the two. Can you tie employee information to specific customers or accounts? If several identifiers are available, does your company have a master set of instructions on how they link to each other? Does each source of data have at least one of these identifiers?
If some sources of data do not have the primary sets of identifiers, what other options are available for linking the sources? For example, how consistently is customer name captured across all sources (e.g., is “Robert” the same as “Bob”)? Is customer segment available in each source of information? Is aggregated customer information by segment granular enough to address the questions you need to answer? Is customer persona available? If not, is it available in enough sources for you to get the information you need?
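When customer name is the only available link, some normalization is usually needed before names can be compared. The sketch below shows one simple approach under stated assumptions: the nickname table is hypothetical and far from complete, and only the first name is mapped. Name matching of this kind can produce false matches, so treat the results as candidates to review rather than confirmed links.

```python
# Hypothetical nickname table; a real project would maintain a much larger one.
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william", "liz": "elizabeth"}


def normalize_name(full_name):
    """Lower-case a name, drop punctuation, and map a known nickname
    in the first-name position to its canonical form."""
    cleaned = "".join(ch for ch in full_name.lower() if ch.isalpha() or ch.isspace())
    parts = cleaned.split()
    if not parts:
        return ""
    parts[0] = NICKNAMES.get(parts[0], parts[0])
    return " ".join(parts)


def same_customer(name_a, name_b):
    """Treat two names as a candidate match if their normalized forms agree."""
    return normalize_name(name_a) == normalize_name(name_b)


print(same_customer("Bob Smith", "Robert Smith"))
print(same_customer("Bob Smith", "Bill Smith"))
```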
It is important to remember that companies do not need to invest in a completely new Customer Relationship Management (CRM) system in order to effectively integrate multiple sources of information in the ways we are describing. The primary need is simply a thorough understanding of the information available in each of the diverse sources you do have.
Getting the Answers You Need
Once you have your data gathered and integrated, you are ready to start the process of data mining. These are some of the most commonly used techniques in data mining:
- Artificial neural networks – These are non-linear predictive models that learn through training and resemble biological neural networks in structure.
- Decision trees – These are tree-shaped structures that represent sets of decisions, which generate rules for the classification of a dataset. Examples include Classification and Regression Trees (CART) and Chi-square Automatic Interaction Detection (CHAID).
- Genetic algorithms – These optimization techniques use processes such as genetic combination, mutation, and natural selection in a design based on the concept of evolution.
- Nearest neighbor method – This technique classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical data set (where k ≥ 1). This technique is sometimes called the k-nearest neighbor technique.
- Rule induction – This involves the extraction of useful if-then rules from data based on statistical significance.
- Complexity science – This technique draws on the mathematics of chaos theory and is used to ferret out relationships between seemingly unrelated variables.
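To make one of these techniques concrete, here is a minimal sketch of the nearest neighbor method in plain Python. The historical records, the two features, and the use of Euclidean distance with a majority vote are assumptions chosen for illustration; a production analysis would typically standardize the features first and rely on an established analytics package.

```python
import math
from collections import Counter


def knn_classify(history, query, k=3):
    """Classify `query` by majority vote among its k nearest historical records.

    `history` is a list of (features, label) pairs; distance is Euclidean.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    nearest = sorted(history, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical history: (orders per year, average satisfaction rating) -> group.
history = [
    ((12, 9.0), "Promoter"),
    ((10, 8.5), "Promoter"),
    ((11, 9.5), "Promoter"),
    ((3, 4.0), "Detractor"),
    ((2, 3.5), "Detractor"),
    ((4, 5.0), "Detractor"),
]

prediction = knn_classify(history, query=(9, 8.0), k=3)
print(prediction)
```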
Let’s examine a simple example involving Net Promoter analysis. Net Promoter results by segment provide an initial look at which segments are most likely to recommend your company to their colleagues. Let’s assume you are using a predictive analytics tool that looks across many of your data sources by segment.
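For readers who want to see the arithmetic, the Net Promoter Score is the percentage of Promoters (ratings of 9 or 10 on the 0 to 10 recommendation question) minus the percentage of Detractors (ratings of 0 to 6). The sketch below computes it by segment; the segment names and responses are made up for illustration.

```python
from collections import defaultdict


def nps_group(score):
    """Assign a 0-10 likelihood-to-recommend rating to its Net Promoter group."""
    if score >= 9:
        return "Promoter"
    if score >= 7:
        return "Passive"
    return "Detractor"


def net_promoter_by_segment(responses):
    """Return the Net Promoter Score (% Promoters minus % Detractors) per segment."""
    counts = defaultdict(lambda: {"Promoter": 0, "Passive": 0, "Detractor": 0})
    for segment, score in responses:
        counts[segment][nps_group(score)] += 1
    return {
        segment: 100.0 * (c["Promoter"] - c["Detractor"]) / sum(c.values())
        for segment, c in counts.items()
    }


# Hypothetical survey responses: (segment, rating).
responses = [
    ("Retail", 10), ("Retail", 9), ("Retail", 7), ("Retail", 3),
    ("Wholesale", 6), ("Wholesale", 5), ("Wholesale", 9), ("Wholesale", 8),
]

scores = net_promoter_by_segment(responses)
for segment, score in scores.items():
    print(f"{segment}: {score:+.0f}")
```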
How can you translate these results into a holistic view? Drivers of willingness to recommend your company may vary, not only by segment but also by whether they are a Promoter, Passive, or a Detractor. In order to get a more holistic view of your customers and how you can best meet their needs, examine these key drivers by overlaying them on the various sources of data.
Positive performance can increase willingness to recommend, while negative performance has the opposite effect. Determining which drivers have the greatest impact, whether positive, negative, or both, will help guide you in best serving your customers.
Linking information from different sources can also assist in identifying areas of concern. For example, if Detractors experience slower deliveries (from primary research information) as their orders grow larger (from financial information), this may explain their unwillingness to recommend the company. Correcting these issues can then be demonstrated to have a quantifiable impact on the company’s bottom line.
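A relationship like this can be checked once the two sources are linked by customer ID. The sketch below is an illustration under stated assumptions: the survey and finance values are invented, and a Pearson correlation is used as one simple measure of association. A strong correlation flags an area to investigate; it does not by itself establish the cause.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical primary research: customer_id -> (NPS group, reported delivery days).
survey = {
    "C001": ("Detractor", 3), "C002": ("Detractor", 4), "C003": ("Detractor", 6),
    "C004": ("Detractor", 7), "C005": ("Detractor", 11), "C006": ("Promoter", 2),
    "C007": ("Detractor", 5),
}
# Hypothetical financial data: customer_id -> average order value.
finance = {
    "C001": 100.0, "C002": 250.0, "C003": 400.0,
    "C004": 600.0, "C005": 900.0, "C006": 850.0,
}

# Keep only Detractors that can be linked to a financial record.
detractor_ids = [
    cid for cid, (group, _) in survey.items()
    if group == "Detractor" and cid in finance
]
order_values = [finance[cid] for cid in detractor_ids]
delivery_days = [survey[cid][1] for cid in detractor_ids]

r = pearson(order_values, delivery_days)
print(f"Correlation between order value and delivery days: {r:.2f}")
```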
Linked information can also be tracked over time to illustrate improvements in both internal and external measures.
Discussion of Four Case Studies
Now we’d like to look beneath the surface and share some actual case studies with you to provide additional illustrations of how integrating data from multiple sources can help to achieve primary business objectives such as growing market share, improving performance, eliminating barriers to growing market share, and reducing churn. We will present one case study for each of these business objectives.