Making Magic Happen with Markov Analytics

First, let me clarify: I’m not talking about MCMC (Markov chain Monte Carlo), which most of us use. I’m talking about Markov matrices. You will be amazed by the predictive power embedded in a switching matrix like this one.

From this matrix you can estimate:
1. Each brand’s market share
2. Each brand’s cumulative penetration

In particular, you might be shocked (as I was) to discover you can estimate each brand’s penetration without knowing a brand’s market share.

Here is how it all works.

What is a Markov switching matrix?

A switching matrix is basically a cross-tab of what all consumers bought on their last two purchases (all brands down the side, and all brands across the top). Once the counts are converted to proportions of each brand’s previous buyers, the diagonal represents the repeat rate of each respective brand, and the off-diagonal terms are the switching probabilities of going from one brand to any other.
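If you want to try this yourself, here is a minimal Python sketch of building a switching matrix from raw purchase pairs. The brand names and the `pairs` data are made up for illustration, and the layout (previous purchase down the side, latest purchase across the top) is my assumption about how the cross-tab is arranged.

```python
from collections import Counter

# Hypothetical (previous purchase, latest purchase) pairs, one per consumer.
pairs = [
    ("A", "A"), ("A", "A"), ("A", "A"), ("A", "B"),
    ("B", "B"), ("B", "B"), ("B", "A"), ("B", "C"),
    ("C", "C"), ("C", "A"),
]
brands = sorted({brand for pair in pairs for brand in pair})
counts = Counter(pairs)  # the raw cross-tab

def switching_matrix(counts, brands):
    """Row-normalize the cross-tab: entry [i][j] = P(buy j next | bought i last)."""
    matrix = []
    for prev in brands:
        row_total = sum(counts[(prev, nxt)] for nxt in brands)
        matrix.append([counts[(prev, nxt)] / row_total for nxt in brands])
    return matrix

P = switching_matrix(counts, brands)

# The diagonal holds each brand's repeat rate.
repeat_rates = {brand: P[i][i] for i, brand in enumerate(brands)}
print(repeat_rates)
```

Each row of `P` sums to 1 because every previous buyer of a brand has to buy something next. This sketch assumes every brand has at least one previous buyer; a brand with none would need special handling.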

Estimating market shares

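Strictly speaking, you are estimating shares at steady state, but most brands in well-established categories are already close to it.

Here’s how. Think of the switching matrix, M, as something that transforms a vector of market shares from time t to time t+1 via the equation M*v(t) = v(t+1). (For the equation to work in this form, each column of M must hold the probabilities for one previously bought brand and sum to 1. If your cross-tab is laid out the other way, transpose it first.)

All switching matrices are square, so they have an eigenvector/eigenvalue structure, which is the key.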
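To make the equation concrete, here is a small Python sketch of a single step. The 3-brand matrix and the starting shares are hypothetical, and I am assuming the column convention: `M[i][j]` is the probability of buying brand i next given that brand j was bought last, so each column sums to 1.

```python
# Hypothetical 3-brand switching matrix, column convention:
# M[i][j] = P(buy brand i next | bought brand j last); each column sums to 1.
M = [
    [0.75, 0.25, 0.50],
    [0.25, 0.50, 0.00],
    [0.00, 0.25, 0.50],
]

def step(M, v):
    """Return M*v: the share vector one purchase cycle later."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

v_t = [0.5, 0.3, 0.2]   # hypothetical shares at time t
v_next = step(M, v_t)   # shares at time t+1
print(v_next)
```

Because every column of `M` sums to 1, the new shares still add up to 100%.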

What are eigenvalues and eigenvectors?

An eigenvector is a special kind of vector that solves the equation M*v1 = λ1*v1 (1), where λ1 is the eigenvalue associated with that particular eigenvector v1. One pair stands out: a switching matrix always has an eigenvector whose eigenvalue is 1.

Plugging λ = 1 into equation (1), we get M*v = v (2).
In words, when you find this magic vector, you wind up with the same shares you started with: v(t) = v(t+1) = v(t+2), etc., which is the definition of steady state. It can also be proven that the steady-state shares are independent of current market shares (as long as switching paths connect every brand to every other brand). That is a powerful statement and can be used to spot brands that are likely to trend up or down from their current share.
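Here is a hedged sketch of finding that steady-state vector in Python. Rather than a full eigendecomposition, it uses power iteration (applying M over and over until the shares stop moving), which lands on the eigenvector whose eigenvalue is 1. The matrix is hypothetical and uses the column convention (each column sums to 1); this is an illustration, not the code behind the frozen pizza results.

```python
# Hypothetical column-stochastic switching matrix:
# M[i][j] = P(buy brand i next | bought brand j last).
M = [
    [0.75, 0.25, 0.50],
    [0.25, 0.50, 0.00],
    [0.00, 0.25, 0.50],
]

def steady_state(M, v, tol=1e-12, max_iter=10_000):
    """Repeatedly apply M until the shares stop changing (power iteration)."""
    for _ in range(max_iter):
        nxt = [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    raise RuntimeError("did not converge; the chain may be periodic or reducible")

# Two very different starting points end up at the same shares.
from_leader = steady_state(M, [1.0, 0.0, 0.0])
from_even = steady_state(M, [1 / 3, 1 / 3, 1 / 3])
print(from_leader)
print(from_even)
```

For this made-up matrix, solving M*v = v by hand gives shares of 4/7, 2/7 and 1/7, and both starting points converge to them. Comparing such steady-state shares with current shares is one way to see which brands the switching pattern is pushing up or down.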

The table below shows the comparison of Numerator data vs. “eigen-predicted” market shares for brands of frozen pizza (actuals from Numerator receipt scanning data).

Table 1: predicted vs. actual shares (10 months of data from 2020-21)

Simulating brand penetration without measuring brand market shares

Most readers are familiar with the principle that brand penetration and market share are strongly correlated, but how can penetration be estimated without knowing which brands are big and which are small? Actually, it IS possible to predict penetration for each brand with high accuracy just by knowing the Markov switching matrix. From the Markov switching matrix, one can construct something called “the Fundamental Matrix”. This is a matrix of waiting times, built as follows:

• Create a new matrix by removing the row and column of the switching matrix that contain the brand of interest (conventionally called the Q sub-matrix).
• Compute the Fundamental Matrix as the inverse of (I-Q), where I is the identity matrix.
• Sum the Fundamental Matrix entries for each starting brand. This gives the waiting time for the average consumer to leave the competitive set and buy the brand of interest.
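The steps above can be sketched in plain Python as follows. The matrix is hypothetical, the function names are mine, and I am assuming the row convention here: `P[i][j]` is the probability of buying brand j next given that brand i was bought last. Under that convention, the row sums of the Fundamental Matrix are the expected number of purchase occasions until the brand of interest is first bought.

```python
def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def waiting_times(P, focal):
    """Expected purchase occasions until the focal brand is first bought,
    starting from each competing brand (row sums of the Fundamental Matrix)."""
    others = [i for i in range(len(P)) if i != focal]
    Q = [[P[i][j] for j in others] for i in others]   # drop focal row and column
    size = len(Q)
    i_minus_q = [[float(r == c) - Q[r][c] for c in range(size)] for r in range(size)]
    N = invert(i_minus_q)                             # the Fundamental Matrix
    return {brand: sum(row) for brand, row in zip(others, N)}

# Hypothetical row-stochastic matrix: P[i][j] = P(buy j next | bought i last).
P = [
    [0.75, 0.25, 0.00],
    [0.25, 0.50, 0.25],
    [0.50, 0.00, 0.50],
]
times_to_brand_0 = waiting_times(P, focal=0)
print(times_to_brand_0)
```

In this toy example, buyers of brand 2 switch to brand 0 half the time, so they take 2 purchase occasions on average to get there, while buyers of brand 1 take 3. Notice that no market shares were needed, only the switching matrix.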

Here is how well this worked.

Table 2: Predicted vs. actual penetration

Frozen pizza brand   Predicted 10-month penetration   Actual 10-month penetration
1                    41.5%                            37.9%
2                    45.1%                            39.7%
3                    28.4%                            29.9%
4                    15.2%                            14.5%
5                    12.9%                            13.5%
6                    13.5%                            14.5%
7                     7.8%                             8.5%
8                     5.5%                             6.9%
9                    10.7%                            11.4%
10                    7.6%                             7.7%
11                    8.5%                            10.6%
Source for actual data: Numerator receipt scanning

I first saw this trick used for calculating R naught (R0) for Covid. There are infected classes and non-infected classes, and this linear algebra method was used to estimate how long someone stays in the set of infected classes. (I then found a marketing paper from 1962 by Dr. Ben Lipstein, a genius I had the pleasure of knowing, that did the same thing!)

In marketing analytics, the waiting time of interest is how many purchase cycles it takes, on average, for the set of competitive brands to “send their customers” to your brand. Then, if we know how long the average category purchase cycle is, we can calculate the half-life of waiting times, which can be converted into cumulative penetration.
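I have not spelled out the conversion formula here, so treat the following Python sketch as one simple way to do it, not as the exact calculation behind Table 2. It ASSUMES that a non-buyer has the same chance of first buying the brand in every purchase cycle, equal to 1 divided by the mean waiting time (a geometric waiting time). The function names and input numbers are hypothetical.

```python
import math

def cumulative_penetration(mean_wait_cycles, cycle_length, window_length):
    """Share of current non-buyers expected to buy the brand at least once
    within the window, ASSUMING a constant per-cycle chance of 1 / mean wait."""
    p = 1.0 / mean_wait_cycles                # per-cycle chance of a first purchase
    n_cycles = window_length / cycle_length   # purchase cycles in the window
    return 1.0 - (1.0 - p) ** n_cycles

def half_life_cycles(mean_wait_cycles):
    """Cycles until half of current non-buyers have bought (same assumption)."""
    if mean_wait_cycles <= 1:
        raise ValueError("mean wait must exceed one cycle")
    return math.log(0.5) / math.log(1.0 - 1.0 / mean_wait_cycles)

# Hypothetical inputs: 20-cycle mean wait, 1-month purchase cycle, 10-month window.
penetration = cumulative_penetration(20, cycle_length=1.0, window_length=10.0)
half_life = half_life_cycles(20)
print(round(penetration, 3), round(half_life, 1))
```

With these made-up inputs, roughly 40% of non-buyers would be expected to buy the brand within 10 months, and the half-life is between 13 and 14 cycles. A shorter waiting time means a shorter half-life and higher cumulative penetration over the same window.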

Why do repeat and transition probabilities lead to accurate penetration estimates? Think of an arcade game in which balls bounce around inside a box, driven by air flowing up from the bottom, with a hole at the top of the box. The balls will bounce around inside the box, but eventually a ball will randomly bounce out. If the hole is larger, that will happen sooner.

When repeat rates are high (i.e., lots of brand loyalty), it’s like the hole is small. For small-share brands, the hole is small; for large-share brands, the hole is large. It can be proven mathematically why this MUST be, but that is a bit more than I can share in a blog.

By the way, continuing the metaphor, I think you can consider the force of airflow as marketing activity. The higher your advertising and promotion budget, the more forceful the airflow.

Just as movies and books have plots and themes, the plot here is prediction via Markov-based linear algebra, but the theme is the importance of repeat rates, which are the main controller of a brand’s market share and its penetration. Forget what Byron Sharp and Les Binet tell you about penetration and broad reach marketing. It’s all about the repeat rate.

All trackers should contain questions to get at the switching matrix, using either direct questioning or a constant sum question. You should even estimate this for each need state or ethnic group, because all brands are small brands in some context, and that is your route to unlocking growth.

Light bulbs turn on when you think like a Markovian!

Disclaimer

The views, opinions, data, and methodologies expressed above are those of the contributor(s) and do not necessarily reflect or represent the official policies, positions, or beliefs of Greenbook.
