February 25, 2025
Explore five key truths about sampling, uncovering the fraud, low-quality respondents, and transparency issues that have eroded data quality over two decades.
The online panel landscape has undergone a dramatic transformation over the past two decades. The industry's focus on volume, speed, and cost has led to a decline in high-quality double opt-in panels, resulting in the commoditization of sample and an increase in fraud. These changes, often overlooked by client-side researchers, have significantly compromised research quality and, in turn, eroded trust in the industry.
This article aims to empower researchers with the knowledge to demand transparency and accountability from sample suppliers, ultimately driving higher quality standards within the industry.
In the early 2000s, online panels relied on established offline recruiting methods. These methods included collecting detailed identifying and profiling information to verify the authenticity of participants. While the primary goals were to match respondents with suitable surveys and ensure incentives were delivered to a physical address, these methods also served as a means to validate respondents through third-party databases.
By 2010, most firms had shifted to digital incentives and stopped collecting Personally Identifiable Information (PII), as it reduced the appeal of joining panels. Today, an email address is generally sufficient to join, with few panels conducting authenticity checks.
In the past, some suppliers limited the number of surveys a respondent could complete within a specific timeframe. For instance, large CPG clients often mandated a 3-week gap between surveys in the same category.
Today there are no limits on the number of surveys a participant can complete each day; the average participant makes 22 survey attempts daily. A CASE4Quality study revealed that a small subset (3%) of devices accounted for 19% of all survey completions, with 40% of these devices completing over 100 surveys daily while passing all quality checks. Research shows that frequent survey takers can skew results, with higher survey attempts linked to lower brand awareness, higher brand ratings, and higher purchase intent.
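A basic frequency check of this kind is something researchers can run on their own response-level data. Below is a minimal sketch in Python with pandas; the device_id and completed_at columns and the 20-completes-per-day cutoff are hypothetical choices for illustration, not an industry standard.

```python
import pandas as pd

# A minimal frequency check over a response-level export.
# Assumed (hypothetical) columns: device_id, completed_at.
responses = pd.read_csv("completions.csv")
responses["date"] = pd.to_datetime(responses["completed_at"]).dt.date

# Count completions per device per day.
daily = (
    responses.groupby(["device_id", "date"])
    .size()
    .rename("completes")
    .reset_index()
)

# Flag any device that ever exceeds an illustrative daily cutoff.
THRESHOLD = 20  # arbitrary review threshold, not an industry standard
flagged = daily.loc[daily["completes"] > THRESHOLD, "device_id"].unique()

# How much of the total volume do the flagged devices represent?
per_device = daily.groupby("device_id")["completes"].sum()
share = per_device.loc[flagged].sum() / per_device.sum()
print(f"{len(flagged)} devices flagged, {share:.1%} of all completions")
```

Even a crude cutoff like this surfaces the concentration problem the CASE4Quality study describes: a handful of devices can account for a disproportionate share of completes.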
In the early 2000s, panel recruitment heavily relied on partnerships with established brands and loyalty programs. Companies like e-Rewards primarily built panels by inviting members through emails, newsletters, and co-branded offerings. This resulted in largely exclusive sample pools for each supplier.
Today's sample landscape is more complex, with suppliers using diverse recruitment strategies like affiliate networks, mobile platforms, and programmatic algorithms. Most suppliers (even those with proprietary sources) have shifted to an aggregation model, sourcing from multiple providers to meet quotas, budgets, and timelines. As a result, it is challenging for researchers to trust the origin of their sample without rigorous vetting and strong supplier relationships.
Human-assisted fraud, including large-scale operations like click farms and smaller efforts by individuals, is a growing challenge in data quality. Unlike the primarily automated "bot" methods prevalent a decade ago, modern fraudsters combine human input with technologies such as browser extensions and form-fill tools. Furthermore, disengaged respondents, while not intentionally malicious like fraudsters, contribute to data quality issues.
Several studies have demonstrated that bad actors can blend seamlessly into a dataset. Their familiarity with common quality checks and their ability to exploit system vulnerabilities make them difficult to detect, and advancements in AI are making detection harder still.
The online sampling ecosystem suffers from a lack of transparency. Suppliers often make inflated claims about panel size and quality in their marketing materials, making it difficult for buyers to make informed decisions.
Suppliers often exaggerate the size of their panels, creating the misconception that they have large, highly engaged respondent pools. While panels may advertise millions of members, only 5-10% are typically active, and the actual pool available for recruitment can be much smaller. Furthermore, when suppliers claim access to millions of respondents, that figure often includes sample aggregated from multiple sources, not just their proprietary panel.
The quality and consistency of panelist profiling information can be significantly lower than expected. While it's often assumed that panel providers maintain comprehensive and accurate profiles, several factors limit the availability and accuracy of this information, including incomplete profiling surveys, high panelist turnover, and changing demographics. For example, many advertised data points have low opt-in rates, with only around 1% of panelists providing that specific information.
Claims about data quality are frequently unclear and lack sufficient evidence to support them. When sample companies conduct research-on-research and publish white papers comparing sample sources, the results often favor their own offerings and provide limited details on methodology. Buyers should view these claims with skepticism and seek independent research for a more impartial evaluation.
Quality pledges often fail to drive meaningful improvements because they lack enforcement. While these formal commitments to uphold data quality standards and ethical practices are generally seen as positive, the absence of accountability often turns them into marketing tools rather than genuine efforts to improve quality.
Over the past two decades, online sampling methodologies have changed significantly, raising concerns about the erosion of ethical standards in the pursuit of profitability. Data quality issues have reached a critical level, compromising the integrity of our research. Only through a commitment to transparency can we effectively address these deeply ingrained problems and restore confidence in our work.
Client-side researchers are increasingly aware of the critical issues plaguing online research and sampling. Passive reliance on suppliers to address these challenges is no longer acceptable. Now is the time for researchers to actively engage and drive meaningful change:
1. Unite and Speak with One Voice: Brands can unite to champion industry-wide and global initiatives that foster transparency and accountability.
2. Proactively Ask Sample Suppliers to Provide Quality Metrics: For each study, brands can demand transparency on key metrics to help them assess the quality of the sample, including sources, the amount and type of targeted sample, fraud rejection rates, and reasons for terminations. They should also request that any fraud-system information be appended to their dataset.
3. Track Sample Supplier Performance Across All Studies Over Time: By consistently monitoring these metrics and incorporating in-house data-cleaning statistics, researchers can systematically evaluate each supplier and see clearly whether sample quality is improving or declining (see the sketch after this list).
4. Help Build Industry Benchmarks: Brands can contribute their data to collaborative initiatives like the Global Data Quality (GDQ) Initiative, or work together to publish regular industry-wide reports on fraud levels. This collective effort will provide valuable insight into the evolving landscape of fraud within the industry, based on real-world data.
5. Ask for Evidence-Based Research from the Industry and Suppliers: Don't settle for vague promises; insist on clear, data-driven evidence of sample quality and a strong, transparent methodology.
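To make points 2 and 3 concrete, here is a minimal sketch of what a per-study supplier log and trend check could look like, in Python with pandas. The column names, figures, and rate definitions are illustrative assumptions, not a standard schema or an endorsed method.

```python
import pandas as pd

# Hypothetical per-study log combining supplier-reported metrics
# (fraud rejections) with in-house data-cleaning removals.
studies = pd.DataFrame([
    {"supplier": "A", "fielded": "2024-03", "completes": 1200,
     "fraud_rejects": 96, "inhouse_removals": 54},
    {"supplier": "B", "fielded": "2024-06", "completes": 800,
     "fraud_rejects": 24, "inhouse_removals": 16},
    {"supplier": "A", "fielded": "2024-09", "completes": 1500,
     "fraud_rejects": 180, "inhouse_removals": 90},
])

# Fraud rejections as a share of attempts that reached the fraud check;
# in-house removals as a share of accepted completes.
studies["fraud_reject_rate"] = (
    studies["fraud_rejects"]
    / (studies["completes"] + studies["fraud_rejects"])
)
studies["cleaning_rate"] = studies["inhouse_removals"] / studies["completes"]

# Earliest vs. latest study per supplier: improving or declining?
trend = (
    studies.sort_values("fielded")
    .groupby("supplier")[["fraud_reject_rate", "cleaning_rate"]]
    .agg(["first", "last"])
)
print(trend.round(3))
```

However a team defines its rates, the point is consistency: the same log, kept across every study, is what makes supplier comparisons possible and can feed the industry benchmarks described in point 4.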
By addressing the core issues, we can drive meaningful change. Many client-side researchers are already uniting through CASE4Quality to make their voices heard. Together, we can build a research ecosystem grounded in accountability and transparency. Visit www.CASE4Quality.com to learn more and join the movement.
*Special thanks to CASE4Quality members Efrain Ribeiro, Mary Beth Weber, Tia Maurer, and Carrie Campbell.