Worried about survey fraud? Rep Data’s Steven Snell shares 2025 trends, what’s changed, and practical steps researchers can take to stay ahead.
Survey fraud is having a moment—and not the good kind. Karen Lynch sits down with Steven Snell, PhD, Head of Research at Rep Data, to unpack findings from the forthcoming State of Fraud 2025 initiative, which analyzed 4.1B+ survey attempts.
Steven explains how Research Defender detects evolving tactics (from hyperactivity spikes to location spoofing and batch fraud), why inattentiveness isn’t the same as fraud, and what varies across B2C vs. B2B and by region.
Most importantly, he shares a pragmatic three-part playbook for researchers: better design, always-on fraud prevention, and principled data cleaning. If you care about data quality, respondent trust, and keeping your insights credible, this conversation will help you stay one step ahead—and a lot smarter than the fraudsters.
You can reach out to Steven Snell on LinkedIn.
Many thanks to Steven Snell for being our guest. Thanks also to our production team and our editor at Big Bad Audio.
[00:00:10] Karen: Hi everyone. Welcome to another episode of the Greenbook Podcast. I’m Karen Lynch hosting this episode. Super excited to be here, but I want to give a little disclaimer before we begin about what we’re talking about today, and then introduce you to our guest. You know, if you haven’t been paying attention, survey fraud is one of the most persistent threats to our industry, to the credibility of research right now. It’s been a stressful several months, I would say, since some scandals have come out, and I’ve been talking to the team at Rep Data, really since May, about some of the work they’re doing in this space. They have now published their “State of Fraud 2025” report—or maybe it hasn’t been published yet, but I’ve seen a pre-copy—so we can talk about that. More than 4 billion survey attempts worldwide were scanned for this initiative. And Steven Snell, who’s the Head of Research at Rep Data, is here to join us to talk a little bit about what they uncovered in this report. We’ll talk about the methodology, we’ll talk about the implications, but really, the big caveat is, what does the industry do next? So, first of all, Steven, welcome to the show.
[00:01:15] Steven: Thanks so much, Karen. It’s great to be here to talk about such an icky topic, right [laugh]?
[00:01:21] Karen: Icky top—just, it’s—yes, it’s one of those things where it could make us really crazy if we think too much about it because it’s infuriating, right? It’s infuriating that fraud exists. So, before we get too into that, though, why don’t you—you know, I briefly just said, you know, you’re Head of Research at Rep Data, but why don’t you just take a minute and kind of level set for everybody, your role, kind of what Rep Data is known for, your role there, and then we’ll get into some other stuff.
[00:01:45] Steven: Sure. So, Rep Data specializes in quantitative data collection. We own Research Defender, which is a fraud prevention platform. We can talk more about what that looks like. But I actually joined Rep Data because I wanted to work on Research Defender. I was really excited about the prospect of getting my hands dirty, so to speak, in terms of finding fraud and staying on top of what fraud looks like at any given time. It’s evolved a ton. Prior to joining Rep, I was at Goldman Sachs for a few years. I ran a global survey research consultancy at the firm. I still say at the firm; it’s a little bit of my trauma from being [laugh] in a bank, [unintelligible 00:02:25] at Goldman Sachs. Prior to that, I ran a global survey consultancy at Qualtrics. I helped to stand up the Research Services Division. When I arrived, Qualtrics was already reselling Panel, but we really built out that full-service side of the house. And before that, I am a recovering academic. I was at Duke, running—you can guess—a survey consultancy. So, my background is very much in survey methodology, questionnaire design, sampling. I went to school originally to become a pollster, and I really specialized in how people think about survey questions, how they answer them, what that means, and I’ve brought sort of that interest in survey methodology to a career now in market research.
[00:03:06] Karen: I—thank you by the way. I love that. It’s so interesting when people bring themselves to life a little differently than you know what you can read on paper. So, I love that intro. And I am really curious, when you say, you know, you joined the company to really work on Research Defender, I’m curious about what it was that kind of brought you to that. Is it a personal philosophy? Is it a—did you get a thorn in your side there? Like, what was it that made you say, “You know what? I want to work in this space, in this area of the survey business?”
[00:03:34] Steven: Yeah, I think it was just, I worry so much about data quality generally. And I want to be really clear, fraud is just one piece of that, but it’s the piece that, for me, was missing because a lot of data quality begins with survey design and asking the right questions to the right people in the right way, having as low a respondent burden as possible. We want to make surveys easy to answer, and you get better data quality when you do. That’s all the stuff that I felt like I really understood, and I was expert in from my training, from my time at Qualtrics and Goldman. I really love that stuff. But the part that was always a little bit more opaque that I wanted to understand better is we’ve been in this system now for several years where people are aggregating sample from different sources. And you know, you can’t say everybody’s doing it, but almost everybody’s doing it. And there is sometimes a lack of transparency about when that happens, and I think there’s even more a lack of transparency about why it matters. And so, you know, when I was at Qualtrics, obviously we were aggregating a sample; Qualtrics doesn’t own a panel. They’re doing some interesting things now with synthetic but again, they don’t own a traditional panel asset. And so, when you are a consumer of panels, as so many of us are, and you might be a consumer of many panels all at once, what’s going on? And for me, that’s why I wanted to work on Research Defender was to better understand what’s going on in all these panels that we’re working with.
[00:05:05] Karen: Yeah, and I want to just take a minute and go back to kind of what you explained about the data quality issue. Because as you’re explaining how fraud is one part of the data quality issue—and fraud is what we’ll be getting into momentarily—it just dawned on me, there’s so much that is in your control when it comes to data quality as a researcher, and then there’s what is not in your control. And it seems to me that that is actually one of the discernment tasks that a researcher has today, is okay, what can I do because there’s so much that I’m not in control of when it comes to fraudsters. So anyway, I just felt like sharing that and seeing if that resonates with you because that’s what happened in my brain when you were talking.
[00:05:47] Steven: Totally. It’s a big problem, and I don’t want to spend too much time. I love to spend time on it, but people are like, “Whoa.” Because it’s like, if we are writing bad surveys, if we’re writing really long surveys, if we’re not writing tight screeners to ensure we have the right people, if we’re not doing our diligence as researchers, we’re always going to get bad data quality. And it is going to be, you know, regardless of the panels we’re using or the fraud detection, we’re going to get bad data quality. And I do think—and again, I don’t want to get beat up for this or dragged in the comments—but you know, you can have all the fraud prevention in place, but you can get bad data quality because of the research design. And there is an inclination right now to call anything we don’t like fraud. And the fraud is out there, it’s real. What I show in a lot of my research on research is that fraud and inattention are separate constructs. There’s some overlap. There’s some fraudsters that look inattentive, there’s some inattentive respondents that are fraudsters, but there’s a lot distinct, too. And as researchers, we can help people be more attentive. We can give them a better survey experience. But separately, and rightly, we should—
[00:06:57] Karen: Yes. This is a big yes-and [laugh].
[00:07:00] Steven: Yes, and—right?—and we should worry about fraud.
[00:07:03] Karen: Yeah. So, let’s talk about this report. So, “State of Fraud 2025,” which feels like the grand title that we all need right now because it’s been a year. We first started talking about this around the time of IIEX North America, when I became privy to the fact that you were doing this research. And we talked about when it was going to be, you know, fielded, completed, and here we are. So, why don’t you tell us a little bit about, kind of, the “State of Fraud” report, the initiative that you took, and tell everybody what we’re going to talk about for the rest of our time.
[00:07:37] Steven: Absolutely. So, I’ve done a lot of research on research where the way that Research Defender, our fraud detection platform, works is we scan, we score, we block. We see people coming in, and we have a bunch of checks we’re running via API. We’re looking at the technology that people are bringing to bear on the survey. We’re looking at every knowable signal about their device, their browser, what we know about them as a respondent, we’re looking at how they engage with an open-ended response—excuse me, an open-ended question, we’re looking at how hyperactive they are across projects and panels. So, we’re using all of those signals to make a decision in real time: should they be allowed in or not? You know, I always make… I make a lot of bad jokes, Karen. You should know, I have a lot of kids and I very much, you know, tell dad jokes, but I say just pick your Willy Wonka movie adaptation of preference. It’s like, good nut, bad nut, good egg, bad egg, whatever it is, that’s what Research Defender is doing. We see the traffic coming in, and we decide in real time, should they be allowed in, good egg, good nut, or not? And I do research on research where I purposely let these bad nuts in because I want to quantify how bad is the problem and how weird are the fraudsters. So, all the research on research I’ve presented at IIEX, at other places, webinars I’ve done in the past, have been about my research on research. But it occurred to me earlier this year that we are sitting on a mountain of not survey data, but paradata: data about how the data are collected. And we have this for millions of survey attempts a day and billions a year. And so, we don’t have their survey responses because those belong to our clients, but we have all sorts of information about the disposition of the respondent: what tech are they bringing to bear? Are they hyperactive? Did they engage with our pre-survey open-ended question in a reasonable way? And so, that’s when we really got the ball rolling earlier in the year. Last year, we scanned more than 3 billion survey attempts. And I thought, well, how many are we going to scan this year? Between January 1 and the end of August, we had scanned 4.1 billion-plus survey attempts. And I thought, well, if instead we go one step up the funnel, so to speak, we’re not going to look at survey responses, but we’re going to look at these survey attempts, what can that teach us about fraud? And the answer is, a lot.
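To make the scan-score-block step concrete, here is a minimal Python sketch of that kind of real-time gate. It is not Research Defender's implementation; the field names, weights, and cutoff are hypothetical, chosen only to show how device signals, hyperactivity, and a pre-survey open-end could be combined into an allow-or-block decision before a respondent ever reaches the questionnaire.

```python
from dataclasses import dataclass

@dataclass
class Attempt:
    """Paradata about one survey attempt (hypothetical fields)."""
    device_flags: set          # tech the respondent brings, e.g. {"vpn"} or {"emulator"}
    attempts_last_24h: int     # surveys attempted across the ecosystem in the past day
    open_end_text: str         # answer to a pre-survey open-ended question

def scan_score_block(attempt: Attempt) -> str:
    """Score the attempt in real time and return 'block' or 'allow' (illustrative cutoffs)."""
    score = 0
    # Exotic tech signals weigh heavily: emulators, web proxies, developer tools.
    score += 3 * len(attempt.device_flags & {"emulator", "web_proxy", "developer_tools"})
    # A VPN on its own is weak evidence; it only adds a little.
    score += 1 * len(attempt.device_flags & {"vpn"})
    # Hyperactivity: dozens of attempts in a day is suspicious.
    if attempt.attempts_last_24h >= 70:
        score += 2
    # A near-empty or gibberish pre-survey open-end adds a point.
    if len(attempt.open_end_text.split()) < 3:
        score += 1
    return "block" if score >= 3 else "allow"

# Example: a respondent on a VPN with a normal activity level is still allowed in.
print(scan_score_block(Attempt({"vpn"}, 12, "I mostly shop online for groceries")))  # allow
```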
[00:10:06] Karen: A lot [laugh]. So, what were you hoping to uncover? And then we’ll talk about what you did uncover. Like, did you have hypotheses going into this? Did you have just kind of a gut, like, “I really want to see this in action,” or what?
[00:10:22] Steven: Yeah, you know, I think I had a pretty curious nature going into this. I didn’t have a ton of hypotheses, per se, but I did want to know, like, what are the differences over time? You know, January through the end of August, you know, that’s eight months of data. Have things changed in the past eight months? The answer is yeah, but of course, it’s yes, right? I mean, the way that we think about how technology has changed, just thinking about eight months ago, some of our work processes, how different they are, of course, fraud has changed. I also wanted to know, are there big regional differences? And there are. I expected maybe a little bit more fire there, just based on the smoke, but there are some important differences. And we can get into that. But really, I wanted to know, what are the differences across time—over time—and across place? And there’s some interesting findings, for sure.
[00:11:13] Karen: So, let’s get into some of those findings, right, you know, kind of big picture. And I want to talk about regionality, I want to talk about changes over time. But what was the most surprising thing you saw, or the thing that really, you know, kind of opened your eyes and said, wow. Because there’s always some of that in an initiative with 4 billion data points, I’d think [laugh].
[00:11:33] Steven: Yeah, yeah. I think that one thing that just really stands out is the diversity of fraud tactics. You know, we talked about looking at the tech profile of the respondent, and there are some really exotic flags, right? And these are things that just people… you know, normal people, you and I, maybe we use a VPN for work, right, and that’s the example that a lot of people like to give me, but there are a lot more exotic flags that we see all the time. Developer tools. I mean, if we start to think about sort of the less exotic to the more exotic, you know, maybe there’s folks at Greenbook that are using developer tools on a regular basis. It’s weird to have developer tools open when you’re taking a survey. You’re probably looking for some hidden logic and things within that survey. You know, but then we start to see emulator usage and web proxies and really strange things that you don’t necessarily see in an individual survey. If you are running a survey of a thousand people, maybe you have one or two people who are using an emulator, but if you have 4 billion scans, all of a sudden, you’re looking at millions of people—or at least hundreds of thousands of people—using these more exotic forms of, you know, tech-enabled fraud. And so, that is something that really stands out, just as we look at the 4 billion scans. Now, that means that for any given survey, it’s probably less than 1% of their attempts, but these things—and these users—are smart. If they see a vulnerability, they can also recalibrate an attack. So, that’s why, you know, I think it’s important to evaluate and stay on top of these even, sort of, more fringe threats.
[00:13:21] Karen: Yeah. One of the data points from the report, and I want to start to dig into other types of fraud that you saw as well because it’s so interesting. I think people, most people, are not as educated about the myriad types of fraud that we’re talking about here. So, one of the highlights is that a third of survey attempts are fraudulent. Another 27% come from inattentive respondents, so… maybe not fraudulent, but not… not what you want, right [laugh]?
[00:13:51] Steven: Not what you want, yeah.
[00:13:52] Karen: Like, you know, maybe their intentions just are really just to get the money, right, at the end. So, talk to me a little bit about how you interpret those numbers, and you know, kind of what your thoughts about them are.
[00:14:04] Steven: Yeah, thank you. So, those numbers actually come from several research on research studies I’ve done now where I have the benefit of the responses, and I can say, all right, maybe a third of these respondents got flagged by Research Defender, but absent Research Defender, would they have been picked up in data cleaning? And the answer often is no. So, between that 27% and that 33% number, there is some overlap, but only about 10% of all the respondents are in the overlap. So, 17% of the inattentive respondents aren’t fraudsters. You know, 17 of that 27 are not fraudsters. And on the other hand, 23 of that 33 picked up by Research Defender looks clean, it looks good. And so, I think the really difficult thing, and it’s kind of where we started, which is not all bad data quality—not all bad data is fraud. I do think all fraud is bad data. And that’s where I’m pretty, you know—I wax evangelical on this point to some degree because sometimes we will find fraud that people say, “Well, it looks okay.” Like, of course it is. Fraudsters aren’t stupid, on average; they’re getting smarter all the time. They know not to speed, they know not to straight line, they know not to give garbage open-ends. So, a lot of the things that survey researchers have used traditionally—and not just in market research, polling and all sorts of user research—a lot of these checks that we’ve had for data quality are really inadequate to address fraud because fraudsters are increasingly giving good-looking responses. In my research on research, it works out to about 70% of fraud gets through data cleaning, which is really problematic. On the other hand, if you just have really good fraud detection, you still need to clean your data because there are people who are qualified respondents who just give garbage responses. They’re actually slightly more likely to speed and straight line and give garbage open-ends, and non-differentiated responses than are the fraudsters because these are real people and they’re not professionalized fraudsters. So, it’s just kind of an interesting thing. I think there’s this idea that fraud and inattention are really one, when in fact, the overlap is a lot smaller, I think, than people realize.
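The arithmetic behind those overlapping percentages is straightforward to lay out. A short sketch using the approximate figures cited here (33% flagged as fraud, 27% inattentive, 10% in both groups) shows how the two cuts combine, which also lines up with the roughly 50% of a B2C sample that Steven mentions a moment later.

```python
fraud = 0.33        # share of attempts flagged by fraud detection
inattentive = 0.27  # share that would be removed in data cleaning
overlap = 0.10      # share that is both fraudulent and inattentive

inattentive_only = inattentive - overlap          # 0.17: real but careless respondents
fraud_only = fraud - overlap                      # 0.23: fraud that "looks clean"
flagged_either = fraud + inattentive - overlap    # 0.50: removed by either check
remaining = 1 - flagged_either                    # roughly half the B2C sample remains

print(round(inattentive_only, 2), round(fraud_only, 2), round(remaining, 2))  # 0.17 0.23 0.5
```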
[00:16:17] Karen: Yeah, I heard from somebody recently that they thought the rates of fraud were even higher, significantly higher. And I just want to talk about that a bit because, you know, how do you feel, just kind of hearing that? You know, and this is somebody who, you know, maybe I’ll be able to interview them at some point on air, but they’re on the brand side, and they’re skeptical, they’re suspicious right now. They’ve had some alarm bells going off. So, is it that bad? Are we safely saying, “Well, you know, you’ve got to clean up, you know, you’ve got to clean up less than 40%,” or something? Like—
[00:16:57] Steven: It might be more than 40. I mean, in my research on research, if you take away the fraud and the inattention, you’re left with about 50% of the B2C sample. So, you know, that’s going to vary. And I think the rates at which Research Defender flags fraud vary pretty widely. In any given survey, we’ll block somewhere between 20 and 40% of traffic. And again, we scan, we score, we block before it gets in. And that 20 to 40 holds most of the time. You can imagine some things to make it 20 are, like, when you’re going to a US-based B2C audience, and your sampling criteria are pretty loose. That’s going to be a low-value target in lots of ways, and where incentives are lower, we tend to see a little bit less fraud. We can talk about what the fraud looks like. The fraud between high and low incidence looks different. I’ll set that aside. We can come around to it later.
[00:17:55] Karen: [laugh]. There’s so much to come back to.
[00:17:58] Steven: [laugh]. Where you see a lot—
[00:17:59] Karen: And we’ll talk about that [laugh].
[00:18:01] Steven: —totally. When you see a lot more fraud, maybe we’re blocking 30, 40, 50, I’ve seen Defender block rates upwards of 50%, but it tends to be a very narrow audience with a really large sample size. Like, essentially whenever researchers are really straining the ecosystem, we tend not to get droves more qualified respondents; we tend to get a lot more fraudsters, and so Research Defender has to work a lot harder. You know, APIs don’t really work hard or not, but you know what I mean. We’re going to see a block rate of 40 to 50%. An example I like to give is we were doing a study about a year ago where we were trying to find 10,000 unique respondents with some targeting criteria in one US state. It’s kind of a middle-sized state, it’s politically very important, which is part of the reason that our client was there. But we were seeing a block rate of, I think, 60%. And the number one block reason code was duplicate entrants because our client was really pushing the upward limit of what was possible in that market.
[00:19:09] Karen: Interesting, very interesting. I want to get into markets, but I also want to go back to the point of, kind of, what shifts you’ve noticed over time, right, and then I want to get into what shifts you’ve seen from region to region. So, we’ll kind of sequence it that way for my own brain [clears throat]. What have you noticed? What is the data revealing about how fraud rates are shifting, how fraud methods are shifting? Is it growing exponentially? Or is this, like, this has been the case, you’re just quantifying it?
[00:19:41] Steven: Yeah, I will say that there’s been an uptick of fraud through the summer months. It started in May, really took off in June, July, August. It’s starting to taper off a little bit. I don’t cover the September and October paradata in the report, but I can see those now, and things have gone back almost to the pre-summer levels. You know, my team, they’re probably tired of hearing it. I call it hot fraud summer.
[00:20:08] Karen: [laugh].
[00:20:09] Steven: The fraud just really took off this summer, and it looked a little bit different. And the big thing that changed is the rate at which we were blocking people for hyperactivity. And so, Research Defender, it has visibility into all the surveys that Rep Data is running. It also has visibility—again, I should say, not into the surveys, but the data collection process—it also has visibility into the data collection process for all of our clients who are using our research desk, which is our self-serve sampling tool. It also has visibility into various panels and exchanges that are using Research Defender directly for fraud prevention. And so, that’s where we get to this 4.1 billion number. So, it’s got this very broad visibility into the survey ecosystem. And one of the really nice things about having such broad visibility is we have a module called ‘activity.’ And the activity module says, how many times have we seen this respondent across all surveys in the ecosystem in the past 24 hours? How many surveys have they attempted just in the past 24 hours? And traditionally—well, I should say, those values range from, like, 0, 1, 2, 3. Those are unique, fresh eyes. You know, if someone’s attempted three surveys, they probably haven’t even taken one because we know the conversion for online surveys is lower than we’d like, right? That they’ve probably been screened out, whatever. So, if someone has fewer than ten, they’ve maybe taken a survey, maybe two surveys in the last 24 hours. If someone’s attempted or been exposed to 70, 80 surveys, ehh, then they’ve probably taken about ten, you know? And that’s probably maybe more than we want them to take. But we see every day people that attempt 200, 300. Not every day, but often we’ll see someone who’s attempted 1000 or more. So, you can imagine that distribution has a really long right tail, you know—for those at home, you know, I’m always tracing distributions with my hands—but there’s a long right tail, which is to say, there’s a couple weirdos in every survey who’ve attempted hundreds or thousands of surveys. And that number has gone up considerably from sort of the Q1 into Q2 and started to trail off just a little bit now that we’ve, you know, moved from Q3 into Q4.
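As a rough illustration of how an activity check like that can work, here is a short Python sketch; the threshold and data shapes are hypothetical, not Research Defender's actual module. It counts each respondent's attempts across every visible survey in a 24-hour window and flags the long right tail Steven describes.

```python
from collections import Counter
from datetime import datetime, timedelta

def flag_hyperactive(attempts, now, threshold=100):
    """attempts: iterable of (respondent_id, timestamp) pairs seen across the ecosystem.
    Returns the respondent ids with `threshold` or more attempts in the past 24 hours."""
    window_start = now - timedelta(hours=24)
    counts = Counter(rid for rid, ts in attempts if ts >= window_start)
    # Most respondents attempt 0-3 surveys a day; the distribution has a long right
    # tail of accounts attempting hundreds, and those are the ones flagged here.
    return {rid for rid, n in counts.items() if n >= threshold}

# Example: one ordinary respondent and one account hammering surveys all day.
now = datetime(2025, 8, 15, 12, 0)
log = [("r1", now - timedelta(hours=2)), ("r1", now - timedelta(hours=5))]
log += [("r2", now - timedelta(minutes=5 * i)) for i in range(250)]
print(flag_hyperactive(log, now))  # {'r2'}
```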
[00:22:30] Karen: There are implications—I’m sure the people listening are going to be thinking through the implications of that. Like, you know, like, I wonder if that level of detail and data now will influence when surveys launch. Because that’s interesting to me.
[00:22:44] Steven: It is interesting. I mean, we just saw—and it was a global phenomenon—there was a lot of hyperactivity. At the beginning of the year, hyperactive respondents as a percentage of the fraud we blocked was 4, 5, 6 points. At the high point this summer, 16, 17, 18% of the traffic we were blocking was because respondents had attempted dozens or hundreds of surveys in the past 24 hours. You could see there was this demand on the survey ecosystem. And it wasn’t, like, new people were being empaneled to take those surveys. They were, you know, fraudsters who were heroically stepping in to attempt hundreds of surveys.
[00:23:25] Karen: Interesting. All right, let’s talk about regionality too because I think that’s another really interesting thing to call out. So, you know, what were you seeing in terms of increased fraud, decreased fraud, or just prevalence of fraud?
[00:23:37] Steven: Yeah, it’s interesting. So, of course, we know that the distribution of where surveys are conducted is not even, right? So much survey research is conducted in the US relative to other markets. We see a lot, obviously, in the UK, Germany, Japan, Mexico, Brazil, those sorts of markets where a lot of people are surveying. And those markets have slightly different fraud patterns than the smaller markets. What we see is where there’s more demand, there’s a more diverse set of fraud flags. So, in the US, for example, where there’s the most survey research going on, hyperactivity is important, and so are duplicate entrants, people just taking—trying to—attempting the same survey again and again. Those are really important, but relative to the total amount of fraud, they’re much smaller. So, in the US, we see a lot more location-based fraud, people using tech to pretend they’re in the US when they’re not. We see that in other big markets, like Germany, the UK, Japan, where we see a lot of location-based fraud. There’s just a lot more tech-enabled fraud in these markets. Whenever we move into—again, another long right tail—but when we move into sort of tier-two, tier-three countries where there’s less survey research going on, but it’s still really business critical for lots of companies, the fraud looks different. There’s just a lot more duplicate entrants, there’s a lot more hyperactivity, but we don’t see the professionalization of fraud in those markets as much, you know? We don’t see as many people using emulators and web proxies. We don’t see as much batch fraud, which is a new kind of fraud we started measuring in about May of this year. Those are—
[00:25:22] Karen: And are you seeing those things? Where are you seeing that type of fraud?
[00:25:26] Steven: We see those things, especially batch fraud in the big markets, right, where there’s a lot going on. So, it’s a volume game, you know? In the US, especially for a low value target, so it’s, you know, a gen-pop sort of survey where incentives are low, it’s more a volume game. People are trying to manipulate the survey and take it again and again and again because they’re not getting much from each survey response. So, it has to be, like, the broken vending machine where you’re getting Cheetos, Cheetos, Cheetos, Cheetos, right? And so, we started detecting batch fraud in May, and we’ve rolled that out now. And batch fraud is where we get… essentially, we see an attack of a survey where we might get 30 entrants in a minute or two, and the entrants are identical in terms of the survey responses that they provide our client, but the markers for the digital fingerprint are incrementally different. So, people are manipulating, sort of, elements in their user agent, and we’re talking by one letter or one number. And they’re getting in and essentially, like, sort of copying and pasting, so to speak, their responses to the entire survey, but just manipulating one piece of their user profile—or, I should say, their user agent—to be able to get into that survey again and again and again. And that has—again, it accounts for less than 1% of the fraud we block, but it’s only happening in the US and a couple of top markets really because it’s a volume game, and that’s why they do it.
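To make that batch pattern concrete, here is a hypothetical sketch of the kind of check that could catch it: a burst of entrants arriving within a couple of minutes whose answers are identical while their user agents differ by only a character or two. The field names, thresholds, and the crude distance measure are illustrative, not the actual detection logic.

```python
from datetime import timedelta

def char_diff(a: str, b: str) -> int:
    """Crude character-level distance between two user-agent strings."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

def looks_like_batch(entrants, window_seconds=120, min_cluster=30, max_ua_diff=2):
    """entrants: list of dicts with 'timestamp', 'answer_signature', and 'user_agent'.
    Flags a cluster that arrives together, answers identically, and barely tweaks its user agent."""
    if len(entrants) < min_cluster:
        return False
    times = sorted(e["timestamp"] for e in entrants)
    arrived_together = (times[-1] - times[0]) <= timedelta(seconds=window_seconds)
    identical_answers = len({e["answer_signature"] for e in entrants}) == 1
    uas = [e["user_agent"] for e in entrants]
    nearly_same_ua = all(char_diff(uas[0], ua) <= max_ua_diff for ua in uas[1:])
    return arrived_together and identical_answers and nearly_same_ua
```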
[00:27:05] Karen: Yeah. So, let’s talk about the differences in the types of studies: B2B and B2C specifically, because I think this is also really interesting to ponder, and knowing that our audience spans both, or serves kind of both, a business and consumer market. It’s stated in your report that B2C was having higher rates of duplicate entries and B2B research was showing more compound and batch fraud. Talk to me a little bit about that. That makes me go, “Mmm?” [laugh].
[00:27:35] Steven: Yeah. So, the compounds are an interesting thing. The compounds are… I guess I should position this and say there are some things that we’re looking for in someone’s tech setup that are deterministic. If you use, you know, an emulator, web proxy, developer tools, those are highly correlated with an intent to commit fraud, and so if we see you using those things, you’re more or less blocked. There are some other things, like VPNs and other sorts of configurations that are more probabilistic. And so, a VPN alone isn’t going to get you kicked out, but if we start to see strange combinations—and this is important because we do see this compound search, excuse me, this compound block in our B2B audience—you’re not going to get kicked out for just using a VPN, but if we see you using VPN with some other weird stuff, that’s essentially what the compound block is. And so, we see with the B2B and the high value targets and more expensive CPIs more of a fraud cocktail.
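That deterministic-versus-probabilistic distinction can be expressed as two tiers of rules: flags that block on their own and flags that only block in combination. The Python sketch below is hypothetical (the flag names and the two-flag threshold are invented for illustration), but it captures the logic that a VPN alone is fine while a VPN plus other weird signals is not.

```python
def compound_block(flags: set) -> bool:
    """Return True if this combination of signals should be blocked (illustrative rules)."""
    # Deterministic signals: strongly correlated with intent to commit fraud.
    deterministic = {"emulator", "web_proxy", "developer_tools"}
    # Probabilistic signals: suspicious only in combination.
    probabilistic = {"vpn", "datacenter_ip", "timezone_locale_mismatch"}
    if flags & deterministic:
        return True
    return len(flags & probabilistic) >= 2

print(compound_block({"vpn"}))                     # False: a VPN alone is not enough
print(compound_block({"vpn", "datacenter_ip"}))    # True: the "fraud cocktail"
```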
[00:28:41] Karen: [whispering] Fraud cocktail.
[00:28:48] Steven: Yeah. I don’t know that I want to be associated with that term, but—[laugh].
[00:28:47] Karen: [laugh]. We’ll just put that on our shelf with the other topics [laugh].
[00:28:53] Steven: But essentially, there’s more effort. And it just makes sense that there would be more effort. The incentives are higher. It’s not a volume game, it’s a more precise game. So traditionally, outside of the compound block, we see, with the B2B and high-CPI fraudsters, that they are doing more with fraud in other parts of the internet. So, something that we do is we subscribe to, and we maintain, lists of known fraudsters, known bad actors. And the lists that we subscribe to, like, the MaxMind list especially, some of these are really meant to pick up fraud in other parts of the internet. And those also light up when we look at this B2B and high-value fraud. Because these are people who are committing fraud in other places, and they’re not going to bother with the very inexpensive B2C survey, but if they see a really highly specialized ITDM at a company with 2000-plus employees, whatever, like, this is more the professionalized fraudster who’s got a lot of tech in place to commit fraud and is just really looking for a big payout, less often, as opposed to the high-volume fraud, which is, you know, lower incentive paid out, boom, boom, boom, boom, boom.
[00:30:08] Karen: It’s so interesting to, and you know, disturbing, but also interesting to think about, like, research fraudsters being linked to, you know, fraud. I’m sitting here thinking, like, we’re talking about bank fraud, identity fraud, we’re talking about fraud. Like, [laugh] and I think it’s really interesting to suddenly picture that being correlated in my mind.
[00:30:28] Steven: Fully correlated.
[00:30:29] Karen: Fraudsters are fraudsters, and they’re going to go big or go home, I guess.
[00:30:34] Steven: Yeah. And we worry about fraud because we worry about research, but fraudsters don’t care about research.
[00:30:40] Karen: They care about—
[00:30:41] Steven: They’re looking for payouts, you know? People ask, like, why do we have so much fraud? Is it because we pay respondents? I don’t think that’s—that’s only half of the equation, right? We’ve been paying respondents forever. I mean, you know, mail surveys used to send $2 bills. It didn’t mean they got more fraud because they were sending mail to people. So no, I think what makes fraud so prevalent now is the combination of paying people and then tech enabling people to get in and get those same incentives again and again and again. So, it’s not just the money. It is the money, to be clear. You know, Vignesh Krishnan, our CTO, he says, you know, “Robbers rob the bank because that’s where the money is.” And you know… he says it to be cheeky. And so, money is the reason we have fraud because—
[00:31:27] Karen: Or the Louvre because that’s where the jewels are [laugh].
[00:31:29] Steven: Exactly. Exactly. But money is only part of it. It’s money and the ability to pretend to be an IT decision maker at a Fortune 500 company. It’s that combination that’s deadly.
[00:31:45] Karen: Yeah. Yeah, it’s really interesting. It brings to mind a lot of questions I have about, you know, things that have been discussed about, how do we fight fraud? How do we combat fraud? Obviously, you know, there are tools, like your own, so you know, that’s kind of a given. Also, I think about the debate about, should we pay people more? And I’m like, gosh, if we start paying participants more, then you know what’s going to happen? Like, my brain is all of a sudden, like, yeah, then more fraudsters are going to be like, “Ooh, better payouts.” So, I don’t know that that’s the solution. So what’s, kind of, your take on, you know, what else can be done? Other than, you know, put programs in place to, you know, clean data, prevent fraud, you know, from a tool standpoint, in the industry. What else can be done by survey designers, survey researchers?
[00:32:29] Steven: Yeah. I mean, I think there’s at least three things that every researcher can do, right? And the first—and this is the one that people don’t like to hear—is, design really good studies. You know, I get in trouble. Sometimes I’ll feel a little salty, and I’ll say something on, like, LinkedIn or, you know, whatever social media we’re allowed to use, to say, “There’s no such thing as a good survey over 20 minutes.” And people get so mad, like, “Of course there is.” Like, would you take a 30-minute survey, you know? So, the first thing that every researcher can do is design a really respondent-friendly survey. Have the respondent in mind. Imagine that they’re probably taking this on a cell phone, imagine that this isn’t the most exciting topic to them. And so, if we consider the respondent and write a really user-friendly, low-respondent-burden survey, we’re already better. The next thing is to have fraud prevention in place. Every survey needs fraud prevention, frankly. If you have incentives, you need fraud prevention. Full stop. And then the third thing is to clean the data. It’s always good to have standards of what constitutes high-quality data before you go into data collection. A concern is, the data start to come in, you don’t like the results, and then maybe you sort of are putting together sort of post-hoc reasoning to throw out respondents. That’s not great science, right? So, the better thing is, you know, to have standards going into data collection of, like, you know, these are the things that we’re watching for. Maybe there are these questions that we think are incompatible with one another. I’m not the biggest advocate for really strong attention checks. We can set that aside. See point one about respondent experience, right? But if every researcher thinks about designing for the experience, having fraud prevention, and then having standards for data cleaning, I think they’re going to get a much better signal-to-noise ratio.
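On the third point, setting data-cleaning standards before fielding can be as simple as writing the rules down up front and applying them mechanically to every response, with no post-hoc adjustments after seeing the results. A minimal, hypothetical sketch with illustrative thresholds:

```python
# Pre-registered cleaning standards, fixed before data collection begins.
CLEANING_RULES = {
    "min_duration_seconds": 180,   # anything faster counts as speeding
    "max_straightline_run": 10,    # identical grid answers in a row
    "min_open_end_words": 3,       # near-empty open-ends
}

def keep_response(resp: dict) -> bool:
    """Apply the pre-set standards to one completed response; no post-hoc exceptions."""
    return (
        resp["duration_seconds"] >= CLEANING_RULES["min_duration_seconds"]
        and resp["longest_straightline_run"] <= CLEANING_RULES["max_straightline_run"]
        and resp["open_end_word_count"] >= CLEANING_RULES["min_open_end_words"]
    )
```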
[00:34:28] Karen: Yeah, yeah. So, remember I told you that, like, you know, at about a certain time I’m going to start to be like, man, we got to wrap. And we are nowhere near done with this interview, but this is the point where I’m going to say to you, like, what else from this report didn’t we talk about? Is there something else we’re missing that’s, you know, a high level that was really interesting data to you?
[00:34:51] Steven: You know, I think one of the biggest findings we did talk about was this hyperactivity. I think it’s really interesting. I think another thing we didn’t talk about that’s in the report is that flags vary by supplier, but every supplier has flags, which is to say that the fraud is pervasive. And we don’t know if it’s because the supplier themselves are, in turn, you know, aggregating and we don’t see it, but it seems like, from what we observe, that there is some amount of fraud—sometimes single digit, often double digit—in every supply that we look at. And so, it’s not to say there’s no such thing as a clean sample, but they’re hard to come by, and so everyone should be vigilant regardless. Our approach is to work with suppliers that we trust, but we don’t trust them to deliver fraud-free. And that’s why we exist, is to put that fraud detection on.
[00:35:51] Karen: Yeah, yeah. Well, that’s a good mantra, actually, so thank you for sharing that. Talk to us about, you know, how people can get this report. What is the plan for distribution of it? What do you have for folks?
[00:36:05] Steven: Sure. So, we’ll be rolling it out pretty widely. Right now, we’ve already done a webinar sort of outlining the main findings of the report that’s available on the Rep Data website. I’m sure you can find it on LinkedIn, but that’s available to folks now, today. We also have a couple of different reports that we’re rolling out over the next few weeks. Those will all be available on the Rep Data website as well. And I’m sure they’ll get pushed out to the social media.
[00:36:32] Karen: Absolutely. All right. Cool. Cool, cool. So, now here’s kind of the big takeaway. You know, we talked about what people could do. You know, Greenbook, we always talk about the future of insights, right? That’s sort of our motto, our mantra, and part of our mission is to make sure that people are ready for it. So, what does the future look like when it comes to data integrity, trust in data quality, and fraud prevention? Like, what does the future of the industry look like to you with that lens?
[00:37:01] Steven: Yeah, you know, I’ll say I’m reasonably optimistic. I spent time this week with a leader of a large research group at a tech company, and she expressed concern that, you know, AI and innovation makes it easier for us to detect fraud, and some of that’s true, but AI also makes it easier for people to commit fraud, which we’re seeing, you know? We’ve built into Research Defender checks for markers of AI-enabled open-end responses, right, for example. So, I’m still reasonably optimistic, but the future, one hundred percent, is going to be very technical. And I don’t think researchers need to be experts in API calls. I don’t think we need to necessarily do those things, but I think we need to be eyes wide open, that we need to be at least as smart as the fraudsters. The fraudsters are motivated by money [laugh]. So, are we, right?
[00:38:00] Karen: [laugh].
[00:38:01] Steven: Truly, right? So, is everybody who’s in this business, and so we need to be at least as smart as the fraudsters, and we need to continue to make investments into tech to stay in front of fraud. And, you know, we, as a company, that is what we do, so of course, we’re going to do it. But I think everybody who’s interested in high quality insights is going to need to think, you know, “What am I buying, and is there a sufficient investment in the tech to be smarter than the fraudsters?”
[00:38:31] Karen: Yeah. That’s the goal. Smarter than the fraudsters. Be smarter than the fraudsters, folks. Steven, thank you. This has been just a great hour talking about this, like, deeply important, critical topic, so I’m so glad you joined us. I’m so glad we dug into this report in this way.
[00:38:48] Steven: Yeah, thank you.
[00:38:49] Karen: Yeah, and I… you know, I can’t wait to just keep learning more. Thank you for doing research on research. I think that that’s something at the heart of every researcher is a curiosity for what’s out there, so thank you for sharing it with us.
[00:38:59] Steven: Absolutely. Thanks for the invitation, Karen.
[00:39:02] Karen: My pleasure, my pleasure. And to everybody else, thank you to our editor, Big Bad Audio, who I always have to thank, and Brigette for showing up to produce at all hours of the day. We appreciate you. And to our listeners, thank you for tuning in. We are just, you know, happy to be bringing you these important topics, and like I said, this particular conversation is one we need to be having. So, thank you so much for joining us. We’ll see you next time. Bye-bye.