Beyond the Checkbox: Why Identity Data Needs a Reboot in Market Research

When nuance is lost to speed, insight suffers. This call to action challenges research norms and urges deeper, more inclusive approaches to identity.

Editor's Note: I recently saw a thread unfold with MRxPros, after Yogesh Chavda shared a New York Times article in which he was quoted, on how Americans are increasingly rejecting rigid racial and ethnic labels. As my friend and colleague Susan Griffin pointed out, "As an industry we have to get better at embracing and understanding the marvelous complexity of humans that defies segmentation in these blunt categories around ethnicity."

In full agreement, I invited Yogesh to expand on the thread in an article for Greenbook and share it here as a call to action for our industry. I'm grateful to the many industry colleagues who shared their lived experiences in the original thread and am thrilled Yogesh included some details of the thread in his piece. (Thank you for jumping at my request, Yogesh. As we say at Greenbook, together we are indeed co-creating the future of insights.)


Where are you from? It’s never been a simple question for me. Every time I answer it, people’s eyes glaze over and I can sense their regret at having asked.

Born in Bahrain to Indian parents, I’ve lived in seven countries and now teach and consult in the U.S. I live in a space that doesn’t fit neatly into demographic categories, whether on the U.S. census or in any market research survey. And yet, every time I fill out a form, I’m asked to flatten that complexity into a checkbox.

I was quoted in a New York Times article about identity boxes, “How Do You Self-identify? For Many Americans, Checking a Box Won’t Do,” sparked by the controversy surrounding Zohran Mamdani's college application. I knew the piece might stir debate. What I didn’t expect was the avalanche of responses it would trigger, both in the New York Times comments section and within the market research community. What followed was nothing short of extraordinary: multicultural, multigenerational, multi-opinionated. And it revealed one thing clearly: the way we collect and use identity data is due for a serious rethink.

The Problem with Boxes

Let’s start here: Race and ethnicity checkboxes are trying to do too much. As one reader of the Times piece put it (Ryan, from Brooklyn): “This single question is attempting to capture three different things that, while correlated, aren't in lockstep: (1) your ethnic background as it has shaped you personally, (2) how your community may have been historically disadvantaged via various racist laws, and (3) how other people are likely to see you, which may affect how they treat you.”

It’s not just overly simplistic, it’s misleading. That mismatch has real implications when the same checkboxes show up in research screeners, segmentation studies, or sample quotas.

Identity is fluid. Culture is layered. And yet our tools remain rigid, built for a time when a few blunt categories were seen as good enough.

Voices from the Market Research Community: A Richer Mosaic

MRxPros is a group of market research and insights professionals who meet weekly online. The response from the MRxPros community was diverse in its opinions and rich in individual stories. Their perspectives likely reflect what consumers face as well. Here are just a few of the stories that reflect the diversity of identity and the challenges of asking ethnicity or race questions:

  • Zontziry shared that while her mother is Latina (and by extension, so is she), she was raised during a time when cultural assimilation was the norm, not the exception. Though her mother maintained elements of their heritage, Zontziry’s childhood was largely defined by American cultural markers: spaghetti dinners, fried chicken, and English as the household language — except when her mom was angry, when Spanish would suddenly reappear. As a result, checking the “Hispanic or Latino” box on forms always felt a bit disconnected from her lived experience until she realized that claiming that identity could be advantageous on college applications, and later, when embracing her heritage became something to be proud of. Now, watching her own child grow up, Zontziry reflected on how her daughter was surprised to learn she was Latina, a moment that made her question not only how her daughter sees herself, but also what she’ll choose when she’s asked to check those same boxes in the future.

  • Janet reflected on moving from the UK to the U.S. and how race is framed very differently between the two countries. In the UK, terms like “Black” or “White” weren’t used on forms in the same way. Instead, class, education, and accent were far more defining of identity. She challenged the utility of the “White” category, calling it hollow and meaningless. As she put it, "White tells us nothing except maybe who burns in the sun." It flattens a spectrum of cultural and ancestral histories  (e.g., Irish, Italian, German, Scandinavian) into one amorphous label, erasing meaningful distinctions along the way. She questioned whether we should even be asking demographic questions at all, especially when the data isn’t actionable or meaningful. She pointed out that we never ask about eye color or sexual preference on forms — why do we cling so tightly to ethnicity and race? She suggested moving away from collecting profiling data for its own sake and toward understanding how people see themselves and what drives their decisions. In her words: “Time to stop collecting data that profiles people for no benefit, and start understanding individuals as they perceive themselves.”

    Janet also observed that while descent used to be tracked (e.g., German-American, Irish-American), those distinctions have long since faded in white populations. Now everyone is just “white,” despite vast differences in cultural upbringing, values, and heritage. She questioned why we still do that for newer immigrant populations.

  • Sequoyah made it clear: removing race from research is a luxury some can afford, but many cannot. For African-American communities like hers, race is always a factor, whether they choose it or not. It affects how they're seen, treated, and served. In her words, “removing race from a survey is a privilege I personally don't have.” She also highlighted that nuance exists within racial categories: for instance, the cultural and behavioral differences between Black Americans descended from enslaved people and Black immigrants from Nigeria or the Caribbean. Without acknowledging those subtleties, we risk treating Blackness as a monolith, another kind of statistical erasure. And perhaps most importantly, she made a case for Black researchers leading research about Black communities. Methodology matters. But who is asking the questions matters just as much.

  • Pricilla, a dual U.S. and Hong Kong citizen, described how identity questions routinely erase her reality. Though ethnically Chinese and fluent in Cantonese, she was born during Hong Kong’s British colonial era and has never lived in China. She holds both a British (BNO) passport and a Hong Kong passport, but not a Chinese one. Growing up in the U.S., forms that asked about ethnicity usually offered checkboxes like “Chinese,” “Vietnamese,” or “Korean,” but never “Hongkongese.” That left her with two bad choices: check “Chinese,” which lumps her in with mainland China despite key differences in language, education, migration patterns, and political history, or check “Other” and write in “Hongkongese,” which effectively makes her statistically invisible due to small sample sizes. As she put it, "Neither option helps researchers capture the nuances of our identity."

    And this isn’t just her story. It is a story shared by many Americans with heritage from Hong Kong, Taiwan, or Macau. In these cases, the complexity isn’t about race. It’s about the politics of recognition. Lumping Hongkongers with mainland Chinese doesn’t just flatten, it misrepresents. And in a dataset, misrepresentation looks a lot like erasure.

  • Begonia offered two distinct perspectives, both highlighting the complexity of identity and the need to get it right in surveys. She highlighted how Gen Z and Gen Alpha are the most multicultural, multiethnic generations in U.S. history, with many identifying as second-generation Americans. They fluidly blend individualistic and collectivist values depending on the context (home vs. society), creating what she called "omnicultural" identities. She emphasized that acculturation models, especially generational ones (e.g., 1st gen = foreign born, 2nd gen = born in the U.S. to immigrant parents), can help researchers understand behavioral shifts and cultural nuance. But she also warned that using acculturation alone risks missing the full picture: it must be layered with self-identified culture, immigration history, and context. She also drew attention to the oversimplification of Hispanic identity in most research frameworks. She gave examples such as White Cubans and Afro-Caribbean Hispanics being grouped under the same “Hispanic” banner; Mexican-Americans, Argentinians, and Central Americans being treated as interchangeable; and large California communities (e.g., Persians, Israelis, Armenians) who are technically “white” on forms but behave and identify very differently from Anglo-European whites.

    Begonia explained that language spoken at home, generation status, and immigration pattern are often more revealing than “race” or “ethnicity” alone when trying to understand behavior, cultural values, or media preferences. Begonia called the “Other” checkbox a “digital dustbin”, a place where nuanced identities get discarded or ignored. She has seen firsthand how open-ended responses get under-coded or excluded in analysis, rendering communities invisible in the data that drives business and policy decisions.

  • Kayte offered a firm and passionate critique of the term “Other” in identity questions, whether in screeners, surveys, or demographic forms. She urged researchers to stop using “Other” entirely, emphasizing its psychological impact. “Other” doesn’t just fail to capture nuance, it others people in the most literal sense, reinforcing marginalization. Kayte referenced research and shared a video from the IDEA Council that explored this issue, noting that the harm applies not only to SOGI (Sexual Orientation and Gender Identity) questions but to all forms of identity data, including race and ethnicity. She highlighted the real-world consequences: every human interacts with surveys at some point in life, not just during job applications or formal research. So when poorly designed questions show up, they perpetuate damaging norms. Kayte now replaces “Other” with “Prefer to self-describe” in every demographic question she edits, whether or not a client approves. “It might seem minor,” she wrote, “but it can make a huge difference for someone who needs to feel seen.”

  • Josh, a gay, white-identifying male researcher, reflected on how identity is not static, and how forms often fail to capture that complexity. He described the way he code-switches between different versions of himself depending on the context: at work, socially, with family. This fluidity is a big part of his lived experience, yet surveys and forms force a false fixity, reducing him to one checkbox that ignores his multidimensional self. He acknowledged the privilege of being able to “pass” as straight or conforming, which he doesn’t want to do. Most poignantly, he framed the issue as a matter of human dignity, quoting Emerson: “To be yourself in a world that is constantly trying to make you something else is the greatest accomplishment.”

  • Susan spoke about her own family: her sister, who is white American, married a man born in India. Their children (her nephews) are technically biracial. These biracial kids are now growing up, getting married, and having kids of their own. Her question was: “What ethnicity will these grandchildren be?”

    That seemingly straightforward question is at the heart of this entire debate. Susan pointed out that the current labeling system can't keep up with how mixed and multi-dimensional American families are becoming. She also noted that even if you could assign someone an "ethnicity," it tells you nothing about how they behave as consumers, citizens, or humans compared with “others” in their ethnic category. And what does this mean for static segmentations that profile consumers?

  • Layla, a Canadian-born researcher with Palestinian and Dutch heritage, shared how ticking identity boxes has always felt like a negotiation. Growing up with darker skin than most of her peers but a shared cultural context, she constantly questioned how to classify herself. Over time, she noted that the act of naming got marginally easier, especially after changing her last name from “Mallouk” to “Shea” through marriage. But still, every form prompts a moment of pause: “Why are they asking? What will they do with this information?” She emphasized how meaningful it is when surveys allow more than one selection because, as she put it, “we don’t all fit nicely into boxes.” That quiet validation of complexity can make a respondent feel seen instead of sorted.

Together, they painted a picture that was complex, contradictory, and deeply human. In other words: real.

The Real Work Begins With Discomfort

If you’re feeling uneasy reading this, good. That’s not a bug — it’s the point.

Because the real danger in market research isn’t bad data. It’s the illusion of clean answers where none exist. It’s the unchecked use of tools that flatten, reduce, erase — all in the name of simplicity, speed, or statistical neatness.

The stories shared here aren’t just edge cases. They’re signals. Cracks in the foundation we’ve long treated as neutral. And if we’re honest, many of us have helped normalize these limitations, not out of neglect, but out of operational momentum. When the brief is tight, the timeline tighter, and the dataset needs to be clean by Friday, nuance is the first thing to get left behind.

It’s a moment to stop and ask harder questions.

  • What blind spots are baked into our current research architecture?

  • Where are we trading speed for substance—and calling it insight?

  • And how might we lead brands and teams toward a future where identity is engaged, not just classified?

Because if we agree that identity drives behavior, loyalty, voice, and choice, then flattening it into a checkbox isn’t just bad research. It’s a missed opportunity.

Disclaimer

The views, opinions, data, and methodologies expressed above are those of the contributor(s) and do not necessarily reflect or represent the official policies, positions, or beliefs of Greenbook.
