Synthetic Data & Digital Twins – what’s going on?

AI & Restech
March 2026

By Novema Pte Ltd, February 2026

Synthetic data is transforming the market research landscape. Generated by algorithms that mimic the patterns and structures of real-world data, it eliminates the need to gather fresh survey responses. While synthetic data can be used across many types of datasets, its potential in survey research is particularly compelling: it reduces costs, unlocks insights from hard-to-reach audiences, and, because it contains no personal information, is much safer to share.

Findings from the 2024 Asia Research Stakeholder Survey show that 50% of stakeholders have begun experimenting with synthetic data, although most remain in the early stages of adoption. We will re-evaluate this in our 2026 stakeholder survey (watch this space).

Beyond cost efficiency, synthetic data accelerates insight generation. Common applications include ad and product pre-testing, simulating customer journeys, and building synthetic personas to model online behaviour. Some agencies are even developing “digital twins”, virtual replicas of real consumer segments, to analyse behavioural patterns and sentiment.

A digital twin is a dynamic, virtual replica of a real-world entity such as a consumer segment, brand, or market, built using synthetic data but often combined with real data. Digital twins allow researchers to model ‘what if’ scenarios before making real-world decisions about product launches, price changes, new market entry, packaging designs, and promotional strategies.

Agencies can expose digital twins to different creative concepts to simulate emotional reactions, message recall, and purchase intent. Other applications model how consumers move along the purchase path across various touchpoints, or undertake segment deep dives to test messaging, identify unmet needs, and track shifts in sentiment.

However, these technologies have their limitations.

Synthetic data is only as robust as the datasets used to train the models that generate it, and these are often grounded in Western markets. As a result, it may overlook cultural subtleties and struggle to reflect the irrational or emotional drivers of consumer decision-making. Adoption rates reflect this imbalance: based on the Asia Research stakeholder survey, 73% of Western researchers report using synthetic data, compared with just 33% elsewhere. Even so, scepticism persists.

Often perceived as a “black box,” synthetic data lacks the transparency associated with traditional survey methods. As one stakeholder in the Asia Research survey commented, “It took years to validate panel surveys internally. Synthetic data is another battle entirely.” These concerns can be mitigated through rigorous validation against real survey data, for example by demonstrating that synthetic outputs fall within acceptable margins of error.
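To make the validation idea concrete, here is a minimal sketch in Python of one possible check: does a synthetic estimate fall within the margin of error of the matching real survey result? It assumes a simple random sample and uses the standard normal-approximation formula for a proportion at 95% confidence. The function names and the figures are hypothetical, chosen only for illustration, and are not drawn from the Asia Research survey or any agency's actual method.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Margin of error for a survey proportion p with sample size n.

    Uses the normal approximation z * sqrt(p * (1 - p) / n);
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)


def within_margin(real_p, real_n, synthetic_p, z=1.96):
    """Return True if the synthetic proportion lies within the
    real survey's margin of error around the real proportion."""
    return abs(synthetic_p - real_p) <= margin_of_error(real_p, real_n, z)


# Hypothetical example: 42% of 500 real respondents say they would buy.
real_share = 0.42
real_sample = 500
moe = margin_of_error(real_share, real_sample)
print(f"Margin of error: +/- {moe:.1%}")

# Two hypothetical synthetic estimates for the same question.
for synthetic_share in (0.45, 0.48):
    ok = within_margin(real_share, real_sample, synthetic_share)
    print(f"Synthetic {synthetic_share:.0%}: within margin = {ok}")
```

In practice a check like this would be repeated across many questions and segments, since a synthetic dataset that matches on headline figures may still drift on smaller subgroups.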

For some clients, the promise of significant cost savings outweighs the trade-offs. The appeal of gaining “80% of the insights for 20% of the price” is strong. Yet this mindset carries risk: the missing 20% may hold the critical nuance that determines whether a product launch or market entry succeeds or fails. A single high-profile misstep could set the technology back years.

Adoption also varies by sector. Public sector organisations tend to approach synthetic data cautiously, while challenger brands are more willing to experiment. Among suppliers, concerns linger that automation could displace analysts and consultants. Survey results suggest that management consultants, in particular, may be more hesitant. At the same time, human-led qualitative research (an area where synthetic data falls short) could become more valuable, positioning deep qualitative expertise as a premium offering.

Ultimately, synthetic data is unlikely to replace researchers; instead, it will enhance their capabilities. Agencies can deploy it to create new services, such as “synthetic boosts” to deepen segment analysis, or to revitalise legacy datasets with fresh perspectives.

As the technology evolves, research teams may become leaner. Synthetic data can take on repetitive, labour-intensive tasks, allowing human researchers to focus on what machines still cannot replicate: critical thinking, empathy, and creative insight.