AI Promises vs. Research Reality: Five Lessons From the Front Lines

AnswerLab

Max Symuleski

February 10, 2026 · 12 min

We all know the potential for AI in customer research is enormous. But at AnswerLab, we're focused on closing the gap between what we thought AI would do and what it can actually deliver. We've seen plenty of wins as we've integrated AI into our research capabilities. There have also been some lessons learned as we've helped clients recalibrate their AI expectations against real-world implementation.

When it comes to AI adoption, the stakes for research teams are high. Organizations want to move fast, but moving fast without intention creates risk — risk to data quality, to client trust, and to the integrity of the insights themselves. From our extensive experience integrating AI across research workflows, we've identified five key areas where the promise of AI diverges from the reality of implementation — and what we've learned about navigating the difference.

Five Lessons From the Front Lines of AI in Research

Promise: Encourage teams to dabble. Breakthroughs will follow.

Reality: Pilots are important. But they rarely make a dent without rethinking core processes.

At AnswerLab, AI is now integrated into about 60% of research workflows — up from just 10% a year ago. That didn't happen because we encouraged random experimentation and waited for magic. It required rethinking the way we build research artifacts like templates, guides, and reports to make them AI-ready.

One of the most effective things we did was create a shared prompt library. Researchers use it constantly, and it's become a living resource. We built the first version internally, then refined it with team feedback. The result is a set of prompt templates for common research tasks — drafting end-of-day reports, building thematic outlines, creating discussion guides. For new users, it lowers the barrier to entry. For advanced users, it's a useful starting point. In other words: dabbling only works if you capture and scale the best of it.
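To make that concrete, here is a minimal sketch of what a shared prompt library can look like once it's treated as a real artifact rather than scattered chat history. The task names, template wording, and `build_prompt` helper are illustrative assumptions, not our actual library.

```python
# Illustrative sketch of a shared prompt library. Structure, task names, and
# template wording are hypothetical, not AnswerLab's production templates.
PROMPT_LIBRARY = {
    "end_of_day_report": (
        "You are a UX research assistant. Summarize today's {session_count} "
        "sessions for the {project_name} study. List top observations, open "
        "questions, and any issues with the discussion guide. Use only the "
        "notes provided below and quote participants verbatim.\n\n"
        "Notes:\n{session_notes}"
    ),
    "thematic_outline": (
        "Group the findings below into 4-6 themes for the {project_name} "
        "readout. For each theme, list supporting observations with the "
        "participant ID they came from.\n\nFindings:\n{findings}"
    ),
}

def build_prompt(task: str, **fields: str) -> str:
    """Fill a library template; raises KeyError if a required field is missing."""
    return PROMPT_LIBRARY[task].format(**fields)

if __name__ == "__main__":
    print(build_prompt(
        "end_of_day_report",
        session_count="4",
        project_name="Checkout Redesign",
        session_notes="P1 struggled to find the promo code field...",
    ))
```

Keeping templates in one versioned place is what turns individual experiments into something new team members can pick up on day one.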

We've also run internal workshops by research vertical — like social media and gaming — to share what's working and where AI struggles. The combination of grassroots experimentation and structured knowledge-sharing has been critical to moving from scattered dabbling to real adoption.

Promise: AI enables one brain across the organization.

Reality: Data silos and safe-data agreements keep walls in place.

The vision of a single, seamless AI "brain" that unites teams is appealing, but it doesn't reflect reality. Some AnswerLab teams live in Google-only environments. Others have the flexibility to use a broader mix of tools and systems. Clients often dictate what's safe and allowable, which means every project comes with its own rules.

We've learned that negotiation matters. The only way to push through silos is to prove to clients that the tools we're using are secure, which means enterprise-grade accounts, strict safeguards, and clear data policies. Even then, it's a moving target. One client's comfort zone might be ChatGPT, another's might be Google-only, and another's might exclude both.

This creates extra work. We're not just building AI workflows — we're constantly educating our internal teams about what tools are safe for which clients, and why. And yes, breach headlines in the news heighten concerns and can even create a little paranoia. Transparency and documentation are the only ways to keep everyone aligned.

Promise: Build it yourself and own the edge.

Reality: Internal builds can't keep pace with vendors.

We've tried both the "buy" and the "build" model. There are cases where building makes sense, especially when cost is an issue. For example, transcription of in-person sessions is a recurring challenge. One of our favorite off-the-shelf tools can separate speakers accurately, but at $60 per 60-minute session, costs pile up fast. Running the same audio through Google APIs costs about $6. That's the kind of tradeoff where a lightweight in-house solution is worth it.
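As a rough sketch of that kind of lightweight in-house approach, the example below runs a session recording through Google Cloud Speech-to-Text with speaker diarization enabled. The bucket path, speaker counts, and turn-grouping logic are illustrative assumptions, and actual cost depends on the audio length and configuration you choose.

```python
# Minimal sketch: transcribe a recorded session with speaker separation via
# Google Cloud Speech-to-Text. Paths and speaker counts are placeholders.
from google.cloud import speech

client = speech.SpeechClient()

diarization = speech.SpeakerDiarizationConfig(
    enable_speaker_diarization=True,
    min_speaker_count=2,   # e.g., moderator + participant
    max_speaker_count=3,
)

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    diarization_config=diarization,
)

# Hour-long sessions belong in Cloud Storage and go through the async API.
audio = speech.RecognitionAudio(uri="gs://example-bucket/session-01.wav")
operation = client.long_running_recognize(config=config, audio=audio)
response = operation.result(timeout=3600)

# The final result carries word-level speaker tags; group them into turns.
words = response.results[-1].alternatives[0].words
current_speaker, turn = None, []
for w in words:
    if w.speaker_tag != current_speaker and turn:
        print(f"Speaker {current_speaker}: {' '.join(turn)}")
        turn = []
    current_speaker = w.speaker_tag
    turn.append(w.word)
if turn:
    print(f"Speaker {current_speaker}: {' '.join(turn)}")
```

A script like this won't match a polished vendor product on accuracy or workflow, but for a recurring, well-defined task the cost difference adds up quickly.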

But overall, vendors are advancing faster than any internal team can. Take NotebookLM, which is purpose-built for research. Within months of launch, it solved problems we had spent almost a year trying to address with our internal tool, like linking quotes directly to source documents. Vendor speed blew up our roadmap overnight.

That's the reality: internal builds are valuable for specific, repeatable needs, but they can't keep up with the velocity of vendor innovation.

Promise: Enterprise-approved equals best.

Reality: Approval provides safety, but limits freedom.

Enterprise licensing protects client data — and that matters. But it can also frustrate the teams working hardest on AI innovation. Researchers might want to try Perplexity or another promising tool, but if we don't have an enterprise license that guarantees security, the answer is no.

That's a hard line we have to draw. Early on, we told teams: experiment all you want, but don't put sensitive client data into unapproved tools. As time went on and more options became available, we adopted enterprise versions of ChatGPT and Gemini, opening those tools up for client data. Guardrails are a fundamental requirement.

The tension is real. We want to encourage experimentation, because that's where true innovation comes from. But we also need to be able to look a client in the eye and say, "Your data is safe." That means balancing curiosity with discipline, and documenting the rules so no one is left guessing.

Promise: AI can replace human depth.

Reality: Models are fluent, fast, and convincing — but shallow.

One of the biggest risks is that AI outputs feel smarter than they are. Spend a few hours iterating with a model, and it's easy to get mesmerized. The language is plausible. The flow is polished. Even when it doesn't read like AI slop, slick phrasing doesn't make up for thin substance. And when you fact-check, hallucinations creep in. We've seen this most clearly with quote-finding. If a model generates a "participant quote" that never existed, trust breaks down instantly — and every output needs to be verified.
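One practical guardrail is an automated pre-check that flags any AI-surfaced quote that doesn't appear verbatim in the source transcripts before a human ever reviews it. Here is a minimal sketch; the normalization rules and sample data are illustrative assumptions, not a standard.

```python
# Illustrative guard for quote-finding: confirm each AI-surfaced "participant
# quote" actually appears in the transcripts. Normalization is an assumption.
import re

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace for matching."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def verify_quotes(quotes: list[str], transcripts: dict[str, str]) -> dict[str, bool]:
    """For each quote, return whether it appears in any transcript."""
    normalized = [normalize(t) for t in transcripts.values()]
    return {q: any(normalize(q) in t for t in normalized) for q in quotes}

transcripts = {"P3": "I just could not find the promo code field at checkout."}
quotes = [
    "I just could not find the promo code field",  # real
    "Checkout felt effortless and delightful",     # fabricated
]
print(verify_quotes(quotes, transcripts))
# -> True for the real quote, False for the fabricated one
```

A check like this catches fabricated quotes, but it can't judge whether a real quote is being used in the right context.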

That's why peer review and human oversight are non-negotiable. And it's not just about accuracy. Humans bring context AI can't replicate: the emotional heat of a frustrated user, the visual cues from a product demo, or the client's strategic goals discussed offhand in the room. No prompt can capture all of that.

AI can surface patterns and draft outlines. But only people can add the nuance, empathy, and context that make research truly insightful.

AI Can Scale Output. It Can’t Replace Judgment.

Clients should expect transparency about how we use AI — where it makes sense, where it doesn't, and what safeguards are in place. The wiring may stay under the hood, but the standards should be clear. At its best, AI helps us draft and surface patterns. But the depth, empathy, and context that make research meaningful still come from people, for people.

About the author


Max Symuleski

AI Product Manager

Max is the AI Product Manager and a former Senior UX Researcher at AnswerLab with 10 years of research experience in emerging tech and its social and cultural impacts.