The AI product value framework built for the era of accountability
The era of "just ship it and see" in AI is over. The board wants proof. The CFO wants a number. And the stat that's followed everyone into every boardroom for the past year, that 85% of AI projects fail because they have little or no P&L impact, isn't moving the conversation forward.
Jennifer Heape, Product Leadership, Voice Agents at DevRev, has been building in this space since 2017. At MXP London this year, she opened her talk by saying she was bored of that stat. Then she spent the session explaining what the conversation actually needs to be.
Her argument: proving value requires building the whole system around the feature, not just picking the right metric.
Value is harder to define than it looks
Product teams know that building costs have collapsed. A competitor can replicate your AI feature in a fraction of the time. So what sets you apart? Jennifer’s answer: precision. Knowing exactly what problem you're solving, for whom, and why it matters to your bottom line.
The bigger complication is that value is no longer a single number. She made this concrete with a customer service example. Containment is up, cost-per-contact is down, and the dashboard is green. But the calls that do escalate are now longer, angrier, and harder to resolve, and your highest-value customers are the most frustrated. Two true numbers. Which one wins?
When that kind of complexity hits, teams retreat to implementation metrics like model choice or evals because they feel concrete. But that's not where value is won or lost. "Without seeing the whole system," Jennifer said, "you default to a safe and generic build. That's why so many AI features kind of feel a little bit the same." Seeing that system is now the product leader's job.
➡️ If you're still working out which metrics apply, read this piece on How to know if your AI features are actually working.
The framework: measurement, governance, adoption
Jennifer's response to this complexity is a three-part leadership framework: measurement, governance, and adoption. She argued that these three sit at the core of where AI creates value and where it doesn't.
Measure against outcomes, not outputs
Any product leader knows you need to trace features back to outcomes the CFO cares about. The harder question now is which outcomes actually qualify.
Cost and revenue still matter. But AI has widened the aperture. Teams now need to assign metrics to things that don't have obvious numbers yet: workflow compression, decision quality, and, for agentic systems, cost per automated decision. These are real value signals, but "these metrics don't show up in your existing analytics," Jennifer said. "You have to build the measurement layer for them."
That's the harder implication: measurement has to be part of how the feature is conceived. Product and engineering teams need to work closely together from the start to understand not just what they're building, but how they'll know whether it's working.
“Understanding how you’re going to measure it needs to happen at the product definition stage, not the bit afterwards.”
Revenue from AI initiatives can also be deferred, sometimes across full cycles of innovation before it shows up as a business result. That means teams need to be looking at leading indicators and intermediate signals, not just waiting for the P&L to reflect what they built.
For a practical list of what to track, start with these 30+ AI product metrics that cover both the standard signals and the newer ones teams are starting to adopt for agentic and generative features.
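As a rough illustration of one of the newer metrics named above, cost per automated decision can be sketched as total spend divided by the number of decisions completed without a human handoff. This is a minimal sketch under stated assumptions: the `AgentRun` record, its field names, and the choice to count human review time as a cost are all hypothetical and do not come from Jennifer's talk.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AgentRun:
    """One hypothetical agent run; every field name here is illustrative."""
    inference_cost: float     # model/API spend for the run, in dollars
    review_cost: float        # human review time for the run, converted to dollars
    decisions_automated: int  # decisions completed without a human handoff


def cost_per_automated_decision(runs: Iterable[AgentRun]) -> Optional[float]:
    """Total cost across runs divided by total automated decisions.

    Returns None when no decisions were automated, so the caller
    can tell "no data yet" apart from a real cost of zero.
    """
    runs = list(runs)
    total_cost = sum(r.inference_cost + r.review_cost for r in runs)
    total_decisions = sum(r.decisions_automated for r in runs)
    if total_decisions == 0:
        return None
    return total_cost / total_decisions
```

Including review cost in the numerator is a design choice. It keeps the metric honest if an agent looks cheap to run but generates a lot of human checking.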
Govern the path to production
Jennifer described governance as "the corridor between the AI lab and getting stuff out in the real world." You own that corridor as a product team, or things go there to stall.
The blockers she sees today aren't model quality. That was the challenge back in 2017 and 2018, when getting conversational AI systems to perform at even a basic level required, in her words, "a lot of smoke and mirrors." That's not the problem anymore. The bottlenecks now are internal: legal, compliance, procurement, finance, and the need for predictable ROI.
The mistake teams make is treating governance as a hurdle to clear at the end of the process. Jennifer's argument is that it should be woven into the production workflow from as early as possible—not for bureaucratic reasons, but for practical ones. It's what lets you move fast when something goes wrong.
She recalled launching an AI product for Lysol. Legal and compliance were built into the team from the start. When COVID hit weeks after launch, completely reshaping the landscape for a product about keeping families safe, they were already one team. They could validate and amend content in real time against fast-moving WHO guidance. No scrambling, no reactive governance under pressure.
"Governance is something you build into a workflow from day one," she said. "It's not a hurdle that you cover off at the end. Having that structure is most valuable when something goes wrong."
This applies just as much to the data layer as it does to legal and compliance. Clean, well-governed data is what lets you see the system Jennifer was describing earlier, and it's what makes your measurement layer trustworthy in the first place.
✅ If your team is building the measurement layer from scratch, we recommend the posts on data governance for product teams and why you need data governance for AI-powered analytics.
Earn adoption, don't assume it
Jennifer saved what she called the most important part for last: adoption. And she started with the obvious.
"AI that nobody uses has no value," she said. "You can put whatever system in place, assign all these success metrics, but if you're not going to achieve adoption, it's just all pointless."
The less obvious point is that adoption exists on a spectrum, and most teams treat it like a switch.
She identified four behavioral modes that users fall into with AI features: rejection, vigilance, apathy, and full adoption. Rejection is the most visible. Vigilance is the dangerous one.
With vigilance, usage looks fine on a dashboard. People are clicking, generating outputs, engaging with the feature. But they fundamentally don't trust what the AI produces, so they're double-checking every result. You're not saving anyone time. You're adding a step. And it all reads as success in your usage metrics.
“Value stops being a single number and becomes a system, and a system riddled with tradeoffs.”
Apathy is quieter still. The feature exists. People don't reject it, but they haven't built it into their workflow either. Use is inconsistent, no habits are forming, and there's no compounding value. Sometimes this is a workflow integration problem—the feature isn't sitting in the right place in someone's daily work. Sometimes it's just that no one made the case for it beyond novelty.
"Pure usage data shows you rejection and adoption," Jennifer said. "It doesn't actually show you vigilance and apathy. They look like success on a chart."
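To make the four modes concrete, here is a minimal sketch of how a team might separate them once it has a signal beyond raw usage. Everything in it is an assumption for illustration: the `recheck_rate` input (the share of AI outputs a user manually verifies or redoes, which would have to come from surveys or edit tracking), the weekly activity counts, and both thresholds are hypothetical, not figures from the talk.

```python
def classify_adoption(weeks_active: int, weeks_observed: int, recheck_rate: float) -> str:
    """Map one user's signals to rejection, vigilance, apathy, or adoption.

    weeks_active   -- weeks in the window where the user touched the feature
    weeks_observed -- length of the observation window, in weeks
    recheck_rate   -- hypothetical share (0.0 to 1.0) of outputs the user re-verifies
    """
    if weeks_observed <= 0:
        raise ValueError("weeks_observed must be positive")

    if weeks_active == 0:
        return "rejection"   # visible in plain usage data

    if recheck_rate >= 0.8:  # illustrative threshold
        return "vigilance"   # usage looks healthy, but trust is missing

    consistency = weeks_active / weeks_observed
    if consistency < 0.5:    # illustrative threshold
        return "apathy"      # sporadic use, no habit forming

    return "adoption"
```

For example, `classify_adoption(8, 8, 0.9)` returns `"vigilance"` even though the user was active every week, which is exactly the case a usage chart would count as success.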
What's underneath vigilance, almost always, is trust. And trust is brittle. She cited numbers from high-volume B2C AI deployments that put it plainly:
- 68% of people aren't confident in how businesses use generative AI
- More than half have concerns about the ethics
- 81% believe businesses use AI primarily to save money, with no real benefit to service
That's the baseline. Adoption is something you earn against a skeptical starting position, not a default you start with. And that applies internally too. If the team building around an AI feature doesn't trust it, they won't get the value from it either.
This connects directly to the broader PM challenge of rethinking how you approach AI features. Mixpanel's The PM playbook for AI gets into this in more depth.
The role has expanded, not shrunk
Jennifer closed with a reframe that cut against the anxiety a lot of people in product roles are carrying right now.
She's heard more conversations lately about AI as a threat to the product function than she has about how the role is changing. Her take: that's backwards.
"AI hasn't shrunk the remit of product," she said. "It raised it. It's concentrating it on the role of the highest-value, hardest-to-replicate work."
The role now is to decide where value is, build the system that realizes it, and own the path between idea and adoption. Problem selection, measurement, governance, and adoption, all connected into one coherent system, not just the feature itself.
“The hardest part of AI was never intelligence. It’s integration.”
If AI projects are failing at the 85% rate that stat suggests, she argued, it's because somewhere along the way, the hard work didn't get done. Value wasn't defined precisely. Nobody owned the corridor to production. Real adoption wasn't earned. The failure is a product failure. And a product failure has a product fix.
If you want to see how Mixpanel approaches the measurement layer Jennifer described, connecting behavioral data to the business outcomes that matter, explore Mixpanel AI.

