What I've seen work. What I've seen fail. No frameworks dressed as insight.
Most AI pilots in enterprise start the same way. Someone in leadership sees a demo, gets excited, and a team gets asked to "find use cases." Three months later there's a nice deck and nothing in production.
This engagement was different because we started with a question: where are people doing the same thing more than ten times a week, and hating it?
The answer was three workflows buried in the ops team. First: weekly performance reports. An analyst was pulling data from four dashboards, copying it into slides, and writing commentary. Every Monday, four hours gone. We built a pipeline that pulled the data automatically, generated the narrative using an LLM, and dropped a draft into the team's Slack channel by 7am. The analyst still reviewed and edited — but the job went from four hours to forty minutes.
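The shape of that pipeline is simple enough to sketch. What follows is illustrative rather than the production code: the dashboard endpoints, the metrics schema, the Slack webhook, and the model name are all placeholders.

```python
"""Rough shape of the Monday-report pipeline (illustrative names throughout)."""
import requests
from openai import OpenAI  # assumes the OpenAI SDK; any LLM client would do

# Hypothetical export APIs for the four dashboards the analyst was copying from.
DASHBOARDS = {
    "sales": "https://metrics.example.com/api/sales/weekly",
    "support": "https://metrics.example.com/api/support/weekly",
    "churn": "https://metrics.example.com/api/churn/weekly",
    "usage": "https://metrics.example.com/api/usage/weekly",
}
SLACK_WEBHOOK = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

def pull_metrics() -> dict:
    # One JSON blob per dashboard; auth and retries omitted for brevity.
    return {name: requests.get(url, timeout=30).json() for name, url in DASHBOARDS.items()}

def draft_commentary(metrics: dict) -> str:
    prompt = (
        "Write the weekly ops performance summary. Flag week-on-week changes "
        "over 10%, keep it under 300 words, and note anything that needs a decision.\n\n"
        f"Metrics: {metrics}"
    )
    resp = OpenAI().chat.completions.create(
        model="gpt-4o",  # whichever model the team standardises on
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def post_draft(text: str) -> None:
    # Marked as a draft so nobody mistakes it for the finished report.
    requests.post(
        SLACK_WEBHOOK,
        json={"text": f"*Monday draft: review before circulating*\n{text}"},
        timeout=30,
    )

if __name__ == "__main__":  # run from a scheduler before 7am on Mondays
    post_draft(draft_commentary(pull_metrics()))
```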
Second: vendor contract summaries. Every time a new contract landed, someone in procurement had to read sixty pages and pull out the key terms for the legal team. We built a workflow that extracted obligations, renewal dates, penalty clauses, and SLA commitments into a structured template. What used to take a full afternoon became a ten-minute review.
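The contract workflow is the same pattern with a stricter output contract. Another sketch, assuming the same LLM client as above; the field names are illustrative, and the real template was whatever the legal team actually needed.

```python
"""Sketch of the contract-summary extraction (illustrative field names)."""
import json
from dataclasses import dataclass
from openai import OpenAI  # same assumption as the report pipeline

@dataclass
class ContractSummary:
    obligations: list[str]
    renewal_date: str
    penalty_clauses: list[str]
    sla_commitments: list[str]

def summarise_contract(contract_text: str) -> ContractSummary:
    prompt = (
        "From the contract below, extract JSON with keys 'obligations', "
        "'renewal_date', 'penalty_clauses' and 'sla_commitments'. Quote clause "
        "numbers where possible and use null for anything not stated.\n\n"
        + contract_text  # sixty-page documents may need chunking; omitted here
    )
    resp = OpenAI().chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},  # forces parseable output
        messages=[{"role": "user", "content": prompt}],
    )
    data = json.loads(resp.choices[0].message.content)
    return ContractSummary(
        obligations=data.get("obligations") or [],
        renewal_date=data.get("renewal_date") or "not stated",
        penalty_clauses=data.get("penalty_clauses") or [],
        sla_commitments=data.get("sla_commitments") or [],
    )
```

The structured template is the point: procurement reviews a filled-in form, not a rewritten contract.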
Third: customer escalation triage. Support tickets flagged as escalations were being manually reviewed by a senior manager to decide routing. We built a classifier that read the ticket, assessed severity against historical patterns, and suggested a route with a confidence score. The manager still made the call — but instead of reading every ticket in full, they were scanning recommendations and approving or overriding. Thirty minutes a day became five.
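Nothing exotic is needed here either. One plausible version is a plain text classifier trained on how past escalations were actually routed; whether you do it this way or with an LLM, the piece that matters is the confidence score, because that is what tells the manager when to slow down and read the whole ticket. Column names below are illustrative.

```python
"""One way to build the triage suggester: a text classifier over past escalations."""
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical export: one row per past escalation with the route the
# senior manager ultimately chose.
history = pd.read_csv("escalation_history.csv")  # columns: ticket_text, route

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
model.fit(history["ticket_text"], history["route"])

def suggest_route(ticket_text: str) -> tuple[str, float]:
    """Return (suggested route, confidence); the manager approves or overrides."""
    probs = model.predict_proba([ticket_text])[0]
    best = probs.argmax()
    return model.classes_[best], float(probs[best])

route, confidence = suggest_route(
    "Customer reports production outage after last night's upgrade, SLA at risk"
)
print(f"Suggested route: {route} ({confidence:.0%} confident)")
```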
None of these were glamorous. None of them would make a good keynote. But fifteen hours a week back to the business, running in production within six weeks of starting. That's what AI actually looks like when it works.
Every product org I walk into right now has the same conversation happening somewhere: "How should we be using AI?" Usually it's framed as a technology question. It's not. It's a prioritisation question — and that makes it a product management problem.
Where AI is genuinely changing how I work: requirements analysis. I can feed an AI a messy Confluence page, three Slack threads, and a recording transcript, and get a structured brief back in minutes. It doesn't replace the thinking — I still have to decide what matters — but it compresses the synthesis work dramatically. What used to take me a morning now takes thirty minutes.
Stakeholder communication. First drafts of status updates, decision logs, even tricky emails where tone matters. AI gets me eighty percent of the way there. I edit, sharpen, and send. The time saving across a week is significant when you're managing multiple workstreams.
Data interrogation. When I need to understand what's happening in a dataset before a planning session, AI lets me ask questions in plain English and get answers I'd previously have waited two days for an analyst to pull. I'm not replacing the analyst — I'm arriving at the conversation with better questions.
Where it's still theatre: strategy. I've seen teams ask AI to generate product strategies, roadmap priorities, even OKRs. The output looks polished and sounds reasonable. That's exactly the problem. It pattern-matches against what a strategy document should look like without understanding your market, your constraints, your team's actual capacity, or the politics that will determine whether anything ships. A convincing-looking strategy that nobody believes in is worse than no strategy at all.
The other trap is using AI outputs without editing them. The moment your stakeholders start recognising the AI voice in your documents — the slightly over-structured paragraphs, the generic confidence — you've lost credibility. AI should be invisible in your output. If people can tell you used it, you didn't use it well enough.
The product leaders who are getting this right treat AI as leverage on execution, not a shortcut on judgment. The thinking is still yours. The decisions are still yours. AI just means you spend less time on the admin around those decisions and more time on the decisions themselves.
Every backlog I inherit tells the same story: good intentions, no curation. Hundreds of tickets, half of them duplicates, a quarter of them obsolete, and somewhere in the middle the things that actually matter.
I run the same audit on day one of every engagement. It takes about three hours and it's the highest-leverage thing I do in the first week.
Step one: age check. Anything older than six months that nobody has touched gets moved to an archive column. Not deleted — archived. If it was important, someone would have mentioned it in the last six months. If nobody screams when it disappears from the active board, it wasn't important. They never scream.
Step two: duplicate sweep. I sort by theme and read titles. You'd be surprised how often the same request exists three times, filed by three different people, with slightly different descriptions. Merge them into one ticket with the best description and link the requestors. These first two passes are mechanical enough to script; a rough sketch follows step five.
Step three: the "why" test. Every ticket in the top twenty gets this question: can someone explain in one sentence why this matters to a customer or the business? If the answer is "because [stakeholder] asked for it" — that's not a why. That's a who. Push back until you get a real reason or move it down.
Step four: size reality check. Any ticket estimated at more than two weeks is not a ticket. It's a project. Break it down or admit it needs a proper brief before it goes near a sprint.
Step five: dependency map. I draw out which tickets are blocked by what. This always reveals the actual bottleneck — usually one team, one approval, or one technical decision that's holding up five things downstream. Fix that and the board starts moving.
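Here is that rough script for steps one and two, run against a tracker export. Column names are whatever your tracker calls them, and everything from step three onwards stays a judgment call.

```python
"""Rough script for steps one and two of the audit (illustrative column names)."""
from difflib import SequenceMatcher
import pandas as pd

# Hypothetical export from the tracker: key, title, last_updated.
backlog = pd.read_csv("backlog_export.csv", parse_dates=["last_updated"])

# Step one: anything untouched for six months goes on the archive list.
cutoff = pd.Timestamp.now() - pd.DateOffset(months=6)
stale = backlog[backlog["last_updated"] < cutoff]
active = backlog[backlog["last_updated"] >= cutoff]
stale[["key", "title"]].to_csv("archive_candidates.csv", index=False)

# Step two: flag near-duplicate titles in what's left for a manual merge.
# The O(n^2) comparison is fine for a few hundred active tickets.
titles = list(active[["key", "title"]].itertuples(index=False))
for i, (key_a, title_a) in enumerate(titles):
    for key_b, title_b in titles[i + 1:]:
        if SequenceMatcher(None, str(title_a).lower(), str(title_b).lower()).ratio() > 0.8:
            print(f"Possible duplicate: {key_a} / {key_b}: {title_a!r} vs {title_b!r}")
```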
After this, what you're left with is a backlog that's half the size, twice as clear, and actually represents what needs to happen next. The team can finally see the work instead of drowning in it.
The pattern is so consistent it's almost boring. A company announces an AI strategy. A cross-functional team is assembled. They identify use cases, run a pilot, build a proof of concept. The demo goes well. Leadership is impressed. Then nothing happens.
Six months later the pilot is still a pilot. The team has moved on to other priorities. The proof of concept sits in a staging environment that nobody maintains. The company announces a new AI initiative and the cycle starts again.
I've seen this play out across multiple teams and business units. The problem is never the technology. It's always the same three things.
First: the pilot was never scoped for production. Proofs of concept are designed to prove something is possible. Production systems are designed to be reliable, maintainable, and integrated into existing workflows. These are fundamentally different briefs with different requirements, different timelines, and different stakeholders. Most pilots skip the boring questions — who owns this when it's live? How does it handle errors? What happens when the data changes? — because those questions make the timeline longer and the demo less exciting.
Second: there's no product owner. AI initiatives get sponsored by a senior leader and staffed by whoever's available. Nobody's job is to take this thing from experiment to operating capability. Without someone accountable for adoption, integration, and iteration, the pilot is an orphan the moment the demo is over.
Third: success is measured in impressiveness, not impact. The pilot that wows in a boardroom is not always the one that saves the most time or money. The most valuable AI applications I've shipped have been deeply unglamorous — data cleaning, report generation, document triage. They didn't make good demos. They made good businesses.
The fix is to stop treating AI as a special category. Treat it like any other product initiative. Define the problem. Scope for production from day one. Assign an owner. Measure outcomes, not outputs. If you wouldn't ship a feature without a product manager, don't ship an AI workflow without one either.
Discovery is essential. I'm not arguing against it. But I've watched too many teams use discovery as a place to hide from the discomfort of making a decision.
The pattern looks like this. The team identifies a problem worth solving. Someone suggests doing discovery. Great — that's the right move. They run user interviews. They map journeys. They synthesise findings. All good so far. Then someone says "we need more data." They run a survey. They do competitor analysis. They build a service blueprint. They present back to stakeholders, who ask good questions, which triggers another round of research.
Eight weeks have passed. The slide deck is beautiful. Nothing has shipped.
The trap is that research always feels productive. You're learning things. You're being rigorous. Nobody gets criticised for being thorough. But at some point, the marginal value of the next interview or the next analysis drops below the cost of not building anything. The team knows enough to make a bet — they just don't want to, because making a bet means you can be wrong.
The fix is putting a timebox on discovery before it starts. Two weeks. Maybe three if the problem space is genuinely new. At the end of that window, you make a call with what you've got. Not a perfect call — a good enough call. Then you build something small, ship it, and learn from real usage instead of hypothetical scenarios.
The best product teams I've worked with are fast at this. They move from "we think this is true" to "let's find out" in days, not months. They treat discovery as fuel for decisions, not a substitute for them.
If your discovery phase is longer than your build phase, something has gone wrong. Ship the smallest version of the thing. The market will tell you more in a week than another month of research ever will.