The Irrelevance Window: Why PMs Who Don't Code Are Already Obsolete
Most PMs think AI expansion means more strategic thinking. More roadmap work. More "now we finally get to think three quarters ahead." It's the opposite. In an AI-enabled world, the gap between the PM who talks about…
Product leader & founder, ProductManagerHub
Writes on product strategy, AI decision quality, and PM leadership—grounded in real operating experience, not generic AI takes.
Key takeaways
- Your moat as a PM is understanding user needs—but that moat only compounds if you're the one translating it into implementation decisions others can't replicate.
- Staying hands-on (PRs, evaluation design, interaction analysis) is how you remain indispensable in an AI-driven world; staying distant is how you become decorative.
- The irrelevance window is real and finite; PMs who don't move into the implementation layer lose the asymmetry that made them valuable in the first place.
the window is closing (not opening)
Here's what's actually happening. Model capabilities are leveling up. Retrieval systems are getting better. Infrastructure is commoditizing. What that means in practice: the technology decisions—"should we use GPT or Claude?"—matter less and less.
The decisions that now matter are the ones only you can make.
Which signals should the model watch? What does "right" look like in this specific user context? When the system gets it wrong, what was it missing about the user's actual workflow? These questions don't have generic answers. They're embedded in how your customers really work—the things they don't say in user interviews, the workarounds they've built, the friction that's become invisible to them.
That's the knowledge only a PM who's in the weeds can spot.
A PM three steps removed from implementation—a PM writing tickets, reviewing decks, attending planning meetings—is watching that window close. Because someone else on your team is making those micro-decisions now. Someone else is deciding what the system learns. Someone else owns the interaction loop between user behavior and system refinement.
And every cycle where they own that loop instead of you, they're encoding their judgment into the product. Not your user understanding. Theirs.
the asset was never the roadmap
Stop and sit with this for a second.
What do people actually think a PM's value is? "Product strategy." "Saying what we should build next." "The roadmap." It's almost a joke at this point—how many roadmaps have you watched shift four times in a quarter? How many "priorities" survived first contact with reality?
That's not the asset. That's not even close.
The actual asset is something much harder to quantify: you understand what the user is trying to accomplish. You know why their current solution is broken. You've sat in their workflow long enough to see the places where they cobble together three apps to do one job. You know the cost of that friction because you've watched them curse about it.
In a world of generic AI, that knowledge doesn't matter much. The model doesn't know your user, so your user insight is just context—useful, but not defensible.
In a world of AI baked into your product, that knowledge becomes the actual differentiation. Because now the question is: does your system understand what this user is really trying to do? Does it know what "done right" looks like? Can it predict what the user will do next?
All of those questions live in the space between "we know the user" and "the system behaves accordingly." That's the space a hands-on PM occupies. That's where the moat gets built.
And here's the part most PMs don't fully reckon with: if you're not in that space, someone else is building the moat using their judgment about what the user actually needs. They might be right. They might be wrong. But either way, it's not your knowledge getting encoded into the product anymore. It's theirs.
stay in the implementation layer
This isn't about becoming a full-time engineer. It's not a regression. It's a reframing.
You need to stay operationally close to where decisions get made. That means: submit PRs. Not large ones. Small, focused ones. Design the interaction patterns. Write the evaluation criteria for the model. Review the logs and spot the pattern nobody else has flagged yet.
Here's the rhythm that actually works.
You're shipping a feature that uses an LLM to summarize user activity. You have a hypothesis about what "useful summary" means for your user—based on workflows you've watched, conversations you've had, the complaints you've heard. You write that down as evaluation criteria: "Is the summary useful enough that the user acts on it?" "Does it surface information they didn't already know?" "Does it catch the thing they always miss?"
Then you test it with real users. You read the logs. You see the summaries the system generated. You compare them against your criteria.
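Here's a rough sketch of what "compare them against your criteria" can look like once the criteria are written down. Everything in it is an assumption made for illustration: the `SummaryLog` record, its field names, and the four sample rows are made up, and your own logs will look different. The point is only that each criterion becomes something you can count.

```python
from dataclasses import dataclass

# Hypothetical log record: one generated summary plus what was observed about it.
@dataclass
class SummaryLog:
    summary_id: str
    user_acted: bool         # did the user act on the summary?
    new_information: bool    # reviewer judgment: did it surface something new?
    caught_usual_miss: bool  # reviewer judgment: did it catch the thing they always miss?

# The three criteria from the text, each mapped to a field on the record.
CRITERIA = {
    "user acts on it": "user_acted",
    "surfaces new information": "new_information",
    "catches the usual miss": "caught_usual_miss",
}

def pass_rates(logs):
    """Return the fraction of logged summaries that satisfy each criterion."""
    if not logs:
        return {name: 0.0 for name in CRITERIA}
    return {
        name: sum(getattr(log, field) for log in logs) / len(logs)
        for name, field in CRITERIA.items()
    }

# Made-up sample rows, not real results.
logs = [
    SummaryLog("s1", True, True, False),
    SummaryLog("s2", False, False, False),
    SummaryLog("s3", True, False, False),
    SummaryLog("s4", False, True, False),
]

rates = pass_rates(logs)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
```

A criterion that sits near zero while the others look fine is the kind of signal worth chasing into individual summaries.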
That's when you spot it: the system is summarizing what happened, but it's not capturing what matters. It's missing the context—the user's goals that week, the project timeline, the person who was supposed to follow up but didn't. The system doesn't know those things because the product doesn't give it access to them.
You don't wait for engineering to figure this out. You spot it. You write a small PR that surfaces the missing context to the model. You test it with users again. You verify it helped.
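A "small PR that surfaces the missing context" can be as modest as the sketch below. It assumes the feature builds a text prompt from an activity list. The `UserContext` fields and the `build_summary_prompt` function are hypothetical names chosen for this example, not anything from a real codebase or library.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical container for the context the text says the model was missing.
@dataclass
class UserContext:
    weekly_goals: List[str] = field(default_factory=list)
    project_deadline: str = ""
    pending_followups: List[str] = field(default_factory=list)

def build_summary_prompt(activity: List[str], context: Optional[UserContext] = None) -> str:
    """Build the summarization prompt, adding user context when it is available."""
    lines = ["Summarize this user's recent activity."]
    if context is not None:
        if context.weekly_goals:
            lines.append("Goals this week: " + "; ".join(context.weekly_goals))
        if context.project_deadline:
            lines.append("Project deadline: " + context.project_deadline)
        if context.pending_followups:
            lines.append("Follow-ups still open: " + "; ".join(context.pending_followups))
        lines.append("Prioritize anything that affects these goals, deadlines, or follow-ups.")
    lines.append("Activity:")
    lines.extend(f"- {event}" for event in activity)
    return "\n".join(lines)

# Illustrative inputs only.
activity = ["Edited onboarding spec", "Commented on design review"]
context = UserContext(
    weekly_goals=["ship onboarding flow"],
    project_deadline="Friday",
    pending_followups=["legal review from Sam"],
)
prompt = build_summary_prompt(activity, context)
print(prompt)
```

Keeping `context` optional means the old behavior is unchanged when nothing is passed, which keeps the change small and easy to review.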
That's not extra work. That's the actual work. And the moment you step out of that loop, someone else is making those micro-decisions. Someone else is deciding what context matters.
what "staying relevant" actually means (the harder version)
Let's be direct about what's really at stake here.
There are two types of PMs in an AI-driven organization. Type One decides what the product does. When they learn something new—from logs, from users, from their own testing—they can change it. Fast. They can say "I see the problem, here's the fix," and it's live in a few hours because they're operationally close enough to move.
Type Two requests what the product does. They find out three months later that nobody understood the brief, or that the feature solved a different problem than intended, or that it broke something they didn't anticipate. By then the narrative is set. The team has moved on. The window to adjust is closed.
Which type keeps their leverage with leadership? Which type is still making decisions about the roadmap? Which type do you want on your team?
In AI products especially, you need velocity. You need to move fast when something isn't working—when the model is missing context, when the interaction pattern is confusing, when the system is solving the wrong part of the user problem. And you can't move fast if you're three levels removed from where decisions get made.
You can't move fast if you're waiting for PRs to get reviewed by people who don't own the user intent.
The PM who codes (who's in the logs, who's written the evaluation criteria, who's submitted a fix based on user feedback) is operating at a different speed. Not because they type faster. Because they've collapsed the loop between "I see a problem" and "it's fixed." They've removed the translation layer. They've made themselves indispensable because they're the only one who is both moving the needle on user experience and able to explain why.
the three-part play (how to get there without breaking your role)
This isn't a reorg. You don't become an engineer. Your job title doesn't change. You just change where you work.
First: Own one AI feature end-to-end. Pick something real. Not a pet project. Something your team is shipping that uses a model to solve a user problem. Own the evaluation criteria. Not "does the feature work?" but "Does the user act on this?" "Is this faster than their workaround?" "What did we miss?" Write it down. You're going to measure it.
Second: Spend 4–6 weeks in deep user research and logged interaction validation. Talk to users. Watch them use the feature. Read the logs. See which summaries people acted on. See which ones they ignored. See where the system got the user's goal wrong. See where it nailed it. You're not looking for "the feature is good" or "the feature is bad." You're looking for the pattern. Where does the system consistently miss? What context does it need? What does "right" actually look like in this context?
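Looking for "the pattern" in the logs usually starts with a simple cut: group interactions by situation and see where users ignore the output. The sketch below assumes each log row carries a situation tag and an acted-or-ignored flag. The tags, the rows, and the 0.5 threshold are all invented for illustration.

```python
from collections import defaultdict

def acted_rate_by_tag(rows):
    """Return {tag: share of summaries the user acted on} for (tag, acted) rows."""
    counts = defaultdict(lambda: [0, 0])  # tag -> [acted, total]
    for tag, acted in rows:
        counts[tag][0] += int(acted)
        counts[tag][1] += 1
    return {tag: acted / total for tag, (acted, total) in counts.items()}

def weakest_segments(rows, threshold=0.5):
    """Return tags whose acted-on rate falls below the threshold, worst first."""
    rates = acted_rate_by_tag(rows)
    weak = [tag for tag, rate in rates.items() if rate < threshold]
    return sorted(weak, key=lambda tag: rates[tag])

# Made-up log rows: (situation tag, did the user act on the summary?)
interactions = [
    ("routine_update", True),
    ("routine_update", True),
    ("routine_update", False),
    ("missed_followup", False),
    ("missed_followup", False),
    ("missed_followup", True),
    ("missed_followup", False),
    ("deadline_week", False),
    ("deadline_week", True),
]

tag_rates = acted_rate_by_tag(interactions)
weak = weakest_segments(interactions)
print(tag_rates)
print(weak)
```

A segment that shows up as weak is a prompt for the qualitative work, not a conclusion: go read those specific summaries and ask the users what was missing.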
Third: When you spot the gap, write a PR. Or write a spec detailed enough that an engineer could turn it into a PR without guessing. You've read the logs. You've talked to users. You know what the system is missing. You write a small change: surface more context, tweak the prompt, or adjust the signal the model is watching. Get it merged. Test it with users. Verify it helped.
Then repeat.
Your team stops seeing you as someone asking them to build blindly. They see you as someone building alongside them, using live user signal to refine what's actually working. Your judgment becomes embedded in the product because you're making decisions based on evidence, not assumptions.
the risk if you don't (the red flag)
Here's what happens if you stay distant.
You're a PM at a company shipping AI features. You haven't personally validated the model's outputs against real user workflows in the past month. You're waiting for metrics. You're waiting for the feedback surveys. You're waiting for engineering to tell you if it's working.
Meanwhile, a PM at a competitor is reading the logs themselves. They're identifying the pattern. They're writing the spec. They're asking the right questions about what the system should know. Their feature is getting better every week. Yours is shipping, and then six weeks later you find out it's solving the wrong problem.
By then the window is closed.
The narrative is set. The feature is on the roadmap. You can't admit that your understanding of the user need was incomplete because that means you weren't paying attention in the first place. So the feature stays, half-baked. The team moves on. The competitor gets stronger.
You didn't lose because your feature was bad. You lost because you weren't in the loop. You didn't see the gap until it was too late to close it.
the question underneath
So here's what you're sitting with now.
Do you know if the AI feature your team shipped last quarter is actually solving the user problem? Not "is it shipped" and not "are people using it"—do you know if it's solving the right thing?
If you had to explain why the system got it wrong in one specific case, could you? Would you point to something you saw in the logs, or would you be guessing?
And here's the harder version: if your answer to those questions was "I'd have to ask someone else," do you know what that costs you?
Good luck, friends.