The 70/30 Method: Building With AI Agents Without Betting the Company on Them

Jun 15, 2026 · 4 min read · Roger Stringer

There are two ways AI projects fail, and they look like opposites.

The first is AI theater: a slick demo that wows everyone in the room and never makes it to production, because nobody did the unglamorous 30% — the data, the edge cases, the question of who's accountable when it's wrong.

The second is AI recklessness: handing an autonomous system real decisions with no judgment wrapped around it, then acting surprised when it confidently does the wrong thing at scale.

I've spent the last few years building agentic systems that actually run in production, and the thing that keeps them on the road is a ratio I keep coming back to: roughly 70% of the work goes to AI agents, and 30% stays with senior judgment. The 30% is the part everyone wants to skip. It's also the part that decides whether the other 70% is worth anything.

What the 70 actually covers

The 70 is everything AI is genuinely, unreasonably good at — once it has the right context.

Reading and summarizing messy inputs. Drafting. Classifying. Pulling structure out of unstructured data. Running multi-step workflows that used to need a person babysitting them. Qualifying a lead, routing a ticket, enriching a record, generating a solid first pass at almost anything.

When people say "AI isn't about prompting," this is what they mean. The model is the easy part. The work is getting the right data, in the right shape, at the right moment in front of it. Get that context layer right and agents will handle the bulk of a workflow reliably enough to put in front of real customers.
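To make "the right data, in the right shape, at the right moment" a little more concrete, here is a minimal sketch of a context-assembly step. Everything in it is an assumption for illustration: the `SourceRecord` type, the source names, and the seven-day freshness window are hypothetical, not taken from any particular system. The idea is only that missing or stale data gets named explicitly instead of being passed silently to the model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class SourceRecord:
    """One piece of context from a hypothetical upstream source."""
    source: str
    value: Optional[str]
    fetched_at: datetime


def build_context(records, max_age, now=None):
    """Split records into usable context and a list of named problems."""
    now = now or datetime.now(timezone.utc)
    usable, problems = {}, []
    for rec in records:
        if rec.value is None:
            problems.append(f"{rec.source}: missing")
        elif now - rec.fetched_at > max_age:
            problems.append(f"{rec.source}: stale")
        else:
            usable[rec.source] = rec.value
    return usable, problems


# Illustrative usage with made-up records.
now = datetime(2026, 6, 15, tzinfo=timezone.utc)
records = [
    SourceRecord("crm", "Acme Corp, 200 seats", now - timedelta(hours=1)),
    SourceRecord("billing", None, now),
    SourceRecord("web_activity", "visited pricing", now - timedelta(days=30)),
]
usable, problems = build_context(records, timedelta(days=7), now=now)
print(usable)
print(problems)
```

What the agent does with the `problems` list (ask, fall back, or escalate) is a design decision, and the sketch deliberately leaves it open.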

That's the 70. It's a lot. It's most of the labor. It is not, however, the part you can't afford to get wrong.

What the 30 protects

The 30% is judgment, and it doesn't delegate. It's:

Architecture. What the agent can touch, what it can't, where the boundaries are, and what happens when it fails. An agent with access to your production database and no guardrails isn't a feature — it's an incident waiting for a date.

Data and context design. The difference between a demo and a system is almost always here. Which sources, how they're shaped, how fresh they need to be, and what the agent should do when the data is missing or contradicts itself.

Evaluation. How you know it's working — not "it looked good in the demo," but measured against real cases, with the failure modes named in advance. If you can't tell me how you'll catch the agent being wrong, you're not ready to ship it.

The call to ship. Someone senior decides this is good enough to put in front of customers, and owns that decision. That's not a model's job.

Skip the 30 and you get AI theater or AI recklessness. There's no third outcome.
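For the evaluation piece, one way to picture "measured against real cases, with the failure modes named in advance" is a small harness like the one below. It is a sketch under assumptions: the case format, the failure-mode labels, and the `toy_agent` stand-in are all hypothetical, and a real agent would be a model call rather than a keyword check. Note that it reports whether a quality bar was met; it does not decide to ship.

```python
from collections import Counter


def evaluate(agent, cases, min_accuracy):
    """Run the agent over labeled cases; report accuracy and failures by named mode."""
    if not cases:
        raise ValueError("need at least one labeled case")
    failures = Counter()
    correct = 0
    for case in cases:
        if agent(case["input"]) == case["expected"]:
            correct += 1
        else:
            # Each case carries the failure mode it was written to catch.
            failures[case["failure_mode"]] += 1
    accuracy = correct / len(cases)
    return {
        "accuracy": accuracy,
        "failures": dict(failures),
        "meets_bar": accuracy >= min_accuracy,
    }


def toy_agent(text):
    """Stand-in for a real agent: a naive keyword rule, used only for illustration."""
    return "qualified" if "budget" in text else "unqualified"


cases = [
    {"input": "we have budget approved", "expected": "qualified",
     "failure_mode": "missed_buyer"},
    {"input": "student asking about budget tips", "expected": "unqualified",
     "failure_mode": "false_positive_keyword"},
    {"input": "just browsing", "expected": "unqualified",
     "failure_mode": "false_positive_keyword"},
    {"input": "ready to sign this quarter", "expected": "qualified",
     "failure_mode": "missed_buyer"},
]
report = evaluate(toy_agent, cases, min_accuracy=0.9)
print(report)
```

The useful part is less the accuracy number than the breakdown: a report that says which named failure mode is firing tells you what to fix in the data or the guardrails.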

What this looks like in practice

Take something concrete: an automated lead-qualification system. The agents read every inbound lead, score it against your criteria, enrich it from a handful of sources, route it to the right person, and draft a tailored first follow-up. Hundreds a day, nobody babysitting. That's the 70 — and it's genuinely transformative for a sales team drowning in inbound.

The 30 is everything around it. The decision about what "qualified" actually means for your business. The guardrails so a weird input doesn't fire a tone-deaf email at your biggest prospect. The evals that catch quality drift before a customer does. The human approval gate on the cases high-stakes enough to warrant one. None of that is glamorous. All of it is why the system is still running six months later instead of quietly switched off.
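Here is a rough sketch of how the guardrail and the approval gate might sit around the agent's output in that lead-qualification example. The field names, the 0.6 score cutoff, the $50,000 approval threshold, and the `{{` placeholder check are all assumptions made up for illustration; the real values are exactly the judgment calls the 30 exists to make.

```python
from dataclasses import dataclass


@dataclass
class ScoredLead:
    email: str
    score: float     # agent's qualification score, assumed to be 0.0 to 1.0
    deal_size: int   # estimated annual value in dollars (hypothetical field)
    draft: str       # agent-written first follow-up


def route(lead, min_score=0.6, approval_threshold=50_000):
    """Return 'not_qualified', 'needs_approval', or 'auto_send' for one lead."""
    if lead.score < min_score:
        return "not_qualified"
    if not lead.draft.strip() or "{{" in lead.draft:
        # Guardrail: an empty draft or an unfilled template placeholder
        # never goes out without a person looking at it.
        return "needs_approval"
    if lead.deal_size >= approval_threshold:
        # High-stakes leads always get a human approval step.
        return "needs_approval"
    return "auto_send"


# Illustrative usage with made-up leads.
leads = [
    ScoredLead("ana@example.com", 0.9, 5_000, "Hi Ana, thanks for reaching out."),
    ScoredLead("bo@example.com", 0.9, 250_000, "Hi Bo, thanks for reaching out."),
    ScoredLead("cy@example.com", 0.2, 250_000, "Hi Cy, thanks for reaching out."),
    ScoredLead("di@example.com", 0.9, 5_000, "Hi {{first_name}}, thanks!"),
]
for lead in leads:
    print(lead.email, route(lead))
```

The agent still does the volume. The function only decides which of its outputs are allowed to leave the building without a person signing off.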

Why the ratio protects you

The 70/30 split isn't a compromise between "use AI" and "don't." It's how you get the upside of aggressive automation without taking on risk you can't see.

The 70 is where the leverage is — let the agents do the volume. The 30 is where the judgment is, and judgment is exactly what a model doesn't have. Keep humans in the loop on the decisions that matter and you can afford to be far more aggressive with automation everywhere else, because you've built the system that catches it when it's wrong.

Most teams get the ratio backwards. They pour their energy into the model and the demo, and treat the data, the guardrails, and the evals as cleanup. Then they wonder why the impressive prototype never became a dependable system.

Build the 30 first. The 70 is the easy part.

If you're trying to get an AI project from "great demo" to "runs in production and we trust it," that's the work I do. Let's talk.