Deterministic Until Proven Otherwise: Building AI Agents That Ship
Why the agents that make it to production start deterministic and earn their autonomy, and how to build the scaffolding that lets them ship.
Every engineering team is being told to “build AI agents.” Almost no one talks about the hard part: getting from a prototype that wows a room to something that holds up in production and actually moves the business.
This talk lays out a framework for building AI agents deliberately, drawn from real work at Nomad Health, a healthcare staffing company where trust and reliability aren’t negotiable. I walk through the progression from autocomplete to assistant to autonomous agent, and why skipping steps is exactly how a demo that impresses leadership becomes a product nobody actually uses.
The core idea is “deterministic until proven otherwise.” Every piece of agent logic starts as a rule or a structured API call, and only graduates to LLM inference once you can prove the model is necessary. From there I get into the microservice pattern for agent capabilities, human-in-the-loop as intentional UX rather than a safety net bolted on at the end, and why agents demand a different kind of observability: one where you’re watching correctness and not just uptime.
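To make the idea concrete, here is a minimal sketch of what "deterministic until proven otherwise" could look like in code. It is not the implementation used at Nomad Health. Every name in it (`Decision`, `decide`, `cancellation_policy_rule`, `stub_llm`, the `min_confidence` threshold) is hypothetical, and it assumes the model call can return some confidence value alongside its answer. Rules run first, the model is consulted only when no rule applies, low-confidence answers go to a person, and every step is written to a trace.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Decision:
    answer: str
    source: str        # "rule", "llm", or "human_review"
    confidence: float
    trace: list[str] = field(default_factory=list)


# A rule returns an answer when it applies, or None when it does not.
Rule = Callable[[str], Optional[str]]
# Hypothetical model interface: returns (answer, confidence in [0, 1]).
LLMCall = Callable[[str], tuple[str, float]]


def cancellation_policy_rule(query: str) -> Optional[str]:
    """Illustrative deterministic rule: plain keyword match, no model."""
    if "cancellation policy" in query.lower():
        return "See the cancellation policy page."
    return None


def stub_llm(query: str) -> tuple[str, float]:
    """Stand-in for a real model call so the sketch runs offline."""
    return ("Draft answer for: " + query, 0.55)


def decide(query: str, rules: list[Rule], llm: LLMCall,
           min_confidence: float = 0.8) -> Decision:
    trace: list[str] = []

    # 1. Deterministic path: the first rule that applies wins.
    for rule in rules:
        answer = rule(query)
        trace.append(f"rule {rule.__name__}: {'hit' if answer is not None else 'miss'}")
        if answer is not None:
            return Decision(answer, "rule", 1.0, trace)

    # 2. Model path: reached only when no rule applied.
    answer, confidence = llm(query)
    trace.append(f"llm confidence={confidence:.2f}")

    # 3. Human-in-the-loop: low confidence is routed to a person.
    if confidence < min_confidence:
        trace.append("below threshold, routed to human review")
        return Decision(answer, "human_review", confidence, trace)
    return Decision(answer, "llm", confidence, trace)


if __name__ == "__main__":
    rules = [cancellation_policy_rule]
    print(decide("What is the cancellation policy?", rules, stub_llm))
    print(decide("Can I swap shifts next week?", rules, stub_llm))
```

In this sketch the trace list is the part a non-technical stakeholder would read: it records which rules were tried, whether the model was consulted, and why a request ended up with a human.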
I’ll share stories from production. One is a hallucination we caught only because we had LLM-specific observability in place. Another is how my team ended up demoing Datadog’s LLM Observability to the entire company. It turns out that when non-technical stakeholders can see the decision traces and confidence scores behind an agent’s choices, an AI experiment becomes something people trust. You’ll leave with a framework for shipping agents that earn that trust from your users, your stakeholders, and your own team.

Matt Littlehale