12-factor Agents: Principles for building reliable LLM applications

The spirit of 12 Factor Apps lives on in the form of 12-factor agents, a project open to public feedback and contributions at https://github.com/humanlayer/12-factor-agents. I'm Dex. I've been working with AI agents for a while, and I've tried every agent framework out there. In this article, I'll share my journey and the principles that have led me to build reliable LLM-powered software.

I've talked to many strong founders in and out of YC, all building impressive things with AI. Most of them are rolling their own stack, which is surprising given the plethora of frameworks available. I've found that most products claiming to be "AI Agents" aren't as agentic as they seem. They're mostly deterministic code with LLM steps sprinkled in at just the right points to make the experience magical.

In other words, the agents that make it to production don't follow the "here's your prompt, here's a bag of tools, loop until you hit the goal" pattern. They're made up mostly of ordinary software. So I set out to answer: What are the principles we can use to build LLM-powered software that is actually good enough for production customers? Welcome to 12-factor agents.

As every Chicago mayor since Daley has consistently plastered all over the city's major airports, we're glad you're here. Special thanks to @iantbutler01, @tnm, @hellovai, @stantonk, @balanceiskey, @AdjectiveAllison, @pfbyjy, @a-churchill, and the SF MLOps community for early feedback on this guide.

The Short Version: The 12 Factors

Even if LLMs continue to get exponentially more powerful, there will be core engineering techniques that make LLM-powered software more reliable, more scalable, and easier to maintain.

For a deeper dive on my agent journey and what led us here, check out A Brief History of Software - a quick summary here:

We're gonna talk a lot about Directed Graphs (DGs) and their Acyclic friends, DAGs. I'll start by pointing out that...well...software is a directed graph. There's a reason we used to represent programs as flow charts.

Around 20 years ago, we started to see DAG orchestrators become popular. We're talking classics like Airflow and Prefect, some predecessors, and some newer ones like Dagster, Inngest, and Windmill. These followed the same graph pattern, with the added benefit of observability, modularity, retries, administration, etc.
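To make the orchestrator idea concrete, here's a minimal sketch of what "a DAG with retries" boils down to. It is not how Airflow, Prefect, or any other real orchestrator is implemented; the `Task` type, the `runDag` function, and the extract/transform/load task names are all made up for illustration.

```typescript
// Hypothetical task shape for illustration; not the API of any real orchestrator.
type Task = {
  name: string;
  dependsOn: string[]; // names of tasks that must finish first
  run: () => void;
  retries?: number; // extra attempts allowed after the first failure
};

// Runs every task after its dependencies and returns the execution order.
function runDag(tasks: Task[]): string[] {
  const byName: Record<string, Task> = {};
  for (const task of tasks) byName[task.name] = task;

  const state: Record<string, "visiting" | "done"> = {};
  const order: string[] = [];

  const visit = (name: string): void => {
    if (state[name] === "done") return;
    // Seeing a task that is still in progress means the graph is not acyclic.
    if (state[name] === "visiting") throw new Error(`cycle detected at "${name}"`);
    const task = byName[name];
    if (!task) throw new Error(`unknown task "${name}"`);

    state[name] = "visiting";
    for (const dep of task.dependsOn) visit(dep);

    // Retry the task itself; give up once the attempts are exhausted.
    const attempts = (task.retries ?? 0) + 1;
    for (let attempt = 1; attempt <= attempts; attempt++) {
      try {
        task.run();
        break;
      } catch (err) {
        if (attempt === attempts) throw err;
      }
    }
    state[name] = "done";
    order.push(name);
  };

  for (const task of tasks) visit(task.name);
  return order;
}

// Usage: "transform" fails once, is retried, and then succeeds.
let transformCalls = 0;
const order = runDag([
  { name: "load", dependsOn: ["transform"], run: () => {} },
  {
    name: "transform",
    dependsOn: ["extract"],
    retries: 1,
    run: () => {
      transformCalls++;
      if (transformCalls === 1) throw new Error("transient failure");
    },
  },
  { name: "extract", dependsOn: [], run: () => {} },
]);
console.log(order.join(" -> ")); // extract -> transform -> load
```

The thing to notice is that every node and every edge is written down by an engineer ahead of time. The orchestrator only decides when to run things, never what to run.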

I'm not the first person to say this, but my biggest takeaway when I started learning about agents was that you get to throw the DAG away. Instead of software engineers coding each step and edge case, you can give the agent a goal and a set of transitions, and let the LLM make decisions in real time to figure out the path.

The promise here is that you write less software: you just give the LLM the "edges" of the graph and let it figure out the nodes. You can recover from errors, you can write less code, and you may find that LLMs find novel solutions to problems.

As we'll see later, it turns out this doesn't quite work. Let's dive one step deeper - with agents you've got this loop consisting of 3 steps:

Our initial context is just the starting event (maybe a user message, maybe a cron job firing, maybe a webhook, etc.), and we ask the LLM to choose the next step (tool) or to determine that we're done.
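Here's a rough sketch of that loop, under some loud assumptions: the `AgentEvent` and `NextStep` types, the `runAgent` function, and the single `add` tool are all hypothetical, and the "model" is a scripted stub so the example runs without calling any LLM. Executing the chosen tool in plain code and appending its result to the context is my assumption about how the pieces fit together. A real model call would be asynchronous; it's kept synchronous here for brevity.

```typescript
// Hypothetical shapes for illustration; none of these names come from a real library.
type AgentEvent = { type: string; data: unknown };

type NextStep =
  | { intent: "done"; message: string }
  | { intent: "add"; a: number; b: number };

// Stand-in for the LLM call: given everything so far, choose the next step.
type ChooseNextStep = (context: AgentEvent[]) => NextStep;

function runAgent(
  initialEvent: AgentEvent,
  chooseNextStep: ChooseNextStep,
  maxSteps = 10,
): AgentEvent[] {
  // The initial context is just the starting event.
  const context: AgentEvent[] = [initialEvent];

  for (let i = 0; i < maxSteps; i++) {
    // Ask the model for the next step (a tool) or for "done".
    const step = chooseNextStep(context);
    context.push({ type: "step_chosen", data: step });

    if (step.intent === "done") return context;

    // Plain deterministic code runs the tool; the result goes back into the context.
    const result = step.a + step.b;
    context.push({ type: "tool_result", data: result });
  }

  // Safety valve so a confused model cannot loop forever.
  context.push({ type: "error", data: "max steps reached" });
  return context;
}

// Scripted stub: call the tool once, then report the answer.
const scriptedModel: ChooseNextStep = (context) => {
  const last = context[context.length - 1];
  if (last.type === "tool_result") {
    return { intent: "done", message: "The answer is " + String(last.data) };
  }
  return { intent: "add", a: 2, b: 3 };
};

const finalContext = runAgent(
  { type: "user_message", data: "what is 2 + 3?" },
  scriptedModel,
);
console.log(finalContext.map((event) => event.type).join(", "));
```

With the scripted stub, the context ends up holding the user message, the chosen `add` step, the tool result, and the final `done` step. Swap the stub for a real model call and the surrounding code stays the same, which is why most of an "agent" is still just software.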

At the end of the day, this approach just doesn't work as well as we want it to. In building HumanLayer, I've talked to at least 100 SaaS builders (mostly technical founders) looking to make their existing product more agentic.

The journey usually goes something like:

DISCLAIMER: I'm not going to talk about MCP. I'm sure you can see where it fits in.

DISCLAIMER 2: I'm using mostly TypeScript, for reasons, but all this stuff works in Python or any other language you prefer.

Design Patterns for great LLM applications

The fastest way I've seen for builders to get good AI software in the hands of customers is to take small, modular concepts from agent building and incorporate them into their existing product.