Agent Engineering Lab · 12 modules

Twelve labs, one engineering story

Each lab is a focused playground for one concept in production AI agent engineering. Seven share the canonical credit-order-eligibility scenario at /tools/agent-lab, rendered as different lenses and deep-linked from the cards below. Four have dedicated routes because their material needs its own surface area. One is documentation only, because its topic is operational rather than visual.

  1. Foundations

    Messages, schemas, tools, the loop itself, and retrieval.

    5 labs
  2. Retrieval

    Why a single retrieval signal is rarely enough.

    1 lab
  3. Tooling

    Tool discovery and the workflow / agent boundary.

    2 labs
  4. Quality

    Replay, score, and regression-test agent behavior.

    1 lab
  5. Safety

    Human approval and policy outside the model.

    2 labs
  6. Operations

    What production deployment and observability look like.

    1 lab
Why this layout: Labs 2 / 3 / 4 / 5 / 9 / 10 / 11 are different lenses on a single, complete agent run, so they share one route and the canonical scenario; each card deep-links to the right lens via a query parameter. Labs 1 / 6 / 7 / 8 introduce concepts the canonical run does not exercise (the bare LLM protocol, alternative retrieval signals, the MCP tool manifest, and a deterministic-workflow comparison), so they live on sibling routes of their own. Lab 12 is a short operations doc because production observability and deployment are infrastructure topics, not a single visual demo.
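
The lab-to-surface mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the route /tools/agent-lab and the lab groupings come from the text, but the query parameter name (`lens`) and the lens values (`lab-2`, `lab-3`, ...) are hypothetical, since the page does not name them. Dedicated routes are left out because their paths are not given.

```python
from urllib.parse import urlencode

AGENT_LAB_ROUTE = "/tools/agent-lab"  # shared route named in the text
LENS_PARAM = "lens"                   # hypothetical query parameter name

# Lab groupings as stated in the layout note.
SHARED_LABS = frozenset({2, 3, 4, 5, 9, 10, 11})  # lenses on the canonical run
DEDICATED_LABS = frozenset({1, 6, 7, 8})          # sibling routes of their own
DOC_ONLY_LABS = frozenset({12})                   # operations doc


def surface_for(lab: int) -> str:
    """Return which kind of surface a lab is presented on."""
    if lab in SHARED_LABS:
        return "shared"
    if lab in DEDICATED_LABS:
        return "dedicated"
    if lab in DOC_ONLY_LABS:
        return "doc"
    raise ValueError(f"unknown lab number: {lab}")


def lens_link(lab: int) -> str:
    """Build the deep link a card would use for a shared-route lab.

    The lens value format ("lab-<n>") is an assumption for illustration.
    """
    if surface_for(lab) != "shared":
        raise ValueError(f"lab {lab} is not a lens on the shared route")
    query = urlencode({LENS_PARAM: f"lab-{lab}"})
    return f"{AGENT_LAB_ROUTE}?{query}"
```

Under these assumptions, `lens_link(9)` yields `/tools/agent-lab?lens=lab-9`, while `lens_link(1)` raises because Lab 1 has its own route. Keeping the groupings as data makes it easy to check that every one of the twelve labs lands on exactly one surface.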