Tracing

Flapjack automatically records a structured span-tree trace for every agent turn and runner run. A trace shows what happened during execution: LLM calls, tool invocations, MCP requests, condition evaluations, and costs.

Data Model

Trace (log_traces)
  └── Span (log_spans)        — one per semantic operation
        └── Evaluation (log_evaluations)  — async quality scores

Trace: Root record, one per agent message or runner run
Span: A single operation within a trace (LLM call, tool execution, etc.)
Evaluation: Optional quality scores attached to spans (heuristic, LLM-judge, or human)

Span Kinds

Kind	Description
`agent`	Top-level agent execution
`runner_step`	A single step in a runner pipeline
`llm`	LLM inference call
`tool`	Tool execution (webhook, custom, etc.)
`mcp`	MCP server tool call
`webhook`	Webhook invocation
`condition`	Condition evaluation in runner flows
`embedding`	Embedding/RAG lookup
`guardrail`	Guardrail check
`chain`	Multi-step chain

Span Fields

Each span includes:

Field	Description
`id`	Span UUID
`trace_id`	Parent trace UUID
`parent_span_id`	Parent span (null for root spans)
`kind`	Span kind (see above)
`name`	Human-readable label
`status`	`running`, `ok`, or `error`
`started_at` / `ended_at`	Timestamps
`input` / `output`	JSONB payloads (truncated to ~256 KB)
`error`	Error details if status is `error`

LLM spans additionally include: `provider`, `model`, `temperature`, `max_tokens`, `input_tokens`, `output_tokens`, `cached_input_tokens`, `cost_usd`.
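Because each span carries a `parent_span_id`, a flat list of spans can be turned back into a tree on the client. The following is a minimal sketch, not Flapjack's own code: the helper names `build_span_tree` and `render` are hypothetical, and the parent/child links in the sample data are assumed for illustration.

```python
from collections import defaultdict


def build_span_tree(spans):
    """Group spans by parent_span_id; return (root spans, children by parent id)."""
    children = defaultdict(list)
    roots = []
    for span in spans:
        parent = span.get("parent_span_id")
        if parent is None:
            roots.append(span)  # null parent marks a root span
        else:
            children[parent].append(span)
    return roots, children


def render(span, children, depth=0):
    """Return indented text lines for a span and all of its descendants."""
    lines = [f"{'  ' * depth}{span['kind']}: {span['name']} [{span['status']}]"]
    for child in children.get(span["id"], []):
        lines.extend(render(child, children, depth + 1))
    return lines


# Sample spans; the parent_span_id values here are assumptions.
spans = [
    {"id": "span-1", "parent_span_id": None, "kind": "agent",
     "name": "Support Agent", "status": "ok"},
    {"id": "span-2", "parent_span_id": "span-1", "kind": "llm",
     "name": "claude-sonnet-4-6", "status": "ok"},
    {"id": "span-3", "parent_span_id": "span-1", "kind": "mcp",
     "name": "github:list_issues", "status": "ok"},
]

roots, children = build_span_tree(spans)
tree_lines = [line for root in roots for line in render(root, children)]
print("\n".join(tree_lines))
```

Children keep the order in which they appear in the input list, so passing spans already sorted by execution order yields a tree in that order.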

Viewing Traces

List Traces

GET /api/logs

Parameter	Type	Description
`limit`	`number`	Max results (default 50, max 200)
`cursor`	`string`	Composite cursor (`<startedAt>|<traceId>`) for pagination
`agent_id`	`string`	Filter by agent
`runner_id`	`string`	Filter by runner
`thread_id`	`string`	Filter by thread
`run_id`	`string`	Filter by runner run
`status`	`string`	Filter by status
`q`	`string`	Substring search on trace name
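As a rough illustration of how these parameters combine, the sketch below builds a request URL for the list endpoint. The host name and the `list_traces_url` helper are hypothetical, and clamping `limit` on the client is an assumption; the source only states the maximum of 200, not how the server treats larger values.

```python
from urllib.parse import urlencode

BASE_URL = "https://flapjack.example.com"  # hypothetical host, replace with your own


def list_traces_url(base_url=BASE_URL, limit=50, **filters):
    """Build a GET /api/logs URL from the documented query parameters."""
    limit = max(1, min(limit, 200))  # stay within the documented maximum
    params = {"limit": limit}
    # Skip filters that were not provided.
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{base_url}/api/logs?{urlencode(params)}"


url = list_traces_url(limit=500, agent_id="agent-42", status="error", q="Agent turn")
print(url)
```

`urlencode` also percent-encodes the `|` separator when a `cursor` value is passed, so composite cursors can be supplied unmodified.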

Response:

{
  "traces": [
    {
      "id": "trace-123",
      "name": "Agent turn",
      "status": "ok",
      "started_at": "2026-05-10T12:00:00Z",
      "ended_at": "2026-05-10T12:00:03Z",
      "span_count": 5,
      "error_count": 0,
      "llm_call_count": 2,
      "tool_call_count": 1
    }
  ],
  "nextCursor": "2026-05-10T11:59:00Z|trace-122"
}
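Paging through results means passing each response's `nextCursor` back as the `cursor` parameter. The sketch below shows that loop with the HTTP call injected as a function, so it runs without a network. It assumes `nextCursor` is null or absent on the last page, which the source does not state; `iter_traces`, `fake_fetch`, and the stub pages are hypothetical.

```python
def iter_traces(fetch_page, **filters):
    """Yield traces across pages; fetch_page(params) returns the decoded JSON body."""
    cursor = None
    while True:
        params = dict(filters)
        if cursor:
            params["cursor"] = cursor
        page = fetch_page(params)
        yield from page.get("traces", [])
        cursor = page.get("nextCursor")
        if not cursor:  # assumed end-of-results signal
            break


# Stub pages standing in for real GET /api/logs responses.
PAGES = {
    None: {"traces": [{"id": "trace-123"}],
           "nextCursor": "2026-05-10T11:59:00Z|trace-122"},
    "2026-05-10T11:59:00Z|trace-122": {"traces": [{"id": "trace-122"}],
                                       "nextCursor": None},
}


def fake_fetch(params):
    """Look up the stub page for the requested cursor."""
    return PAGES[params.get("cursor")]


trace_ids = [t["id"] for t in iter_traces(fake_fetch, agent_id="agent-42")]
print(trace_ids)
```

In real use, `fetch_page` would issue the request with whatever HTTP client and authentication your deployment requires.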

Get Trace Detail

GET /api/logs/{traceId}

Returns the full trace with all spans (sorted by execution order) and evaluations:

{
  "trace": { "id": "trace-123", "name": "Agent turn", "status": "ok", "..." : "..." },
  "spans": [
    { "id": "span-1", "kind": "agent", "name": "Support Agent", "status": "ok", "..." : "..." },
    { "id": "span-2", "kind": "llm", "name": "claude-sonnet-4-6", "model": "claude-sonnet-4-6", "input_tokens": 1200, "output_tokens": 350, "cost_usd": 0.0084, "..." : "..." },
    { "id": "span-3", "kind": "mcp", "name": "github:list_issues", "status": "ok", "..." : "..." }
  ],
  "evaluations": []
}
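A trace detail response can be summarized on the client, for example to count spans per kind or to find the spans that failed. This is a minimal sketch over the response shape shown above; the `summarize` helper and the failing span in the sample are hypothetical.

```python
from collections import Counter


def summarize(detail):
    """Count spans by kind and collect the ids of spans with status 'error'."""
    spans = detail.get("spans", [])
    return {
        "by_kind": dict(Counter(s["kind"] for s in spans)),
        "errors": [s["id"] for s in spans if s.get("status") == "error"],
    }


# Sample detail payload; span-3 is marked as failed for illustration.
detail = {
    "trace": {"id": "trace-123", "name": "Agent turn", "status": "error"},
    "spans": [
        {"id": "span-1", "kind": "agent", "name": "Support Agent", "status": "ok"},
        {"id": "span-2", "kind": "llm", "name": "claude-sonnet-4-6", "status": "ok"},
        {"id": "span-3", "kind": "mcp", "name": "github:list_issues", "status": "error"},
    ],
    "evaluations": [],
}

summary = summarize(detail)
print(summary)
```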

Cost Attribution

Every LLM span records `cost_usd`, `input_tokens`, `output_tokens`, and `cached_input_tokens`. These roll up into the analytics endpoints for per-agent and per-runner cost tracking.
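The same fields can also be totalled on the client from a trace's spans. The sketch below groups LLM spans by `model`; it is not the analytics endpoints' implementation, and the `cost_by_model` helper and the second sample span's values are hypothetical.

```python
from collections import defaultdict

FIELDS = ("cost_usd", "input_tokens", "output_tokens", "cached_input_tokens")


def cost_by_model(spans):
    """Sum cost and token fields of LLM spans, keyed by model name."""
    totals = defaultdict(lambda: {field: 0 for field in FIELDS})
    for span in spans:
        if span.get("kind") != "llm":
            continue  # only LLM spans carry cost fields
        bucket = totals[span.get("model", "unknown")]
        for field in FIELDS:
            bucket[field] += span.get(field) or 0  # treat missing or null as 0
    return dict(totals)


spans = [
    {"kind": "agent", "name": "Support Agent"},
    {"kind": "llm", "model": "claude-sonnet-4-6",
     "input_tokens": 1200, "output_tokens": 350, "cost_usd": 0.0084},
    {"kind": "llm", "model": "claude-sonnet-4-6",
     "input_tokens": 400, "output_tokens": 100, "cached_input_tokens": 200,
     "cost_usd": 0.0021},
]

totals = cost_by_model(spans)
print(totals)
```

Costs are floating-point sums, so compare them with a tolerance rather than exact equality.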

Next Steps

API: Overview — endpoint reference
Runners — runner pipelines and cost analytics