
Tracked LLM Calls

Use kitaru.llm() with model aliases, transported runtime config, and optional secret-backed credentials

kitaru.llm() lets you make a single tracked model call with automatic:

  • prompt artifact capture
  • response artifact capture
  • usage/latency metadata logging

If you want the full setup path from stored credentials to an actual flow run, start with Secrets + Model Registration.

Model selection order

When you call kitaru.llm(), Kitaru resolves the model in this order:

  1. the explicit model= argument
  2. KITARU_DEFAULT_MODEL
  3. the default alias from the effective model registry in the current environment

If KITARU_DEFAULT_MODEL matches a registered alias, Kitaru resolves that alias. Otherwise it treats the value as a raw provider/model string.

When you submit or replay a flow, Kitaru automatically transports your local model registry into the execution environment. That means remote runs can still resolve aliases with kitaru.llm() and kitaru model list. If KITARU_MODEL_REGISTRY is already set in the runtime environment, its aliases and default alias take precedence over matching local entries.

Register a model alias

kitaru model register fast --model openai/gpt-5-nano --secret openai-creds

You can also register an alias without a linked secret:

kitaru model register fast --model openai/gpt-5-nano

List aliases with:

kitaru model list

kitaru model register writes aliases to local Kitaru config, but submitted and replayed runs automatically receive that registry as a transported runtime snapshot. KITARU_MODEL_REGISTRY is available as an advanced manual override for adding aliases or overriding matching ones.

Supported providers

Built-in runtime support covers:

  • openai/* — OpenAI models (requires kitaru[openai])
  • anthropic/* — Anthropic models (requires kitaru[anthropic])
  • ollama/* — local Ollama models (requires kitaru[openai], no API key needed)
  • openrouter/* — OpenRouter meta-router (requires kitaru[openai])

Ollama and OpenRouter use the OpenAI-compatible API, so they share the kitaru[openai] extra — no additional packages needed.

Credential resolution order

For built-in providers that require credentials (OpenAI, Anthropic, OpenRouter), Kitaru resolves credentials in this order:

  1. provider credentials already present in the environment
  2. the secret linked to the resolved alias
  3. if neither is available, fail with a setup error

That means environment variables win over a linked secret for known providers.

Ollama does not require credentials (local server). Use OLLAMA_HOST to point to a non-default server address (default: http://localhost:11434).

Environment-backed setup

export OPENAI_API_KEY=sk-...

Secret-backed setup

Store provider keys in a Kitaru secret:

kitaru secrets set openai-creds --OPENAI_API_KEY=sk-...

When an alias includes --secret openai-creds, kitaru.llm() loads that secret at runtime if the required environment variable is not already set.

Call kitaru.llm() inside a flow

from kitaru import flow
import kitaru

@flow
def writer(topic: str) -> str:
    outline = kitaru.llm(
        f"Create a 3-bullet outline about {topic}.",
        model="fast",
        name="outline_call",
    )
    outline_text = outline.load()
    return kitaru.llm(
        f"Write a short paragraph using this outline:\n{outline_text}",
        model="fast",
        name="draft_call",
    )

Flow-body kitaru.llm() calls are durable call boundaries. Use .load() when you need the text in flow-body Python, such as composing the next prompt. When you pass a checkpoint or LLM output into a downstream checkpoint, pass the original output handle rather than the loaded text. See In flow bodies for the general pattern.

Advanced options

kitaru.llm() also accepts system=, temperature=, and max_tokens=:

reply = kitaru.llm(
    "Summarize this document in 3 bullets.",
    model="fast",
    system="You are a concise technical editor.",
    temperature=0.2,
    max_tokens=200,
    name="summary_call",
)

Chat-style message lists

Instead of a plain string, you can pass a chat-style message list:

reply = kitaru.llm(
    [
        {"role": "user", "content": "Draft a release note headline."},
        {"role": "assistant", "content": "Kitaru adds durable replay controls."},
        {"role": "user", "content": "Now make it shorter."},
    ],
    model="fast",
    name="headline_refine",
)

Each message must include role and content keys. If system= is provided alongside a message list, Kitaru prepends a system message automatically.
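That normalization can be sketched as follows; `build_messages` is an illustrative stand-in, not Kitaru's internal API.

```python
def build_messages(prompt, system=None):
    """Normalize a plain string or chat-style list into a message list,
    prepending a system message when system= is given."""
    if isinstance(prompt, str):
        messages = [{"role": "user", "content": prompt}]
    else:
        # each message must include role and content keys
        for m in prompt:
            if "role" not in m or "content" not in m:
                raise ValueError("each message needs 'role' and 'content' keys")
        messages = list(prompt)
    if system is not None:
        messages = [{"role": "system", "content": system}] + messages
    return messages

build_messages("Draft a headline.", system="Be terse.")
# -> [{"role": "system", "content": "Be terse."},
#     {"role": "user", "content": "Draft a headline."}]
```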

When to use kitaru.llm() vs your own client

kitaru.llm() is designed for simple text-in/text-out model calls. It handles credential resolution, prompt/response capture, and usage tracking automatically. Built-in runtime support covers openai/*, anthropic/*, ollama/*, and openrouter/* models.

kitaru.llm() requires a provider SDK to be installed. Install with pip install kitaru[openai] (also covers Ollama and OpenRouter), pip install kitaru[anthropic], or pip install kitaru[llm] for both.

For advanced patterns — tool calling, structured outputs, streaming, vision inputs, or multi-turn conversation management — use your provider SDK directly inside a @checkpoint. You still get durable checkpointing and replay; you just manage the model interaction yourself:

from openai import OpenAI
from kitaru import checkpoint

@checkpoint
def agent_step(messages: list[dict]) -> str:
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-5-nano",
        messages=messages,
        tools=[...],  # tool calling, structured output, etc.
    )
    return resp.choices[0].message.content

For a full example of a tool-calling agent built this way, see examples/coding_agent/.

Tool calling and structured output support for kitaru.llm() is on the roadmap. For now, use your provider SDK directly inside checkpoints for these patterns.

Runtime behavior by context

  • Inside a flow (outside checkpoints): kitaru.llm() runs as a synthetic durable call boundary.
  • Inside a checkpoint: it is tracked as a child event; the enclosing checkpoint remains the replay boundary.

What Kitaru records

Each kitaru.llm() call records:

  • prompt artifacts
  • response artifacts
  • token usage
  • latency
  • credential source metadata (environment or secret)

Example in this repository

uv sync --extra local --extra llm

# Register an alias (with or without a linked secret) before running the example.
uv run kitaru model register fast --model openai/gpt-5-nano
uv run examples/llm/flow_with_llm.py
uv run pytest tests/test_phase12_llm_example.py

If you want the full credential-backed setup path first, start with Secrets + Model Registration.

For the broader catalog, see Examples.
