Budgets
Agentic runs consume three kinds of resources: money (API calls cost dollars), time (agents can run for hours), and tokens (context windows fill up). Controlling these is harder than it sounds, because limits need to be set at different times by different people:
- At build time, the hank author shapes the internal budget — which steps are expensive, which are critical, how much of the total budget each phase should get.
- At run time, the operator sets the envelope — “I have $5 and 30 minutes for this.”
This gets more complicated with loops (how much budget does each iteration get? when should the loop stop?), and even more complicated on resume (if $6 was already spent and 45 minutes have elapsed, what’s left?).
Hankweave handles all of it. Authors and operators each express their preferences declaratively, and the runtime resolves the effective limits — merging sources, distributing across codons, tracking consumption, and enforcing limits in real time. You see exactly what will be enforced before any tokens are spent.
Try It Now
Before anything else — you can add a budget safety net to any hank without changing the hank file:
# Cap this run at 50 cents, use the cheapest model
hankweave hank.json data/ --max-cost 0.50 -m haiku
This catches configuration errors, broken prompts, and workflow issues for pennies. When the structure works, remove the overrides and run for real.
For a time-limited pilot:
hankweave hank.json data/ --max-cost 2.00 --max-time 300
No hank.json changes needed. The operator always gets the last word on resource limits.
What Can You Say With a Budget?
As the hank author
You understand the hank’s internal structure. You can say:
- “This hank should cost about $12 total.” A global ceiling for the entire run.
- “Give research 15% and the dev loop 70% of whatever budget is available.” Proportional allocation — you shape the distribution without hardcoding dollar amounts.
- “This codon must not cost more than $3, ever.” A hard per-codon cap, independent of the global budget.
- “Terminate this loop after $5.” A loop-level pool shared across all iterations — making the loop variable-length.
- “If this codon hits its budget, that’s a failure; don’t pretend it succeeded.” An onExceeded: "fail" policy for all-or-nothing steps.
- “This codon should produce bounded output.” A maxOutputTokens cap for summaries and configs.
- “Don’t let context grow past 100K tokens.” A maxContextTokens cap to prevent quality erosion.
- “This codon shouldn’t take more than 5 minutes.” A maxTimeSeconds wall-clock cap.
As the operator
You don’t need to understand the hank’s internals:
- “I don’t want this run costing more than $5.” --max-cost 5.00
- “I don’t want this run taking more than 30 minutes.” --max-time 1800
- “Give me a cheap pilot run.” --max-cost 0.50 -m haiku
- “See what budgets are active before I commit tokens.” --validate
How preferences compose
The author sets the shape (which steps get how much). The operator sets the envelope (total available). Hankweave takes the tighter constraint at every level:
If the author says $12 and the operator says $5, the run gets $5 — but the author’s proportional shares still apply within that envelope. The ratios are preserved; the absolute numbers shrink.
The operator can tighten, never loosen. This preserves the author’s intent — if a hank says $5, the runner can’t accidentally make it $50.
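The composition rule can be sketched in a few lines of Python. This is an illustrative model only, not Hankweave's implementation: the function name compose and its arguments are hypothetical, and it assumes proportional allocation with the author's shares.

```python
from typing import Optional


def compose(author_max: Optional[float],
            operator_max: Optional[float],
            shares: dict[str, float]) -> tuple[Optional[float], dict[str, float]]:
    """Return the effective ceiling and each child's dollar share of it.

    Hypothetical helper: the tighter of the two ceilings wins, and the
    author's fractions are applied to whatever ceiling results.
    """
    limits = [m for m in (author_max, operator_max) if m is not None]
    if not limits:
        return None, {}  # nothing set anywhere: uncapped
    ceiling = min(limits)
    return ceiling, {child: round(frac * ceiling, 2) for child, frac in shares.items()}


shares = {"research": 0.15, "dev-loop": 0.70, "final-review": 0.15}

# Author says $12, operator says $5: the run gets $5 and the ratios are preserved.
print(compose(12.0, 5.0, shares))
# Operator passes a looser limit than the author: the author's $12 still holds.
print(compose(12.0, 50.0, shares))
```

Taking the minimum is what makes "tighten, never loosen" hold: a larger operator value simply never wins.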
Expressing Budgets
Every use case above maps to a budget object in configuration. Here are the common patterns, from simplest to most sophisticated.
Simple safety net
You know roughly what a run should cost. Catch runaways.
{
"overrides": {
"budget": { "maxDollars": 5.0 }
},
"hank": [
{
"id": "plan",
"model": "sonnet",
"continuationMode": "fresh",
"promptFile": "./prompts/plan.md"
},
{
"id": "execute",
"model": "sonnet",
"continuationMode": "fresh",
"promptFile": "./prompts/execute.md"
},
{
"id": "review",
"model": "haiku",
"continuationMode": "fresh",
"promptFile": "./prompts/review.md"
}
]
}
No allocation is specified, so it defaults to "shared". All three codons draw from a $5 pool in order. If plan costs $0.80 and execute costs $3.50, review has $0.70 left.
Shaped allocation
You know the structure of your work and want to allocate accordingly.
{
"overrides": {
"budget": {
"maxDollars": 12.0,
"maxTimeSeconds": 1800,
"allocation": "proportional",
"shares": {
"research": 0.15,
"dev-loop": 0.7,
"final-review": 0.15
}
}
},
"hank": [
{
"id": "research",
"model": "sonnet",
"continuationMode": "fresh",
"promptFile": "./prompts/research.md"
},
{
"type": "loop",
"id": "dev-loop",
"terminateOn": { "type": "iterationLimit", "limit": 5 },
"budget": { "maxTimeSeconds": 900 },
"codons": [
{
"id": "implement",
"model": "sonnet",
"continuationMode": "fresh",
"promptFile": "./prompts/implement.md",
"budget": { "maxDollars": 3.0 }
},
{
"id": "test",
"model": "haiku",
"continuationMode": "fresh",
"promptFile": "./prompts/test.md"
}
]
},
{
"id": "final-review",
"model": "opus",
"continuationMode": "fresh",
"promptFile": "./prompts/review.md",
"budget": { "onExceeded": "fail" }
}
]
}
What this says:
- research gets 15% of $12 = $1.80. If it finishes cheap, savings flow to later codons.
- dev-loop gets 70% = $8.40, with a 15-minute time cap. Within the loop, all iterations share the pool.
- implement has a $3 per-iteration hard cap — one iteration can’t blow the whole loop pool.
- test has no cap — uses whatever’s left in the loop pool for that iteration.
- final-review gets 15% = $1.80 with onExceeded: "fail". If it can’t finish within budget, that’s a real failure.
Budget-driven loops
Use loop budgets to say “iterate as much as you can afford”:
{
"type": "loop",
"id": "refine",
"terminateOn": { "type": "iterationLimit", "limit": 20 },
"budget": { "maxDollars": 3.0 },
"codons": [
{
"id": "improve",
"model": "sonnet",
"continuationMode": "continue-previous",
"promptFile": "./prompts/improve.md"
}
]
}
The iteration limit is a safety valve (20 max), but the real control is the $3 budget. The loop runs as many iterations as it can afford. This makes loops variable-length: the number of iterations is driven by cost, not a fixed count.
Budget exhaustion is a loop termination condition, alongside iterationLimit and contextExceeded. When a loop’s budget is exhausted, remaining codons in the current iteration are removed and execution continues after the loop.
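To make the "iterate as much as you can afford" behavior concrete, here is a small simulation. It is a simplified sketch under stated assumptions: every iteration costs the same, only full iterations are counted, and the function name run_budgeted_loop is hypothetical. A real run can also be cut off mid-iteration.

```python
def run_budgeted_loop(loop_budget: float, iteration_limit: int,
                      cost_per_iteration: float) -> tuple[int, str]:
    """Count the full iterations that fit in the pool and say why the loop stopped."""
    spent = 0.0
    completed = 0
    while completed < iteration_limit:
        if spent + cost_per_iteration > loop_budget:
            # The next iteration would overrun the shared loop pool.
            return completed, "budget exceeded"
        spent += cost_per_iteration
        completed += 1
    # The safety valve fired before the money ran out.
    return completed, "iterationLimit"


# $3 pool, 20-iteration safety valve, as in the refine loop above.
print(run_budgeted_loop(3.0, 20, 0.40))  # expensive iterations: budget stops the loop
print(run_budgeted_loop(3.0, 20, 0.10))  # cheap iterations: the iteration limit stops it
```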
Critical step with fail policy
Some codons produce artifacts that are useless if incomplete:
{
"id": "generate-report",
"model": "opus",
"continuationMode": "fresh",
"promptFile": "./prompts/report.md",
"budget": { "maxDollars": 5.0, "onExceeded": "fail" },
"onFailure": "abort"
}
If the report can’t finish within $5, the hank stops rather than producing a partial report.
Output and context guards
For bounded tasks or quality-sensitive codons:
{
"id": "summarize",
"model": "haiku",
"continuationMode": "fresh",
"promptFile": "./prompts/summarize.md",
"budget": { "maxOutputTokens": 10000 }
}
{
"id": "deep-analysis",
"model": "sonnet",
"continuationMode": "fresh",
"promptFile": "./prompts/analyze.md",
"budget": { "maxContextTokens": 100000 }
}
How Budgets Are Resolved
When you run --validate or start a run, all budget declarations are resolved into concrete per-codon limits and printed as a table:
Budget
───────────────────────────────────────────────────────
Global ceiling: $12.00 (hank)
Time limit:     1800s (hank)
Allocation:     proportional (unspent flows to later codons)
Codon          Model   Max Dollars             Max Time     On exceeded
─────          ─────   ───────────             ────────     ───────────
research       Sonnet  $1.80 (15% of $12.00)   —            completes
dev-loop               $8.40 (70% of $12.00)   900s (loop)
├─ implement   Sonnet  ≤ $3.00 (codon cap)     —            completes
└─ test        Haiku   ≤ loop pool             —            completes
final-review   Opus    $1.80 (15% of $12.00)   —            ⚠ fails run
This table answers the key questions: What budgets are active? Where do they come from? Which codons might run out of money (model name matters: $1 on Haiku is very different from $1 on Opus)? What happens when they do?
When the operator adds --max-cost 5.00, the table updates to show $5.00 (--max-cost, hank wanted $12.00) and the scaled allocations.
The resolution chain
Resolution happens just before each codon starts:
Step 1: Effective ceiling. min(CLI, runtime config, hank overrides) — tightest wins.
Step 2: Remaining pool. effectiveDollars - sum(completed codon costs). At the loop level, the pool is scoped to the loop’s own budget, itself capped by its parent’s allocation.
Step 3: Allocate to the codon. Depends on allocation mode: shared gives the full remaining pool; proportional gives the share fraction bounded by remaining pool; proportional-strict gives the share fraction with no redistribution of savings.
Step 4: Apply codon hard cap. min(allocated, codon cap). A codon’s own maxDollars always wins over a generous allocation.
Step 5: Undefined = uncapped. No limit from any source means no limit enforced.
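The five steps can be walked through in code. The following is a minimal sketch, not the runtime's actual resolver: the function name and parameters are hypothetical, ceilings is assumed to be ordered as (CLI, runtime config, hank overrides), and redistribution of savings between proportional codons is deliberately left out.

```python
from typing import Optional


def resolve_codon_limit(ceilings: list[Optional[float]],
                        completed_costs: list[float],
                        mode: str = "shared",
                        share: Optional[float] = None,
                        codon_cap: Optional[float] = None) -> Optional[float]:
    """Return the dollar limit for the next codon, or None if uncapped."""
    # Step 1: the effective ceiling is the tightest source that is set.
    defined = [c for c in ceilings if c is not None]
    ceiling = min(defined) if defined else None

    allocated: Optional[float] = None
    if ceiling is not None:
        # Step 2: remaining pool after the codons that already finished.
        remaining = max(ceiling - sum(completed_costs), 0.0)
        # Step 3: allocate by mode. Both proportional modes start from
        # share * ceiling; carrying savings forward is not modeled here.
        if mode == "shared":
            allocated = remaining
        elif mode in ("proportional", "proportional-strict"):
            if share is None:
                raise ValueError("this sketch needs an explicit share")
            allocated = min(share * ceiling, remaining)
        else:
            raise ValueError(f"unknown allocation mode: {mode}")

    # Step 4: a codon's own cap always wins over a generous allocation.
    if codon_cap is not None:
        allocated = codon_cap if allocated is None else min(allocated, codon_cap)

    # Step 5: still None means no source set a limit, so nothing is enforced.
    return allocated


# CLI --max-cost 5, no runtime config, hank says 12; plan and execute are done.
print(resolve_codon_limit([5.0, None, 12.0], [0.80, 3.50]))
# Hank ceiling 12, 70% share, $1 already spent, codon capped at $3.
print(resolve_codon_limit([None, None, 12.0], [1.0], "proportional", 0.7, 3.0))
```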
The Four Currencies
| Currency | Field | Level | What it caps |
|---|---|---|---|
| Dollars | maxDollars | codon, loop, hank | Total API cost in USD. Fungible across codons. |
| Time | maxTimeSeconds | codon, loop, hank | Wall-clock seconds. Watchdog — when time’s up, whoever’s running gets stopped. |
| Output tokens | maxOutputTokens | codon only | Tokens generated by the model. Safety cap for bounded tasks. |
| Context tokens | maxContextTokens | codon only | High-water mark of context window fill (input+output per turn). Quality guard. |
Dollars (maxDollars)
The primary currency. Dollars are fungible — a dollar not spent by one codon is available to the next. Requires the model to be in the LLM provider registry for pricing. If the model can’t be priced, preflight warns you.
Time (maxTimeSeconds)
Wall-clock seconds. Can’t be allocated proportionally — codons execute sequentially, and “give the planning step 10% of the hour” isn’t how anyone thinks. Time is always a shared watchdog at the container level, and a hard cap at the codon level.
On resume, elapsed time from prior runs is accounted for.
Output Tokens (maxOutputTokens)
Codon-level only. Codons flush context and start fresh, so a hank-level output token limit across unrelated codons isn’t useful. Use for bounded tasks: “generate a summary — if it’s producing more than 10K tokens, something went wrong.”
Context Tokens (maxContextTokens)
Codon-level only. Tracks the high-water mark of inputTokens + outputTokens per turn — the actual context window fill level. Useful for:
- Quality control: Some models degrade past certain context lengths.
- Verbosity control: Model-agnostic limit on how long a codon runs — 100K tokens is 100K tokens whether you’re on Haiku or Opus.
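The high-water mark is the peak of inputTokens + outputTokens over a codon's turns, not a running total. A rough sketch of that bookkeeping follows; the function name and the turn tuples are made up for illustration, and treating a breach as strictly greater than the cap is an assumption.

```python
from typing import Iterable, Optional


def context_high_water(turns: Iterable[tuple[int, int]],
                       max_context_tokens: Optional[int] = None) -> tuple[int, bool]:
    """Return the peak (input + output) tokens seen and whether the cap was breached."""
    peak = 0
    for input_tokens, output_tokens in turns:
        peak = max(peak, input_tokens + output_tokens)
        if max_context_tokens is not None and peak > max_context_tokens:
            return peak, True  # the codon would be stopped at this turn
    return peak, False


# Three turns of a growing conversation: (inputTokens, outputTokens) per turn.
turns = [(20_000, 1_500), (60_000, 4_000), (98_000, 3_000)]
print(context_high_water(turns, max_context_tokens=100_000))
print(context_high_water(turns))  # no cap set: only the peak is reported
```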
Allocation Modes
When a container (hank or loop) has a maxDollars budget, allocation controls how it’s distributed.
| | shared | proportional | proportional-strict |
|---|---|---|---|
| Pool | $10 shared | $10 divided by shares | $10 divided by shares |
| Codon A (20%) | spends $2 from pool | gets $2 (20%), spends $1 | gets $2 (20%), spends $1 |
| Codon B (60%) | spends $6 from pool | gets $6 (60%) | gets $6 (60%) |
| Codon C (20%) | gets remaining $2 | gets $2 + $1 savings = $3 | gets $2 (savings evaporate) |
"shared" (default)
First come, first served. All children draw from one pool in execution order. Early codons that finish cheap leave more room for later ones. The risk is that a runaway early codon can starve everything after it.
"proportional"
Pre-allocates shares. Named children get their fraction of maxDollars; unnamed children split the remainder evenly. Generous: if a codon finishes under budget, savings flow back to the pool.
Requires maxDollars on the container: without a total to proportion, there’s nothing to divide.
"proportional-strict"
Same shares, but unspent budget evaporates. Use for benchmarking per-codon costs or enforcing strict partitioning. Most of the time, the generous version is what you want.
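The Codon C column of the comparison table can be reproduced with a short model. Treat this as one possible reading rather than the runtime's algorithm: the text says only that savings flow to later codons, so the sketch assumes unspent allocation is handed to the next codon in execution order. The function name starting_limits is hypothetical.

```python
def starting_limits(pool: float, shares: dict[str, float],
                    spends: dict[str, float], mode: str) -> dict[str, float]:
    """Return the dollar limit each codon starts with, in execution order."""
    limits: dict[str, float] = {}
    remaining = pool
    carried = 0.0  # unspent allocation handed forward (proportional only)
    for codon, fraction in shares.items():
        if mode == "shared":
            limit = remaining
        elif mode == "proportional":
            limit = min(fraction * pool + carried, remaining)
        elif mode == "proportional-strict":
            limit = min(fraction * pool, remaining)
        else:
            raise ValueError(f"unknown allocation mode: {mode}")
        limits[codon] = limit
        spent = min(spends.get(codon, 0.0), limit)
        remaining -= spent
        carried = limit - spent if mode == "proportional" else 0.0
    return limits


shares = {"A": 0.2, "B": 0.6, "C": 0.2}
# Shared column: A spends $2 and B spends $6, so C starts with what is left.
print(starting_limits(10.0, shares, {"A": 2.0, "B": 6.0}, "shared")["C"])
# Proportional columns: A spends only $1 of its $2 share.
for mode in ("proportional", "proportional-strict"):
    print(mode, starting_limits(10.0, shares, {"A": 1.0, "B": 6.0}, mode)["C"])
```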
What Happens When a Budget Is Hit
When any limit is hit, the runtime kills the agent process. The onExceeded policy decides the outcome:
| Policy | Behavior | Use when |
|---|---|---|
| "complete" (default) | Codon marked completed. Downstream codons still run. | Best-effort work where partial output is useful. |
| "fail" | Codon marked failed. onFailure policy kicks in. | All-or-nothing work where partial output is worse than no output. |
Set at the hank level (default for all codons) or per-codon (overrides hank default).
Combining onExceeded: "fail" with onFailure: "retry" means the retry gets the same budget allocation and will likely fail again. Preflight warns about this.
Budgets in Loops
Two mechanisms, two levels:
- Loop-level budget: A pool shared across all iterations. When exhausted, the loop terminates — even mid-iteration. This enables budget-driven variable loops.
- Codon-level caps inside loops: Applied fresh each iteration. maxDollars: 1.00 means $1 per iteration, not $1 total. The loop budget controls the aggregate.
"shared" is almost always right inside a loop. You usually don’t know how many iterations will run, so proportional allocation within a loop is rarely useful.
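The interaction between the two levels can be modeled as follows. This is a sketch under simplifying assumptions (one capped codon per iteration, hypothetical function name, illustrative costs): the codon cap resets every iteration, while the loop pool only ever shrinks.

```python
def iteration_limits(loop_budget: float, codon_cap: float,
                     iteration_costs: list[float]) -> list[float]:
    """Return the dollar limit the capped codon starts each iteration with."""
    remaining = loop_budget
    limits = []
    for cost in iteration_costs:
        if remaining <= 0:
            break  # loop pool exhausted: the loop terminates
        limit = min(codon_cap, remaining)  # cap is applied fresh each iteration
        limits.append(limit)
        remaining -= min(cost, limit)  # spending is cut off at the limit
    return limits


# $2.50 loop pool, $1.00 per-iteration cap, five attempted iterations.
print(iteration_limits(2.5, 1.0, [0.75, 1.5, 0.5, 0.5, 0.5]))
```

The second iteration tries to spend $1.50 but is held to the $1.00 cap, and later iterations are squeezed by the shrinking pool rather than by the cap.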
Resume Semantics
Budget accounting picks up where it left off:
- Dollars: Hydrated from prior run costs in the state file. $6 spent + $10 budget = $4 remaining.
- Time: Hydrated from prior run timestamps. 45 minutes elapsed + 60-minute budget = 15 minutes remaining.
- Config: Re-read from the current hank file. Increase maxDollars between runs and it takes effect immediately.
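The resume arithmetic is a subtraction against whatever the current config says. A minimal sketch, with a hypothetical helper name and the figures from the bullets above:

```python
from typing import Optional


def remaining_after_resume(budget: Optional[float], consumed: float) -> Optional[float]:
    """Return what is left of a budget after prior runs, or None if uncapped."""
    if budget is None:
        return None
    return max(budget - consumed, 0.0)  # never report a negative remainder


# Dollars: $10 budget, $6 already spent in prior runs.
print(remaining_after_resume(10.0, 6.0))
# Time: 60-minute budget, 45 minutes already elapsed (both in seconds).
print(remaining_after_resume(60 * 60, 45 * 60))
# Config is re-read on resume: raising maxDollars to 15 leaves more headroom.
print(remaining_after_resume(15.0, 6.0))
```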
Preflight Validation
--validate catches budget misconfigurations before any tokens are spent.
Errors (block execution)
| Error | Cause |
|---|---|
| Shares sum to more than 1.0 | Can’t allocate more than 100% of the budget. |
| Shares reference unknown IDs | A share key doesn’t match any codon or loop ID. |
| Shares without proportional allocation | shares set but allocation is "shared". The shares do nothing. |
| Proportional allocation without maxDollars | Nothing to proportion. |
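The four blocking checks are simple enough to express directly. The sketch below only approximates what --validate does: the function name, the dict-based budget shape, and the small floating-point tolerance are assumptions for illustration, and the messages are taken from the table above.

```python
def budget_errors(budget: dict, child_ids: set[str]) -> list[str]:
    """Return the blocking errors for one container's budget object."""
    errors = []
    shares = budget.get("shares") or {}
    allocation = budget.get("allocation", "shared")  # "shared" is the default

    # Tolerance guards against float noise in sums such as 0.1 + 0.2 + 0.7.
    if shares and sum(shares.values()) > 1.0 + 1e-9:
        errors.append("Shares sum to more than 1.0")
    unknown = sorted(set(shares) - child_ids)
    if unknown:
        errors.append("Shares reference unknown IDs: " + ", ".join(unknown))
    if shares and allocation == "shared":
        errors.append("Shares without proportional allocation")
    if allocation != "shared" and "maxDollars" not in budget:
        errors.append("Proportional allocation without maxDollars")
    return errors


# Shares set while allocation is left at its default.
print(budget_errors({"maxDollars": 10.0, "shares": {"step-1": 0.3, "step-2": 0.7}},
                    {"step-1", "step-2"}))
# Over-allocated shares, an unknown ID, and no maxDollars to divide.
print(budget_errors({"allocation": "proportional",
                     "shares": {"a": 0.8, "b": 0.5, "ghost": 0.1}},
                    {"a", "b"}))
```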
Warnings (inform but don’t block)
| Warning | What it means |
|---|---|
| Codon cap exceeds its allocation | maxDollars: 5.00 but only gets a $1.00 share — cap is unreachable. |
| Loop budget exceeds hank allocation | Same idea at the loop level. |
| Unallocated budget with no absorbers | Shares sum to 60%, no unshared codons to use the remaining 40%. |
| exhaustWithPrompt + time budget | Time enforcement may interrupt the extension before it completes. |
| Model without pricing data | Dollar budget set but model can’t be priced. Cost tracking will be inaccurate. |
| Zero effective budget | Shares sum to 100% and this codon has no share — starts at $0, immediately exceeds. |
| onExceeded: "fail" + onFailure: "retry" | Budget-exceeded retries will likely fail again with the same budget. |
| Fail-policy codon late in shared pool | Earlier codons may exhaust the pool before this codon runs. |
| Codon time cap exceeds loop time cap | The codon cap will never be reached. |
Events
Budget information appears in the event stream:
| Event | When | Data |
|---|---|---|
| codon.completed | Codon force-completed by budget | budgetExceeded: { currency, limit, used } |
| budget.summary | End of run | Per-codon table: budget vs. actual usage |
| info | Budget breach | Human-readable message with source attribution |
| loop.iteration.completed | Loop terminates due to budget | Reason: "budget exceeded" |
Common Mistakes
Shares without setting allocation mode
{
"overrides": {
"budget": { "maxDollars": 10.0, "shares": { "step-1": 0.3, "step-2": 0.7 } }
}
}
Preflight error. The default is "shared", which ignores shares. Fix: add "allocation": "proportional".
Expecting codon caps to be cumulative in loops
"maxDollars": 1.0 on a codon in a loop means $1 per iteration, not $1 total. For cumulative control, set the budget on the loop.
Dollar budget on an unpriced model
Cost tracking reports $0 forever and the budget never triggers. Use maxTimeSeconds, maxOutputTokens, or maxContextTokens instead.
Field Reference
Hank Overrides / Loop budget
| Field | Type | Default | Description |
|---|---|---|---|
| maxDollars | number | none | Total cost budget in USD. |
| maxTimeSeconds | number | none | Wall-clock time limit in seconds. |
| allocation | "shared" \| "proportional" \| "proportional-strict" | "shared" | How the dollar budget is distributed among children. |
| shares | Record<string, number> | none | Map of child IDs to fractions (0–1). Requires proportional allocation. |
| onExceeded | "complete" \| "fail" | "complete" | Default policy when a budget limit is hit. |
Codon budget
| Field | Type | Default | Description |
|---|---|---|---|
| maxDollars | number | none | Max cost in USD for this execution. |
| maxTimeSeconds | number | none | Max wall-clock time in seconds. |
| maxOutputTokens | number | none | Max output tokens (model-generated). |
| maxContextTokens | number | none | Max context window tokens (high-water mark). |
| onExceeded | "complete" \| "fail" | inherited | Override container default. |
CLI Flags
| Flag | Description |
|---|---|
| --max-cost <dollars> | Global cost ceiling. Highest priority. |
| --max-time <seconds> | Global time ceiling. Highest priority. |
Runtime Config (hankweave.json)
| Field | Type | Description |
|---|---|---|
| budget.maxDollars | number | Global cost ceiling. Overridden by CLI. |
| budget.maxTimeSeconds | number | Global time ceiling. Overridden by CLI. |
Related Pages
- Performance and Cost Tracking: How Hankweave tracks and reports costs
- Loops: The iteration primitive, including budget-driven termination
- Codons: The atomic unit where limits are enforced
- Configuration Reference: Full budget schema in all config locations
- CLI Reference: --max-cost, --max-time, and --validate