State Machine

Every codon in Hankweave follows a deterministic path from start to finish. This isn't an implementation detail; it's a formal state machine with 8 states, 13 event types, and strict validation to prevent impossible transitions. Understanding this model is critical for building custom tooling or contributing to the runtime.

The Codon State Machine

When a codon runs, it moves through a fixed sequence of phases. The state machine enforces which transitions are legal: a codon can never move backward, and the only phase it may bypass on the way to completion is completing-sentinels.

The 8 States

The following table details each state and the data available within it.

State	Description	Data Available
`preparing`	Rig setup running (copy files, run commands)	`codonId`, `startTime`, `loopContext`
`starting`	Spawning agent process	+ `rigSetupCheckpoint`, `sentinels`
`initializing`	Process started, waiting for session ID	+ `claudePid`, `claudeLogPath`, `previousSessionId`
`running`	Agent is working	+ `claudeSessionId`, `currentCost`, `currentTokens`, `assistantMessageCount`
`completing-sentinels`	Draining sentinel queues	Same as running
`completed`	Success (terminal)	+ `finalCost`, `finalTokens`, `completionCheckpoint`, `exitCode: 0`
`failed`	Error (terminal)	+ `failureReason`, `failedDuring`, `partialCost`, `errorCheckpoint`
`skipped`	User skipped (terminal)	+ `skippedDuring`, `skipCheckpoint`

Valid Transitions

The state machine enforces a strict transition graph. An attempt to make an invalid transition throws an InvalidTransitionError.

export const CodonTransitions: Record<CodonStatus, CodonStatus[]> = {
  preparing:            ["starting", "failed", "skipped"],
  starting:             ["initializing", "failed", "skipped"],
  initializing:         ["running", "failed", "skipped"],
  running:              ["completing-sentinels", "completed", "failed", "skipped"],
  "completing-sentinels": ["completed", "failed", "skipped"],
  completed:            [], // Terminal - no transitions
  failed:               [], // Terminal - no transitions
  skipped:              [], // Terminal - no transitions
};

Normal execution flows forward. Any non-terminal state can transition to failed (on error) or skipped (on user intervention). Terminal states—completed, failed, and skipped—are final.

⚠️

Terminal States Are Final: You cannot transition out of completed, failed, or skipped. To retry a failed codon, you must start a new run.
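
A validity check against the transition table can be sketched as follows. The table is copied from above; the helper name `assertValidTransition` and the exact shape of `InvalidTransitionError` are assumptions made for illustration, not the runtime's actual code.

```typescript
type CodonStatus =
  | "preparing" | "starting" | "initializing" | "running"
  | "completing-sentinels"
  | "completed" | "failed" | "skipped";

// Same graph as the CodonTransitions table above.
const CodonTransitions: Record<CodonStatus, CodonStatus[]> = {
  preparing: ["starting", "failed", "skipped"],
  starting: ["initializing", "failed", "skipped"],
  initializing: ["running", "failed", "skipped"],
  running: ["completing-sentinels", "completed", "failed", "skipped"],
  "completing-sentinels": ["completed", "failed", "skipped"],
  completed: [],
  failed: [],
  skipped: [],
};

// Hypothetical error class; the real one may carry more context.
class InvalidTransitionError extends Error {
  constructor(from: CodonStatus, to: CodonStatus) {
    super(`Invalid transition from ${from} to ${to}`);
    this.name = "InvalidTransitionError";
  }
}

// Hypothetical helper: throws unless `to` is listed as a successor of `from`.
function assertValidTransition(from: CodonStatus, to: CodonStatus): void {
  if (!CodonTransitions[from].includes(to)) {
    throw new InvalidTransitionError(from, to);
  }
}
```

Because terminal states map to an empty array, any attempt to leave them fails the lookup without needing a special case.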

Why 8 States?

Eight states might seem like a lot, but each state captures a distinct phase where different things can go wrong. Knowing where something failed tells you why.

For example, a failure during preparing indicates a configuration problem, while a failure during starting suggests a process or environment issue. Likewise, a process can spawn successfully (starting) but fail to authenticate (initializing). These are different problems with different solutions.

The completing-sentinels state is necessary because agents and sentinels finish asynchronously. The agent might complete its work while sentinels still need time to drain their queues. Finally, the terminal states (completed, failed, skipped) carry distinct final data, such as costs and exit codes, that don't apply to active states.

When you see failedDuring: "initializing", you know immediately: the process started, but it never reported a session ID. That's specific enough to act on.

State Transition Events

State changes are not applied directly; they are triggered by events. When an event is emitted, the state manager validates it and applies the resulting transition atomically. This provides an audit trail and prevents invalid state changes.

The 13 Event Types

Every possible state change has a corresponding event type. Some trigger state transitions, while others update metadata like costs or checkpoints.

Event	Purpose	When Triggered
`RunStarted`	Creates new run entry	Server startup
`RunCompleted`	Marks run as successfully finished	Last codon completes
`RunFailed`	Marks run as failed	Codon fails, fatal error
`RunCrashed`	Marks run as crashed (detected on recovery)	Stale lock file
`CodonStarted`	Adds new codon to current run	Starting a codon
`CodonTransitioned`	Changes codon state	State machine transition
`CostsUpdated`	Sets absolute cost values	Token usage update
`CostsIncremented`	Adds delta to costs	Incremental cost tracking
`AssistantMessageCountUpdated`	Updates message counter	New assistant message
`CodonFinalCostSet`	Sets authoritative final cost	Claude result message
`SentinelStatesUpdated`	Updates sentinel tracking	Sentinel lifecycle events
`CheckpointCreated`	Records checkpoint SHA	Git commit created
`InitialCheckpointSet`	Records initial project state	First checkpoint

CodonTransitioned in Detail

The CodonTransitioned event handles most state changes. It carries the from and to states, plus any metadata required for the transition. For example, you can't transition to running without a claudeSessionId, or to completed without a checkpointSha.

{
  type: "CodonTransitioned";
  data: {
    runId: RunId;
    codonId: CodonId;
    from: CodonStatus;      // Current state
    to: CodonStatus;        // Target state
    metadata?: {
      // For starting → initializing
      claudePid?: number;
      claudeLogPath?: string;
      previousSessionId?: SessionId;
 
      // For initializing → running
      claudeSessionId?: SessionId;
 
      // For any → failed
      exitCode?: number;
      failureReason?: FailureReason;
      failedDuring?: CodonStatus;
 
      // For any → skipped
      skippedDuring?: CodonStatus;
 
      // For → completed
      resultMessageReceived?: boolean;
      checkpointSha?: string;
    };
  };
}

The state manager validates this required metadata using type guards:

// From state-transition-guards.ts
function validateTransitionMetadata(to: CodonStatus, metadata: unknown): void {
  switch (to) {
    case "initializing":
      // Requires claudePid, claudeLogPath
      if (!hasInitializingMetadata(metadata)) throw new MetadataValidationError(...);
      break;
    case "running":
      // Requires claudeSessionId
      if (!hasRunningMetadata(metadata)) throw new MetadataValidationError(...);
      break;
    case "completed":
      // Requires checkpointSha
      if (!hasCompletedMetadata(metadata)) throw new MetadataValidationError(...);
      break;
    case "failed":
      // Requires exitCode, failureReason, failedDuring
      if (!hasFailedMetadata(metadata)) throw new MetadataValidationError(...);
      break;
    case "skipped":
      // Requires skippedDuring
      if (!hasSkippedMetadata(metadata)) throw new MetadataValidationError(...);
      break;
  }
}
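
The guard functions themselves are not shown here. As a rough sketch, a guard such as `hasRunningMetadata` could be written as a TypeScript type predicate. The body below is an assumption, and `SessionId` is treated as a plain string for the sake of a self-contained example.

```typescript
// Assumption: SessionId is modeled as a plain string in this sketch.
type SessionId = string;

interface RunningMetadata {
  claudeSessionId: SessionId;
}

// Narrows an unknown value to an indexable object.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// One possible body for a guard like hasRunningMetadata:
// the metadata must be an object carrying a non-empty claudeSessionId.
function hasRunningMetadata(metadata: unknown): metadata is RunningMetadata {
  if (!isRecord(metadata)) return false;
  const sessionId = metadata["claudeSessionId"];
  return typeof sessionId === "string" && sessionId.length > 0;
}
```

Writing the guard as a type predicate means that, after a successful check, the compiler treats `metadata` as `RunningMetadata` with no cast required.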

The Fire-and-Forget Pattern

Here's something that might surprise you: state transitions are asynchronous and queued. When you call stateManager.transition(), it returns immediately—the transition gets queued and processed in the background.

Why Use This Pattern?

This queue-based approach solves several problems:

Non-blocking: The runtime doesn't block on disk writes during hot paths.
Ordered Execution: Events are processed in the order they were emitted, guaranteeing sequence.
Race Condition Prevention: Only one transition processes at a time.
Crash Safety: Each transition is persisted before the next one starts.

// From state-manager.ts
private transitionQueue: StateTransition[] = [];
private isProcessing = false;

// Public API - fire and forget!
transition(event: StateTransition): void {
  this.transitionQueue.push(event);
  // Don't await - let it run. `void` marks the promise as intentionally unawaited.
  void this.processQueue();
}

private async processQueue(): Promise<void> {
  if (this.isProcessing) return; // A drain loop is already running

  this.isProcessing = true;
  try {
    while (this.transitionQueue.length > 0) {
      const event = this.transitionQueue.shift();
      // Validate, apply, persist...
      await this.save();
    }
  } finally {
    // Always release the flag; otherwise a failed save() would stall the queue permanently
    this.isProcessing = false;
  }
}

Eventual Consistency: State is eventually consistent. If you call getState() immediately after transition(), you might not see the change yet. Use event listeners to react to state changes as they are applied.
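
The sketch below illustrates that timing with a deliberately tiny stand-in for the state manager. Everything in it is hypothetical, including the `onApplied` listener hook and the choice to apply a state only after the simulated write; it is meant to show why a read straight after `transition()` can be stale, not to mirror the real API.

```typescript
type Listener = (state: string) => void;

// Minimal stand-in for the state manager, for illustration only.
class TinyStateManager {
  private state = "preparing";
  private queue: string[] = [];
  private isProcessing = false;
  private listeners: Listener[] = [];

  getState(): string {
    return this.state;
  }

  // Hypothetical hook: called after each transition has been applied.
  onApplied(listener: Listener): void {
    this.listeners.push(listener);
  }

  transition(to: string): void {
    this.queue.push(to);
    void this.processQueue(); // fire and forget
  }

  private async processQueue(): Promise<void> {
    if (this.isProcessing) return;
    this.isProcessing = true;
    try {
      let next: string | undefined;
      while ((next = this.queue.shift()) !== undefined) {
        await Promise.resolve(); // stands in for the asynchronous disk write
        this.state = next;
        for (const listener of this.listeners) listener(next);
      }
    } finally {
      this.isProcessing = false;
    }
  }
}

const manager = new TinyStateManager();
manager.onApplied((state) => console.log("applied:", state));
manager.transition("starting");
console.log(manager.getState()); // "preparing": the queued transition has not been applied yet
```

The listener fires once the transition has actually been applied, which is why reacting to events is more reliable than polling `getState()`.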

Cost Tracking

Cost tracking uses the same event system and can be handled in two ways.

Setting Absolute Costs: `CostsUpdated`

This event sets the total cost to a specific value.

stateManager.transition({
  type: "CostsUpdated",
  data: {
    runId,
    codonId,
    cost: 0.0234,      // New total
    tokens: { inputTokens: 1500, outputTokens: 800, ... }
  }
});

Incrementing Costs: `CostsIncremented`

This event adds a delta to the current cost. This approach is more resilient to race conditions, as it doesn't require knowing the current total.

stateManager.transition({
  type: "CostsIncremented",
  data: {
    runId,
    codonId,
    costDelta: 0.0012,  // Amount to add
    tokensDelta: { inputTokens: 100, outputTokens: 50, ... }
  }
});
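
To see why deltas are easier to reason about, consider a simplified reducer over the two event types. The event shapes here are trimmed-down assumptions (token counts omitted, payload flattened), and `applyCostEvent` is a hypothetical name.

```typescript
// Hypothetical, simplified event shapes (token counts omitted).
type CostEvent =
  | { type: "CostsUpdated"; cost: number }
  | { type: "CostsIncremented"; costDelta: number };

function applyCostEvent(current: number, event: CostEvent): number {
  switch (event.type) {
    case "CostsUpdated":
      return event.cost; // overwrite: the emitter must already know the running total
    case "CostsIncremented":
      return current + event.costDelta; // accumulate: no knowledge of the total needed
  }
}

const increments: CostEvent[] = [
  { type: "CostsIncremented", costDelta: 0.25 },
  { type: "CostsIncremented", costDelta: 0.5 },
];

// Increments give the same total in either order.
const forwardTotal = increments.reduce(applyCostEvent, 0);
const reversedTotal = [...increments].reverse().reduce(applyCostEvent, 0);
console.log(forwardTotal === reversedTotal); // true
```

An absolute update, by contrast, discards whatever was accumulated before it, so two emitters working from stale totals can overwrite each other.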

For performance, the state manager maintains an in-memory cache for fast cost queries.

private costCache = {
  total: 0,
  currentRun: 0,
  lastUpdated: null
};
 
getCurrentRunCost(): number {
  return this.costCache.currentRun;
}
 
getTotalCost(): number {
  return this.costCache.total;
}

Sentinel State Integration

Sentinels don't have their own state machine; their state is embedded within the codon. Each sentinel tracks its activity during the codon's lifetime.

interface SentinelState {
  id: string;
  model: string;
  loadedAt: string;
  unloadedAt?: string;
  llmCallCount: number;
  failedLLMCalls: number;
  totalCost: number;
  status: "active" | "unloaded";
  unloadReason?: "codon-complete" | "fatal-error" | "consecutive-failures";
}

While a codon is running, sentinel states are stored in codon.sentinels.loaded. When the codon reaches a terminal state, this field is renamed to codon.sentinels.executed. This rename acts as a clear signal to any subscribers that all sentinel activity is final and no further updates will occur.
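
A subscriber can use the field name itself as the signal. The sketch below assumes simplified container shapes; only the `loaded` and `executed` field names come from the description above, and both function names are hypothetical.

```typescript
// Simplified stand-in for SentinelState (most fields omitted).
interface SentinelSummary {
  id: string;
  totalCost: number;
}

type ActiveSentinels = { loaded: SentinelSummary[] };
type FinalSentinels = { executed: SentinelSummary[] };

// Hypothetical helper: performs the rename when the codon becomes terminal.
function finalizeSentinels(sentinels: ActiveSentinels): FinalSentinels {
  return { executed: [...sentinels.loaded] };
}

// Hypothetical subscriber-side check: `executed` means no further updates will arrive.
function sentinelsAreFinal(
  sentinels: ActiveSentinels | FinalSentinels,
): sentinels is FinalSentinels {
  return "executed" in sentinels;
}
```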

State Validation

When the state manager loads, it validates the state file for integrity to catch corruption, dangling references, and inconsistencies.

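validate(state: unknown): StateValidation {
  const errors: StateValidation["errors"] = [];
  const warnings: StateValidation["warnings"] = [];

  // Type structure validation. Stop here on failure: the checks below assume a
  // well-formed structure (and that isValidStateStructure is a type guard that
  // narrows `state`), so running them on corrupted data would throw.
  if (!this.isValidStateStructure(state)) {
    errors.push({ type: "corrupted_data", message: "Invalid structure" });
    return { valid: false, errors, warnings };
  }

  // Referential integrity: currentRunId must point at an existing run
  if (state.currentRunId && !state.runs.some(r => r.runId === state.currentRunId)) {
    errors.push({ type: "missing_run", message: "Current run not found" });
  }

  // Orphaned folders (warning, not error)
  // ...

  return { valid: errors.length === 0, errors, warnings };
}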

Validation catches several categories of problems:

corrupted_data: The state file structure is invalid.
missing_run: currentRunId points to a run that doesn't exist.
missing_checkpoint: A state entry references a Git SHA that doesn't exist.
orphaned_folder (Warning): A run folder exists on disk without a corresponding state entry.
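
A caller might consume the result along these lines. The types are assumptions modeled on the object that `validate()` returns, and `reportValidation` is a hypothetical helper; how the runtime actually reacts to errors and warnings is not specified here.

```typescript
// Hypothetical shapes, modeled on the object returned by validate().
interface ValidationIssue {
  type: string;
  message: string;
}

interface StateValidation {
  valid: boolean;
  errors: ValidationIssue[];
  warnings: ValidationIssue[];
}

// Formats every issue for logging and reports whether the state is usable.
// Warnings are listed but do not affect `ok`.
function reportValidation(result: StateValidation): { ok: boolean; lines: string[] } {
  const lines = [
    ...result.errors.map((e) => `error [${e.type}]: ${e.message}`),
    ...result.warnings.map((w) => `warning [${w.type}]: ${w.message}`),
  ];
  return { ok: result.valid, lines };
}
```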

State Persistence

State is persisted to disk using an atomic write pattern to prevent corruption from partial writes. The process ensures that either the new state is written completely or the old state remains untouched:

Backup: The current state file is copied to a backup location.
Write to Temp: The new state is written to a temporary file (.tmp).
Atomic Rename: The temporary file is atomically renamed to become the new state file.

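// At the top of the file: existsSync comes from the callback/sync API,
// while copyFile, writeFile, and rename must be the promise-based versions.
import { existsSync } from "node:fs";
import * as fs from "node:fs/promises";

async save(): Promise<void> {
  // 1. Backup current state (skipped on the first save, when no state file exists yet)
  if (existsSync(this.statePath)) {
    await fs.copyFile(this.statePath, this.stateBackupPath);
  }

  // 2. Write to temp file
  const tempPath = `${this.statePath}.tmp`;
  await fs.writeFile(tempPath, JSON.stringify(this.state, null, 2));

  // 3. Atomic rename (atomic when both paths are on the same filesystem)
  await fs.rename(tempPath, this.statePath);
}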

This pattern provides two layers of safety: the backup file can be restored if the write fails, and the temporary file signals an incomplete write if the process crashes mid-operation.
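
On the read side, the backup can be used as a fallback when the primary file is missing or unparseable. The following is a minimal sketch under assumptions: `loadStateWithFallback`, the file names, and the fallback policy are all hypothetical and not taken from the runtime.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical recovery helper: tries the primary state file, then the backup.
function loadStateWithFallback(statePath: string, backupPath: string): unknown {
  for (const candidate of [statePath, backupPath]) {
    try {
      return JSON.parse(fs.readFileSync(candidate, "utf8"));
    } catch {
      // Missing or unparseable file: fall through to the next candidate.
    }
  }
  throw new Error(`No readable state at ${statePath} or ${backupPath}`);
}

// Usage: simulate a corrupted primary file next to an intact backup.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "state-demo-"));
const demoStatePath = path.join(demoDir, "state.json");
const demoBackupPath = path.join(demoDir, "state.json.bak");
fs.writeFileSync(demoStatePath, "{ truncated");
fs.writeFileSync(demoBackupPath, JSON.stringify({ runs: [] }));

const recovered = loadStateWithFallback(demoStatePath, demoBackupPath);
console.log(recovered); // the backup's contents
```

A real loader would likely also run the integrity validation described earlier on whichever file it ends up reading.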

Common State Machine Issues

The following errors are the most common state machine issues.

Invalid Transition Error

InvalidTransitionError: Invalid transition from completed to running

⚠️

Developer Bug Indicator: InvalidTransitionError signals a bug in the runtime or in custom tooling, not a normal operational issue. This error should never occur in correctly functioning code. If you see it, check if you are trying to transition a codon that is already in a terminal state.

Missing Metadata

MetadataValidationError: Missing required metadata for transition to running: claudeSessionId

This error means the transition event is missing required fields. Each target state requires specific metadata—check the guard functions in state-transition-guards.ts to see what's required for each transition.

Orphaned Run Detection

On startup, Hankweave detects runs that were active when the server last shut down. If the process that owned a run no longer exists, that run is marked as crashed.

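async detectCrashedRuns(): Promise<void> {
  for (const run of this.state.runs) {
    if (run.status === "running" && run.runId !== this.state.currentRunId) {
      try {
        process.kill(run.serverPid, 0); // Signal 0 sends nothing; it only checks that the process exists
      } catch (err) {
        // EPERM means the process exists but belongs to another user, so it is still alive.
        // Only ESRCH (no such process) indicates that the owner is gone.
        if ((err as NodeJS.ErrnoException).code !== "ESRCH") continue;

        // Process doesn't exist - mark as crashed
        this.transition({
          type: "RunCrashed",
          data: { runId: run.runId, detectedAt: new Date().toISOString(), ... }
        });
      }
    }
  }
}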

TypeScript Types

The state machine types are designed to be self-documenting. TypeScript's type system enforces the same invariants as the runtime, so if your code compiles, it is likely interacting with the state model correctly. The key types from state-types.ts are shown below.

// The 8 codon states
type CodonStatus =
  | "preparing" | "starting" | "initializing" | "running"
  | "completing-sentinels"
  | "completed" | "failed" | "skipped";
 
// Discriminated union for codon execution
type CodonExecution =
  | PreparingCodon | StartingCodon | InitializingCodon
  | RunningCodon | CompletingSentinelsCodon
  | CompletedCodon | FailedCodon | SkippedCodon;
 
// Check if terminal
function isTerminalCodonStatus(status: CodonStatus): boolean {
  return status === "completed" || status === "failed" || status === "skipped";
}
 
// Get cost regardless of state
function getCodonCost(codon: CodonExecution): number {
  switch (codon.status) {
    case "completed": return codon.finalCost;
    case "failed": return codon.partialCost;
    case "skipped": return 0;
    case "running":
    case "completing-sentinels": return codon.currentCost;
    default: return 0;
  }
}

The discriminated union is the key pattern. TypeScript knows exactly which fields are available in each state, preventing you from accessing a field that doesn't exist in the current state.

function handleCodon(codon: CodonExecution) {
  if (codon.status === "running") {
    // TypeScript knows: claudeSessionId, currentCost, etc. exist
    console.log(codon.currentCost);
  } else if (codon.status === "completed") {
    // TypeScript knows: finalCost, completionCheckpoint, etc. exist
    console.log(codon.finalCost);
  }
}
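
The same pattern also supports exhaustiveness checking. In the sketch below, the codon types are heavily simplified stand-ins (most fields omitted) and `describeCodon` is a hypothetical helper; the point is the `never` assignment in the `default` branch, which turns a forgotten state into a compile-time error.

```typescript
// Simplified stand-ins for the real codon types (most fields omitted).
type CodonExecution =
  | { status: "preparing" | "starting" | "initializing" }
  | { status: "running" | "completing-sentinels"; currentCost: number }
  | { status: "completed"; finalCost: number }
  | { status: "failed"; partialCost: number }
  | { status: "skipped" };

function describeCodon(codon: CodonExecution): string {
  switch (codon.status) {
    case "preparing":
    case "starting":
    case "initializing":
      return `${codon.status}: no cost yet`;
    case "running":
    case "completing-sentinels":
      return `${codon.status}: $${codon.currentCost.toFixed(4)} so far`;
    case "completed":
      return `completed: $${codon.finalCost.toFixed(4)}`;
    case "failed":
      return `failed after $${codon.partialCost.toFixed(4)}`;
    case "skipped":
      return "skipped";
    default: {
      // If a new status is added to the union and not handled above,
      // `codon` is no longer `never` here and this line stops compiling.
      const unreachable: never = codon;
      throw new Error(`Unhandled codon: ${JSON.stringify(unreachable)}`);
    }
  }
}
```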

Related Pages

Execution Flow: High-level execution lifecycle
Checkpoints: Git-based state persistence
WebSocket Protocol: Event schemas and commands
