Long-Horizon Task Execution

Enable agents to work autonomously on complex, multi-day development tasks with automated verification, code review, and supervisor orchestration.

Overview

Long-horizon task execution combines:

  • Status Files: Structured task definitions with milestones and verification criteria
  • Implementation Agent: Primary coding agent executing milestone work
  • Code Review Agents: Specialized reviewers validating quality at milestone boundaries
  • Supervisor Agent: Orchestrator that merges changes and decides next steps
  • Anti-Halting Detection: Automatic recovery from stuck processes

Status Files

Status files (.status.org or .status.md) define your development plan with clear milestones, deliverables, and verification criteria.

Structure

    #+TITLE: Feature Name Status

    * Document
    :PROPERTIES:
    :next_steps: Current next steps
    :last_updated: 2026-01-20
    :current_milestone: M1
    :END:

    * Introduction
    :PROPERTIES:
    :initiative_goals: High-level goals
    :related_specs: Links to specs
    :END:

    Brief context and scope.

    * Milestones

    ** M0: Initial Setup
    :PROPERTIES:
    :status: completed
    :END:

    *** Deliverables
    - [x] Create project structure
    - [x] Add dependencies

    *** Verification
    - test_name: test_project_builds
      file: tests/setup_test.rs
      description: Verifies project compiles
      status: passing

    ** M1: Core Implementation
    :PROPERTIES:
    :status: in_progress
    :END:

    *** Deliverables
    - [ ] Implement main feature
    - [ ] Add error handling

    *** Verification
    - test_name: test_main_feature
      description: Verifies feature works
      status: pending

Key Properties

Property            | Description
--------------------|-----------------------------------------------------
:next_steps:        | Current actionable next steps (updated each session)
:last_updated:      | Date of last update
:current_milestone: | ID of milestone being worked on
:status:            | planned, in_progress, completed, or blocked
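
As a minimal sketch of how tooling might read this format, the Python below pulls each milestone's ID, title, and :status: value out of org-style status text. The function name, the regular expressions, and the assumption that milestone IDs look like M0, M1, and so on are illustrative; none of this is part of the ah CLI.

```python
import re

# Hypothetical helper, not part of the `ah` tool: it assumes milestones are
# second-level org headings of the form "** M<number>: <title>".
MILESTONE_RE = re.compile(r"^\*\* (M\d+): (.+)$")
PROPERTY_RE = re.compile(r"^\s*:(\w+):\s*(.*)$")


def parse_milestones(text):
    """Return a list of {'id', 'title', 'status'} dicts, in file order."""
    milestones = []
    current = None
    for line in text.splitlines():
        heading = MILESTONE_RE.match(line)
        if heading:
            current = {
                "id": heading.group(1),
                "title": heading.group(2).strip(),
                "status": None,
            }
            milestones.append(current)
            continue
        if line.startswith("* "):
            # A new top-level section ends the scope of the last milestone.
            current = None
            continue
        prop = PROPERTY_RE.match(line)
        if current is not None and prop and prop.group(1) == "status":
            if current["status"] is None:
                current["status"] = prop.group(2).strip()
    return milestones


SAMPLE = """\
* Milestones
** M0: Initial Setup
:PROPERTIES:
:status: completed
:END:
*** Verification
- test_name: test_project_builds
  status: passing
** M1: Core Implementation
:PROPERTIES:
:status: in_progress
:END:
"""

for milestone in parse_milestones(SAMPLE):
    print(milestone["id"], milestone["status"])
```

Verification entries use a plain "status:" key without a leading colon, so the property pattern does not confuse a test's status with the milestone's own :status: value.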

Running Long-Horizon Tasks

Basic Usage

    ah task create --agent claude \
      --prompt "Implement the authentication feature following the status file at ./auth.status.org. Update checkboxes as you complete each item."

The agent reads the status file, works on deliverables, and updates progress as it goes.

Execution Loop

The long-horizon loop follows this pattern:

┌─────────────────────────────────────────────────────────────┐
│ STEP 1: Launch Implementation Agent                         │
│   - Reads status file, implements next milestone            │
│   - Updates checkboxes as deliverables complete             │
│   - Signals completion by updating milestone status         │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ STEP 2: Launch Coordinator Agent                            │
│   - Examines changes and logs                               │
│   - Detects abandonment (agent deviated due to blockers)    │
│   - Selects appropriate review agents                       │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ STEP 3: Launch Review Agents                                │
│   - Each reviewer validates in isolated workspace           │
│   - Reviewers can fix issues directly                       │
│   - Results merged after all reviewers complete             │
└─────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│ STEP 4: Launch Supervisor Agent                             │
│   - Merges fixes from reviewers                             │
│   - Runs test suite to verify integrity                     │
│   - Decides: CONTINUE, NEW_REVIEW, COMPLETE, or STOP        │
└─────────────────────────────────────────────────────────────┘
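
The control flow of this loop can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's implementation: the four callables stand in for the agents, and the interpretation that CONTINUE launches the implementation agent again while NEW_REVIEW re-reviews the same changes is a guess, since the page does not define the decisions in detail.

```python
from enum import Enum


class Decision(Enum):
    CONTINUE = "CONTINUE"
    NEW_REVIEW = "NEW_REVIEW"
    COMPLETE = "COMPLETE"
    STOP = "STOP"


def run_loop(implement, coordinate, review, supervise, max_rounds=10):
    """Drive steps 1-4 until the supervisor returns COMPLETE or STOP.

    All four callables are hypothetical stand-ins for the agents:
      implement()            -> changes for the next milestone
      coordinate(changes)    -> names of reviewers to run
      review(name, changes)  -> one reviewer's result
      supervise(results)     -> a Decision
    """
    history = []
    changes = implement()                                      # Step 1
    for _ in range(max_rounds):
        reviewers = coordinate(changes)                        # Step 2
        results = [review(name, changes) for name in reviewers]  # Step 3
        decision = supervise(results)                          # Step 4
        history.append(decision)
        if decision in (Decision.COMPLETE, Decision.STOP):
            break
        if decision is Decision.CONTINUE:
            # Assumed meaning: move on to the next milestone.
            changes = implement()
        # Assumed meaning of NEW_REVIEW: review the same changes again.
    return history
```

The max_rounds bound is also an addition for the sketch; it simply keeps the example from looping forever if the supervisor never returns COMPLETE or STOP.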

Code Review Agents

Specialized review agents validate code at milestone boundaries:

Agent          | Focus                                    | Blocking
---------------|------------------------------------------|------------
Security       | Vulnerabilities, injection, auth bypass  | Yes
Test Integrity | Test cheating, weakened assertions       | Yes
Goal Adherence | Milestone requirements match             | Yes
Architecture   | Module boundaries, cycles                | Warn
Performance    | Algorithms, data structures              | Conditional
Idioms         | Project style, patterns                  | No

Review Policy

  • Fixed Issues: Automatically merged if tests pass
  • Unfixable Blocking Issues: Implementation agent must revise
  • Warnings: Logged but don’t block milestone advancement
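
One way to read this policy is as a small decision function. The Python sketch below is an assumption-laden illustration: the Finding type and milestone_outcome function are hypothetical names, and treating a fix whose tests fail as if it were unfixed is a choice made for the example, not something the policy states.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    reviewer: str    # e.g. "security"
    blocking: bool   # whether this reviewer blocks milestone advancement
    fixed: bool      # whether the reviewer fixed the issue itself


def milestone_outcome(findings, tests_pass):
    """Apply the review policy to a list of findings (hypothetical helper)."""
    merged, warnings, blockers = [], [], []
    for finding in findings:
        if finding.fixed and tests_pass:
            merged.append(finding.reviewer)      # fixed issue, tests pass
        elif finding.blocking:
            blockers.append(finding.reviewer)    # implementation agent must revise
        else:
            warnings.append(finding.reviewer)    # logged, does not block
    action = "revise" if blockers else "advance"
    return {
        "action": action,
        "merged": merged,
        "warnings": warnings,
        "blockers": blockers,
    }


outcome = milestone_outcome(
    [
        Finding("security", blocking=True, fixed=True),
        Finding("idioms", blocking=False, fixed=False),
    ],
    tests_pass=True,
)
print(outcome["action"], outcome["merged"], outcome["warnings"])
```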

Custom Review Agents

Define project-specific reviewers by placing their prompt files in .agents/reviewers/ and registering them in .agents/config.toml:

    # .agents/config.toml

    [[review_agents]]
    name = "migrations"
    model = "sonnet"
    blocking = true
    prompt_file = ".agents/reviewers/migrations.md"
    trigger = "path:migrations/**"

    [[review_agents]]
    name = "api_contracts"
    model = "haiku"
    blocking = false
    prompt_file = ".agents/reviewers/api_contracts.md"
    trigger = "always"
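
The trigger values are not specified further on this page. Assuming "always" means the reviewer runs on every milestone and "path:<glob>" means it runs when any changed file matches the glob, the selection could be sketched in Python like this. The trigger_matches function is a hypothetical name, and the use of fnmatch-style matching is an assumption about the glob dialect.

```python
from fnmatch import fnmatchcase


def trigger_matches(trigger, changed_paths):
    """Decide whether a reviewer should run (assumed trigger semantics)."""
    if trigger == "always":
        return True
    if trigger.startswith("path:"):
        pattern = trigger[len("path:"):]
        # In fnmatch, "*" also matches "/" so "migrations/**" covers
        # files at any depth under migrations/.
        return any(fnmatchcase(path, pattern) for path in changed_paths)
    raise ValueError("unsupported trigger: %r" % (trigger,))


changed = ["src/auth/login.rs", "migrations/2026/001_add_users.sql"]
print(trigger_matches("path:migrations/**", changed))
print(trigger_matches("path:migrations/**", ["src/auth/login.rs"]))
print(trigger_matches("always", []))
```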

Anti-Halting Detection

The system automatically detects and recovers from stuck processes:

Detection Types

Type               | Detection                   | Recovery
-------------------|-----------------------------|-------------------------------------------------
Interactive Prompt | Process waiting for stdin   | Background process, provide interaction commands
Network Timeout    | Blocked on connect/recv     | Terminate, suggest retry with timeout
Busy Loop          | High CPU, identical stacks  | Terminate, provide diagnostic info
Deadlock           | Circular wait chain         | Terminate, show thread dump
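
To make the table concrete, here is a rough Python sketch that maps observed process signals to a detection type and its recovery action. The ProcessSample fields, the 90% CPU threshold, and the order of the checks are all assumptions for illustration; the page does not say how the detector gathers or weighs these signals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessSample:
    """Hypothetical snapshot of a monitored process."""
    waiting_on_stdin: bool = False
    blocked_syscall: Optional[str] = None   # e.g. "connect" or "recv"
    cpu_percent: float = 0.0
    stacks_identical: bool = False          # same stack across samples
    wait_cycle: bool = False                # circular wait chain found


# Recovery actions copied from the table above.
RECOVERY = {
    "interactive_prompt": "Background process, provide interaction commands",
    "network_timeout": "Terminate, suggest retry with timeout",
    "busy_loop": "Terminate, provide diagnostic info",
    "deadlock": "Terminate, show thread dump",
}


def classify(sample, cpu_threshold=90.0):
    """Return a detection type, or None if the process does not look stuck."""
    if sample.waiting_on_stdin:
        return "interactive_prompt"
    if sample.wait_cycle:
        return "deadlock"
    if sample.blocked_syscall in ("connect", "recv"):
        return "network_timeout"
    if sample.cpu_percent >= cpu_threshold and sample.stacks_identical:
        return "busy_loop"
    return None


kind = classify(ProcessSample(cpu_percent=99.0, stacks_identical=True))
print(kind, "->", RECOVERY[kind])
```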

Interactive Process Handling

When an interactive prompt is detected, the process is moved to the background:

    [AH-ANTI-HALT] Process moved to background - waiting for input

    COMMAND: npm init
    COMMAND ID: cmd-456

    You can interact with it using:
      ah agent pty-snapshot cmd-456            # View current state
      ah agent send-keys cmd-456 "y" --enter   # Send input
      ah agent kill cmd-456                    # Terminate

    RECOMMENDED ALTERNATIVES:
      1. Use non-interactive mode: npm init -y
      2. Set CI environment: CI=true npm init
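
The recommended alternatives amount to making sure a command never waits for input. As a sketch of that idea from a script's point of view, the Python below runs a command with CI=true set and stdin closed. The run_non_interactive helper is a hypothetical name, and the 300-second default only mirrors the 5m timeout used elsewhere on this page; whether a given tool honors CI=true depends on that tool.

```python
import os
import subprocess
import sys


def run_non_interactive(argv, timeout_seconds=300):
    """Run a command so that it cannot block on an interactive prompt."""
    env = dict(os.environ, CI="true")   # many tools skip prompts when CI is set
    return subprocess.run(
        argv,
        env=env,
        stdin=subprocess.DEVNULL,       # reads from stdin see end-of-file at once
        capture_output=True,
        text=True,
        timeout=timeout_seconds,        # raises TimeoutExpired if exceeded
    )


# Demonstration with a child Python process instead of npm, so the example
# runs anywhere: it prints the CI variable and how much stdin it could read.
result = run_non_interactive(
    [sys.executable, "-c",
     "import os, sys; print(os.environ['CI']); print(len(sys.stdin.read()))"]
)
print(result.stdout)
```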

Research Integration

When agents encounter blocking technical challenges, a Research Agent can be launched:

  1. The Coordinator detects that the implementation agent abandoned its goals because of blockers
  2. A Research Agent is spawned with web search capabilities
  3. The Research Agent creates a forked session from the point of divergence
  4. It injects targeted help to unblock the implementation agent

Configuration

    # .agents/config.toml

    [supervisor]
    # Enable long-horizon mode
    enabled = true

    [supervisor.implementation]
    # Default agent for implementation work
    default_agent = "claude"
    model = "sonnet"
    max_session_duration = "2h"

    [code_review]
    # Run reviews in parallel (requires AgentFS)
    parallel = true
    # Standard reviewers to include
    reviewers = ["security", "test-integrity", "goal-adherence"]

    [anti_halting]
    enabled = true

    [anti_halting.timeouts]
    base_timeout = "5m"
    max_extended_timeout = "30m"

    [anti_halting.interactive_background]
    enabled = true
    timeout = "5m"
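
The timeout and duration values above are written as strings such as "5m" and "2h". If you need to work with them in your own tooling, a minimal Python sketch for converting them might look like this. The parse_duration function is hypothetical, and support for an "s" suffix is an assumption, since only "m" and "h" appear in the example configuration.

```python
import re
from datetime import timedelta

# Suffix-to-unit mapping; "s" is assumed, "m" and "h" appear in the config.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_duration(value):
    """Convert a string like '5m' or '2h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError("unrecognized duration: %r" % (value,))
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


base = parse_duration("5m")
extended = parse_duration("30m")
session = parse_duration("2h")
print(base.total_seconds(), extended.total_seconds(), session.total_seconds())
```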

Best Practices

Writing Good Status Files

  1. Clear Deliverables: Each deliverable should be small and testable
  2. Concrete Verification: Every milestone needs automated verification
  3. Realistic Milestones: Break complex work into 2-4 hour chunks
  4. Update Frequently: Agents should update status after each deliverable

Session Workflow

At the end of each session, agents must:

  1. Update :next_steps: with what should be done next
  2. Update :last_updated: to current date
  3. Update verification statuses after running tests
  4. Add any new outstanding tasks discovered
  5. Update milestone status if blocked or completed
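
Steps 1 and 2 of this checklist are mechanical enough to script. The Python sketch below rewrites the :next_steps: and :last_updated: properties in org-style status text. Both function names are hypothetical, and the sketch assumes the document-level properties are the first occurrence of each name in the file.

```python
import re
from datetime import date


def update_property(text, name, value):
    """Replace the value of the first ':name:' property line in text."""
    pattern = re.compile(
        r"^([ \t]*:%s:)[ \t]*.*$" % re.escape(name), re.MULTILINE
    )
    # A function replacement keeps backslashes in `value` literal.
    new_text, count = pattern.subn(
        lambda m: "%s %s" % (m.group(1), value), text, count=1
    )
    if count == 0:
        raise KeyError(name)
    return new_text


def end_of_session(text, next_steps, today=None):
    """Apply the :next_steps: and :last_updated: updates (steps 1 and 2)."""
    today = today or date.today()
    text = update_property(text, "next_steps", next_steps)
    return update_property(text, "last_updated", today.isoformat())


doc = (
    "* Document\n"
    ":PROPERTIES:\n"
    ":next_steps: Current next steps\n"
    ":last_updated: 2026-01-20\n"
    ":current_milestone: M1\n"
    ":END:\n"
)
updated = end_of_session(doc, "Add error handling", today=date(2026, 1, 21))
print(updated)
```

Verification statuses, new tasks, and milestone status changes (steps 3 to 5) depend on test results and judgment, so they are left out of the sketch.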