Tofie Architecture #

This document describes the Tofie system architecture and explains how its components work together to automate the development workflow.

System Overview #

Tofie is a distributed system that orchestrates multiple services to automate software development tasks. The architecture follows an event-driven pattern with n8n as the central orchestrator.

sequenceDiagram
    participant User
    participant Linear
    participant n8n
    participant Coder
    participant Claude
    participant GitHub

    User->>Linear: Comment "@Tofie plan"
    Linear->>n8n: Webhook: issue.comment_created
    n8n->>n8n: Parse command & extract data
    n8n->>Linear: Update status to "Planning"
    n8n->>Coder: SSH: Execute planning.sh
    Coder->>Coder: Create git worktree
    Coder->>Claude: Call Claude CLI
    Claude->>Claude: Generate PLAN.md
    Claude->>Coder: Save to .plans/PLAN.md
    Coder->>n8n: Return JSON result
    n8n->>Linear: Comment with plan
    n8n->>Linear: Update status to "Planned"

Component Architecture #

1. Linear (Project Management Layer) #

Role: Source of truth for requirements and project tracking

Key Features:

  • Issue management with descriptions and comments
  • Status workflow (Todo → Planning → Planned → In Progress → In Review → PR Created → Done)
  • Webhooks for real-time event notifications
  • API for programmatic status updates

Integration Points:

  • Outbound: Webhooks to n8n on issue comments
  • Inbound: API calls from n8n for status updates and comments

Data Flow:

User creates/updates issue
     ↓
Issue contains:
- Title and description
- Acceptance criteria
- Comments with clarifications
- Labels and priority
     ↓
Webhook fires on @Tofie mention
     ↓
n8n receives full issue context

2. n8n (Orchestration Layer) #

Role: Central automation hub that coordinates all services

Key Features:

  • Webhook receivers for Linear events
  • Workflow automation with conditional logic
  • SSH execution for remote scripts
  • Error handling and retry logic
  • Credential management

Workflow Structure:

Webhook Trigger
     ↓
Parse Command (plan/implement/review/pr)
     ↓
Extract Linear Context
     ↓
Build Script Input JSON
     ↓
SSH to Coder Instance
     ↓
Execute Tofie Script
     ↓
Parse Script Output
     ↓
Update Linear (status + comment)
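
The "Parse Command" step can be illustrated with a small shell function. In the real system this logic lives in an n8n workflow node, so the function name and the matching rules below (case-insensitive mention, first word after the mention) are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical helper: extract the Tofie command from a Linear comment body.
# Prints one of plan/implement/review/pr, or returns 1 for anything else.
parse_tofie_command() {
    local comment="$1"
    local re='@[Tt]ofie[[:space:]]+([[:alpha:]]+)'
    local cmd
    if [[ "$comment" =~ $re ]]; then
        # Normalize to lowercase so "@Tofie PR" and "@Tofie pr" are equivalent
        cmd="$(printf '%s' "${BASH_REMATCH[1]}" | tr '[:upper:]' '[:lower:]')"
        case "$cmd" in
            plan|implement|review|pr)
                printf '%s\n' "$cmd"
                return 0
                ;;
        esac
    fi
    return 1
}
```

Rejecting unknown commands at this point keeps the SSH step from ever running with an unsupported action.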

Configuration:

  • Environment: bonsai.app.n8n.cloud
  • Webhook URL: https://bonsai.app.n8n.cloud/webhook/tofie-event
  • Signing key: Validates webhook authenticity
  • SSH credentials: Stored in n8n credential vault

3. Coder Instance (Execution Layer) #

Role: Development environment where all work happens

Infrastructure:

  • EC2 instance (Ubuntu Linux)
  • Persistent storage for repositories
  • Claude Code CLI installed
  • Git configured with SSH keys
  • Access to Doppler for secrets

Directory Structure:

/home/coder/
├── bonsai/                          # Main repository (on main branch)
│   ├── .claude/                     # Claude Code configuration
│   │   ├── commands/tofie/          # Slash commands
│   │   └── skills/                  # Claude skills
│   ├── tools/local/scripts/tofie/   # Tofie automation scripts
│   │   ├── planning.sh
│   │   ├── planning-subagent.sh
│   │   ├── implement.sh
│   │   ├── implement-subagent.sh
│   │   ├── adjust.sh
│   │   ├── adjust-subagent.sh
│   │   └── submit-pr.sh
│   └── [rest of codebase]
│
└── trees/                           # Git worktrees (isolated branches)
    ├── john-eng-1144/               # Worktree for issue ENG-1144
    │   ├── .plans/                  # Gitignored artifacts
    │   │   ├── PLAN.md
    │   │   ├── linear-metadata.json
    │   │   ├── n8n-metadata.json
    │   │   └── pr-info.json
    │   └── [full codebase on feature branch]
    │
    └── jane-eng-2205/               # Another parallel worktree
        └── ...

Git Worktree Benefits:

  • Isolation: Each issue gets its own directory
  • Parallelization: Multiple issues can be worked on simultaneously
  • No switching: No need to check out or switch branches
  • Clean state: Each new worktree starts from a clean main branch
  • Safety: Changes don’t affect the main repository checkout
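
A minimal sketch of the create-or-reuse logic behind these worktrees is shown here. The paths follow the directory structure above; the helper names and the rule that slashes in a branch name become dashes are assumptions, not the actual contents of planning.sh.

```shell
#!/bin/bash
# Assumed layout, taken from the directory structure above.
REPO_ROOT="${REPO_ROOT:-/home/coder/bonsai}"
TREES_ROOT="${TREES_ROOT:-/home/coder/trees}"

# Map a branch name to its worktree directory (hypothetical convention:
# slashes become dashes so the name is a single path component).
worktree_path() {
    local branch="$1"
    printf '%s/%s\n' "$TREES_ROOT" "${branch//\//-}"
}

# Reuse the worktree if it exists, otherwise create it; print its path.
ensure_worktree() {
    local branch="$1" path
    path="$(worktree_path "$branch")"
    if [[ -d "$path" ]]; then
        echo "[INFO] Reusing worktree $path" >&2
    elif git -C "$REPO_ROOT" show-ref --verify --quiet "refs/heads/$branch"; then
        # Branch already exists locally: attach a worktree to it
        git -C "$REPO_ROOT" worktree add "$path" "$branch" >&2 || return 1
    else
        # New branch starting from main
        git -C "$REPO_ROOT" worktree add -b "$branch" "$path" main >&2 || return 1
    fi
    mkdir -p "$path/.plans"
    printf '%s\n' "$path"
}
```

Git output is sent to stderr so that stdout carries only the resulting path, in line with the scripts' convention of keeping stdout machine-readable.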

4. Claude Code (AI Layer) #

Role: AI-powered development assistant

Capabilities:

  • Code generation and modification
  • Implementation planning
  • Code review and quality checks
  • PR description generation
  • Following project conventions

Invocation Methods:

Direct Slash Commands (Fast, for single operations):

claude "/tofie-plan --branch feature/my-feature"

Via Subagents (For complex, isolated operations):

claude "Use tofie-planner SUBAGENT for comprehensive research"

Permission Modes:

  • Standard scripts: --permission-mode acceptEdits
  • Subagent scripts: --dangerously-skip-permissions
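
As a sketch of how a script might select between these two permission modes, the wrapper below maps a variant name to the corresponding flags. The wrapper name, the variant labels, and the use of the CLI's non-interactive `-p` flag are assumptions; only the two permission flags come from the list above.

```shell
#!/bin/bash
CLAUDE_CMD="${CLAUDE_CMD:-claude}"

# Run Claude non-interactively with the permission flags for a script variant.
# variant: "standard" or "subagent" (hypothetical labels)
run_claude() {
    local variant="$1" prompt="$2"
    local -a flags
    case "$variant" in
        standard) flags=(--permission-mode acceptEdits) ;;
        subagent) flags=(--dangerously-skip-permissions) ;;
        *)
            echo "[ERROR] Unknown variant: $variant" >&2
            return 1
            ;;
    esac
    "$CLAUDE_CMD" -p "$prompt" "${flags[@]}"
}
```

For example, `run_claude standard "/tofie-plan --branch feature/my-feature"` would run the slash command with edits auto-accepted. Because `--dangerously-skip-permissions` disables all permission prompts, keeping the choice in one place makes it easier to audit.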

Context Engineering: Claude uses a structured planning framework with:

  • Background research phase
  • Requirements analysis
  • Phase breakdown
  • Risk assessment
  • Success criteria

5. GitHub (Code Hosting Layer) #

Role: Version control and collaboration platform

Integration:

  • PRs created via gh CLI
  • All Tofie PRs are drafts
  • PR descriptions reference Linear issues
  • Conventional commit format for titles

PR Creation Flow:

submit-pr.sh script
     ↓
Check if PR already exists (gh pr list)
     ↓
If exists: Return existing PR info
     ↓
If not exists:
  ├─ Review branch commits
  ├─ Read PLAN.md for context
  ├─ Generate PR title (conventional format)
  ├─ Build PR description
  ├─ Create draft PR (gh pr create --draft)
  └─ Return PR URL and number
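
The check-then-create part of this flow could be sketched as follows. This is not the actual submit-pr.sh: the function name is hypothetical, and generating the title and body from PLAN.md is left out.

```shell
#!/bin/bash
# Print the URL of the open PR for a branch, creating a draft PR if none exists.
ensure_draft_pr() {
    local branch="$1" title="$2" body="$3" url
    # gh's built-in --jq filter extracts the first match, or nothing at all
    url="$(gh pr list --head "$branch" --state open \
        --json url --jq '.[0].url // empty')" || return 1
    if [[ -n "$url" ]]; then
        echo "[INFO] PR already exists for $branch" >&2
        printf '%s\n' "$url"
        return 0
    fi
    # gh pr create prints the new PR's URL on stdout
    gh pr create --draft --head "$branch" --title "$title" --body "$body"
}
```

Checking first makes the operation safe to repeat: a second `@Tofie pr` on the same branch returns the existing PR instead of failing or opening a duplicate.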

Data Flow #

Planning Workflow #

flowchart TD
    A[User: @Tofie plan] --> B[Linear Webhook]
    B --> C[n8n: Parse command]
    C --> D[n8n: Build JSON input]
    D --> E[n8n: SSH to Coder]
    E --> F[Coder: planning.sh]
    F --> G{Worktree exists?}
    G -->|No| H[Create worktree from main]
    G -->|Yes| I[Reuse existing]
    H --> J[Create .plans/ directory]
    I --> J
    J --> K[Write linear-metadata.json]
    J --> L[Write n8n-metadata.json]
    K --> M[Execute Claude CLI]
    L --> M
    M --> N[Claude: Research codebase]
    N --> O[Claude: Analyze requirements]
    O --> P[Claude: Generate PLAN.md]
    P --> Q[Write .plans/PLAN.md]
    Q --> R[Return JSON success]
    R --> S[n8n: Parse result]
    S --> T[n8n: Post plan to Linear]
    T --> U[n8n: Update status to Planned]

Implementation Workflow #

flowchart TD
    A[User: @Tofie implement] --> B[Linear Webhook]
    B --> C[n8n: Parse command]
    C --> D[n8n: Find worktree]
    D --> E[n8n: SSH to Coder]
    E --> F[Coder: implement.sh]
    F --> G[Read .plans/PLAN.md]
    G --> H[Read linear-metadata.json]
    H --> I[Execute Claude CLI]
    I --> J[Claude: Review plan]
    J --> K[Claude: Implement changes]
    K --> L{Run quality checks}
    L -->|Fail| M[Claude: Fix issues]
    M --> L
    L -->|Pass| N[Claude: Create commits]
    N --> O[Push to remote branch]
    O --> P{Create PR?}
    P -->|Yes| Q[Run submit-pr.sh]
    P -->|No| R[Return JSON success]
    Q --> S[Create draft PR]
    S --> R
    R --> T[n8n: Parse result]
    T --> U[n8n: Post summary to Linear]
    U --> V[n8n: Update status]

Script Architecture #

All Tofie scripts follow a consistent pattern:

Script Structure #

#!/bin/bash
set -euo pipefail

# 1. Configuration
CLAUDE_CMD="/home/$USER/.local/bin/claude"
REPO_ROOT="$(cd ... && pwd)"

# 2. Logging functions (stderr)
log_info() { echo "[INFO] $*" >&2; }
log_error() { echo "[ERROR] $*" >&2; }

# 3. JSON output function (stdout)
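output_json() {
    # Structured JSON output for n8n. jq builds the object so that quotes
    # and newlines in the message are escaped correctly.
    local success="$1" message="$2"
    jq -n --argjson success "$success" --arg message "$message" \
        '{success: $success, message: $message}'
    # ... (additional fields elided)
}

# 4. Input validation
INPUT_JSON=$(cat)  # Read from stdin
# jq -e exits non-zero on invalid JSON or when .branchName is null/missing
if ! BRANCH_NAME=$(jq -er '.branchName' <<<"$INPUT_JSON" 2>/dev/null); then
    log_error "Input is not valid JSON or lacks .branchName"
    output_json false "Missing or invalid branchName"
    exit 1
fi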

# 5. Main logic
# - Create/find worktree
# - Prepare metadata
# - Execute Claude CLI
# - Parse results

# 6. Output results
output_json true "Success message" ...

Script Variants #

Standard Scripts (planning.sh, implement.sh, adjust.sh):

  • Use slash commands directly
  • Faster execution
  • Synchronous operation
  • Permission mode: --permission-mode acceptEdits

Subagent Scripts (planning-subagent.sh, implement-subagent.sh, adjust-subagent.sh):

  • Launch isolated Claude subagents
  • For complex, thorough operations
  • Parallel execution possible
  • Permission mode: --dangerously-skip-permissions

When to use which:

  • Standard: Quick operations, single plan/implementation
  • Subagent: Parallel planning, long-running research, experimental approaches

State Management #

Worktree Lifecycle #

stateDiagram-v2
    [*] --> Created: git worktree add
    Created --> Planning: planning.sh
    Planning --> Planned: PLAN.md created
    Planned --> Implementation: implement.sh
    Implementation --> InReview: Code committed
    InReview --> PRCreated: submit-pr.sh
    PRCreated --> Merged: GitHub PR merged
    Merged --> Cleaned: git worktree remove
    Cleaned --> [*]

    note right of Planning
        .plans/PLAN.md
        .plans/linear-metadata.json
        .plans/n8n-metadata.json
    end note

    note right of PRCreated
        .plans/pr-info.json
    end note

Linear Status Transitions #

Todo
  ↓ @Tofie plan
Planning
  ↓ PLAN.md created
Planned
  ↓ @Tofie implement
In Progress
  ↓ Implementation complete
In Review
  ↓ @Tofie pr
PR Created
  ↓ GitHub PR merged
Done

Metadata Persistence #

linear-metadata.json:

  • Created during planning
  • Contains full issue context (description, comments, labels)
  • Read during implementation for context
  • Never modified after creation

n8n-metadata.json:

  • Created during planning
  • Contains n8n webhook URL and signing key
  • Used for future event notifications
  • Static configuration

pr-info.json:

  • Created during PR submission
  • Contains PR number, URL, title
  • Used to avoid duplicate PR creation
  • Updated if PR is recreated
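
The "never modified after creation" rule for linear-metadata.json can be enforced at write time. The helper below is a hypothetical sketch: it writes stdin to the target only if the file does not exist yet, using the shell's noclobber option so the existence check and the write happen in a single step.

```shell
#!/bin/bash
# Write stdin to a file only if it does not exist yet (hypothetical helper).
write_once() {
    local target="$1"
    mkdir -p "$(dirname "$target")"
    # With noclobber set, the redirect itself fails on an existing file
    if ( set -o noclobber; cat > "$target" ) 2>/dev/null; then
        echo "[INFO] Created $target" >&2
    elif [[ -e "$target" ]]; then
        echo "[INFO] Keeping existing $target" >&2
    else
        echo "[ERROR] Could not write $target" >&2
        return 1
    fi
}
```

A planning script could call it as `printf '%s' "$LINEAR_JSON" | write_once "$WORKTREE/.plans/linear-metadata.json"`, where both variable names are placeholders.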

Security Architecture #

Authentication Flow #

sequenceDiagram
    participant n8n
    participant Coder
    participant Doppler
    participant GitHub

    n8n->>Coder: SSH (key-based auth)
    Coder->>Doppler: Request GITHUB_TOKEN
    Doppler->>Coder: Return encrypted token
    Coder->>GitHub: gh CLI with token
    GitHub->>GitHub: Validate token
    GitHub->>Coder: Success
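
On the Coder side, the Doppler and GitHub steps of this flow might look like the sketch below. It assumes the Doppler CLI is already configured for the right project and config on the instance; the wrapper name is hypothetical.

```shell
#!/bin/bash
# Fetch the token from Doppler and expose it to gh for a single command only,
# so it is never written to disk or exported into the wider environment.
gh_with_doppler_token() {
    local token
    token="$(doppler secrets get GITHUB_TOKEN --plain)" || {
        echo "[ERROR] Could not read GITHUB_TOKEN from Doppler" >&2
        return 1
    }
    # gh reads its token from the GH_TOKEN environment variable
    GH_TOKEN="$token" gh "$@"
}
```

Usage would be, for example, `gh_with_doppler_token pr list --head john-eng-1144`.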

Security Measures #

  1. SSH Key Authentication

    • n8n → Coder via SSH keys (no passwords)
    • Keys stored in n8n credential vault
    • Limited to specific user account
  2. Secrets Management

    • GitHub tokens in Doppler
    • Linear API keys in n8n credentials
    • Webhook signing keys for validation
  3. Least Privilege

    • Coder instance: Limited SSH access
    • GitHub tokens: Repo and workflow scope only
    • Linear API: Read issues, update status, and post comments only
  4. Isolation

    • Each worktree is isolated
    • No cross-contamination between issues
    • Clean state from main branch each time

Scalability Considerations #

Current Limits #

  • Coder Instance: Single EC2 instance

    • Can handle ~10 parallel worktrees
    • Limited by disk space and memory
  • n8n: Cloud-hosted, auto-scaling

    • No practical webhook limits
    • Execution queue for SSH operations
  • Claude Code: Rate-limited by API

    • Concurrent requests limited
    • Long operations may time out
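
Given the estimate of about 10 parallel worktrees, a script could refuse new work when the instance is full. The following is a minimal sketch under that assumption; the variable and function names are hypothetical, and counting directories under trees/ is only a rough stand-in for real memory and disk checks.

```shell
#!/bin/bash
TREES_ROOT="${TREES_ROOT:-/home/coder/trees}"
MAX_WORKTREES="${MAX_WORKTREES:-10}"   # the "~10 parallel worktrees" estimate

# Count the worktree directories directly under TREES_ROOT.
count_worktrees() {
    local count=0 dir
    for dir in "$TREES_ROOT"/*/; do
        # An unmatched glob stays literal, so confirm each entry is a directory
        [[ -d "$dir" ]] && count=$((count + 1))
    done
    printf '%s\n' "$count"
}

# Succeeds only while there is room for another worktree.
has_capacity() {
    (( $(count_worktrees) < MAX_WORKTREES ))
}
```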

Future Improvements #

  1. Multiple Coder Instances

    • Load balancing across instances
    • Geographic distribution
    • Dedicated instances per team
  2. Worktree Cleanup

    • Auto-remove merged worktrees
    • Archive old worktrees
    • Disk space monitoring
  3. Caching & Optimization

    • Cache common codebase analysis
    • Reuse research across similar issues
    • Pre-warm frequently used contexts
  4. Monitoring & Observability

    • Execution time tracking
    • Success/failure metrics
    • Resource usage dashboards

Error Handling & Recovery #

Error Categories #

1. Input Validation Errors

  • Missing required fields
  • Invalid JSON format
  • Unknown commands

Recovery: Return error JSON immediately, don’t execute

2. Infrastructure Errors

  • SSH connection failure
  • Claude CLI not found
  • Git worktree creation failure

Recovery: Retry with exponential backoff, alert on repeated failures

3. Execution Errors

  • Claude generates invalid code
  • Tests fail after implementation
  • PR creation fails

Recovery: Roll back changes, update Linear with error, allow manual intervention

4. Timeout Errors

  • Planning takes too long
  • Implementation exceeds timeout
  • Claude API timeout

Recovery: Kill process, preserve partial work, update Linear
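
One way to implement the "kill process, preserve partial work" recovery is the coreutils `timeout` command, sketched below. The wrapper name and the 10-second grace period are assumptions. Nothing in the worktree is deleted on expiry, so partial work stays in place.

```shell
#!/bin/bash
# Run a command with a time limit. On expiry, timeout sends TERM, then KILL
# after a 10-second grace period. GNU timeout exits with 124 on expiry.
run_with_timeout() {
    local limit="$1"
    shift
    local status=0
    timeout -k 10 "$limit" "$@" || status=$?
    if (( status == 124 )); then
        echo "[ERROR] Timed out after ${limit}; partial work left in place" >&2
    fi
    return "$status"
}
```

A caller can branch on exit status 124 to report a timeout to Linear separately from an ordinary failure, for example after `run_with_timeout 15m ./implement.sh < input.json`.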

Retry Strategy #

Attempt 1: Immediate execution
    ↓ (failure)
Wait 30 seconds
    ↓
Attempt 2: Retry
    ↓ (failure)
Wait 2 minutes
    ↓
Attempt 3: Final retry
    ↓ (failure)
Alert team + update Linear with error
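
The three-attempt schedule above can be sketched as a generic wrapper. The function name and the RETRY_DELAYS override are assumptions; the default delays of 30 and 120 seconds come from the flow above. Alerting the team and updating Linear are left to the caller.

```shell
#!/bin/bash
# Retry a command, waiting between attempts. The number of attempts is
# one more than the number of delays (default: 3 attempts, 30s then 120s).
retry_with_backoff() {
    local -a delays=(${RETRY_DELAYS:-30 120})
    local attempt=1
    local max=$(( ${#delays[@]} + 1 ))
    while true; do
        if "$@"; then
            return 0
        fi
        if (( attempt >= max )); then
            echo "[ERROR] Attempt $attempt failed; giving up" >&2
            return 1
        fi
        echo "[INFO] Attempt $attempt failed; retrying in ${delays[attempt-1]}s" >&2
        sleep "${delays[attempt-1]}"
        attempt=$((attempt + 1))
    done
}
```

Only operations that are safe to repeat should be wrapped this way, which is one reason the scripts are meant to be idempotent.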

Performance Characteristics #

Typical Execution Times #

| Operation      | Standard Script | Subagent Script | Notes                          |
|----------------|-----------------|-----------------|--------------------------------|
| Planning       | 2-5 minutes     | 5-15 minutes    | Subagent does deeper research  |
| Implementation | 5-15 minutes    | 10-30 minutes   | Depends on complexity          |
| PR Submission  | 1-2 minutes     | N/A             | Only standard script           |
| Full Workflow  | 10-25 minutes   | 20-60 minutes   | Planning + Implementation + PR |

Resource Usage #

Coder Instance:

  • CPU: 2-4 cores during Claude execution
  • Memory: 4-8 GB per worktree
  • Disk: ~500 MB per worktree
  • Network: Minimal (GitHub/API calls only)

n8n:

  • Webhook response: <100ms
  • SSH overhead: ~1-2 seconds
  • Workflow execution: <1 minute (excluding script time)

Monitoring & Debugging #

Log Locations #

Coder Instance:

/home/coder/
├── .claude/logs/              # Claude CLI logs
└── trees/*/logs/              # Script execution logs (if enabled)

n8n:

  • Workflow execution logs in n8n UI
  • Webhook payload history
  • Error traces with stack traces

Linear:

  • Issue comment history shows Tofie responses
  • Status change history
  • Activity timeline

Debug Mode #

Enable debug logging:

DEBUG=1 ./planning.sh < input.json

Output includes:

  • Parsed input values
  • File paths and existence checks
  • Claude CLI prompts and responses
  • Git command outputs
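
A DEBUG switch like this is typically implemented with a gated logging function. The sketch below is an assumption about how the scripts might do it, not their actual code. It writes to stderr so that the JSON on stdout stays parseable, and it prefixes a UTC timestamp.

```shell
#!/bin/bash
# Emit a timestamped debug line on stderr, but only when DEBUG=1.
log_debug() {
    if [[ "${DEBUG:-0}" == "1" ]]; then
        printf '%s [DEBUG] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >&2
    fi
}
```

A script would then call, for instance, `log_debug "branch=$BRANCH_NAME"` after parsing its input.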

Integration Points #

n8n ↔ Coder #

  • Protocol: SSH
  • Format: JSON stdin/stdout
  • Error Handling: Exit codes + JSON error field

Coder ↔ Claude #

  • Protocol: CLI invocation
  • Format: Text prompts + file operations
  • Permission Mode: Varies by script

Coder ↔ GitHub #

  • Protocol: gh CLI (GitHub REST API)
  • Authentication: Token from Doppler
  • Operations: PR create, PR list, PR view

n8n ↔ Linear #

  • Protocol: REST API + Webhooks
  • Authentication: API key + webhook signature
  • Operations: Update status, post comments, read issues

Best Practices #

For Script Developers #

  1. Always use JSON I/O: Structured data for reliable parsing
  2. Log to stderr, output to stdout: Keep streams separate
  3. Include timestamps: Helps debug timing issues
  4. Handle partial failures: Don’t fail entire workflow for minor issues
  5. Make scripts idempotent: Safe to run multiple times

For System Administrators #

  1. Monitor disk space: Worktrees accumulate over time
  2. Rotate logs: Claude logs can grow large
  3. Update tokens: GitHub tokens expire periodically
  4. Test webhooks: Use n8n’s test webhook feature
  5. Keep n8n workflows versioned: Export and commit to git
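
For the disk-space item, a simple check that could run from cron is sketched here. The function names and the 80% default threshold are assumptions; the parsing relies on the POSIX output format of `df -P`.

```shell
#!/bin/bash
# Print the used-space percentage (an integer) of the filesystem holding a path.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 if usage is below the threshold, 1 if at or above it,
# and 2 if the usage could not be determined.
check_disk() {
    local path="$1" threshold="${2:-80}" used
    used="$(disk_usage_percent "$path")"
    if ! [[ "$used" =~ ^[0-9]+$ ]]; then
        echo "[ERROR] Could not read disk usage for $path" >&2
        return 2
    fi
    if (( used >= threshold )); then
        echo "[WARN] $path is at ${used}% (threshold ${threshold}%)" >&2
        return 1
    fi
}
```

Running `check_disk /home/coder/trees 80` before creating a new worktree would surface the problem early, rather than as a failed `git worktree add`.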

For Users #

  1. Be specific in Linear: Better input → better output
  2. Review plans before implementing: Catch issues early
  3. Test draft PRs: Tofie creates drafts for your review
  4. Provide feedback: Help improve prompts and outputs
  5. Report issues: Tag DevOps in Linear if something fails

Future Architecture Directions #

Short-term (Q1 2026) #

  • Worktree auto-cleanup after PR merge
  • Better error notifications (Slack integration)
  • Execution time dashboards
  • Cost tracking per operation

Medium-term (Q2-Q3 2026) #

  • Multi-instance Coder support
  • Parallel implementation exploration
  • A/B testing different approaches
  • Caching and optimization

Long-term (Q4 2026+) #

  • Self-healing infrastructure
  • Predictive planning (analyze similar issues)
  • Auto-tuning prompts based on success rates
  • Integration with deployment pipeline