Clawdbot Architecture

Deep dive into Clawdbot (Moltbot) internals: from lane queues to semantic vision, deconstructing modern AI Agent engineering practices.

CORE_ARCHITECTURE MODULE

The Lane-based Gateway

Clawdbot is a TypeScript CLI application, not a web app. It runs as a daemon on your machine and acts as the gateway for all external connections (Telegram, Slack, etc.).

The core architecture follows a "lane-based" design pattern:

Channel Adapter: Normalizes incoming messages from different channels into a common format.
Gateway Server: The coordination center that dispatches tasks and handles concurrent requests.
Lane Queue: Serial by default. Each session is assigned a dedicated lane, so tasks within a session run one at a time and cannot interleave, which avoids the race conditions that unconstrained async/await code invites. Only tasks explicitly marked low-risk run in parallel.
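
To make the adapter's role concrete, here is a minimal sketch of how two channels might be normalized into one message shape. The simplified payload fields and the names InboundMessage, fromTelegram, and fromSlack are illustrative assumptions, not Clawdbot's actual types.

```typescript
// Hypothetical normalized shape; not taken from Clawdbot's source.
interface InboundMessage {
  channel: "telegram" | "slack";
  sessionId: string; // the gateway would use this to pick a lane
  senderId: string;
  text: string;
}

// Simplified payloads: only the fields this sketch needs.
interface TelegramUpdate {
  message: { chat: { id: number }; from: { id: number }; text: string };
}
interface SlackEvent {
  channel: string;
  user: string;
  text: string;
}

function fromTelegram(update: TelegramUpdate): InboundMessage {
  return {
    channel: "telegram",
    sessionId: `telegram:${update.message.chat.id}`,
    senderId: String(update.message.from.id),
    text: update.message.text,
  };
}

function fromSlack(event: SlackEvent): InboundMessage {
  return {
    channel: "slack",
    sessionId: `slack:${event.channel}`,
    senderId: event.user,
    text: event.text,
  };
}
```

Prefixing the session ID with the channel name keeps a Telegram chat and a Slack channel from ever sharing a lane, even if their raw IDs collide.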

Tags: TypeScript, CLI, Serial Queue

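// Gateway Coordinator
// Request, Task, and runParallel are assumed to be defined elsewhere.
class Gateway {
  // Map each session ID to its dedicated serial lane
  private lanes = new Map<string, SerialLane>();

  async handleRequest(req: Request) {
    // Reuse the session's lane; create and store it on first use
    let lane = this.lanes.get(req.sessionId);
    if (!lane) {
      lane = new SerialLane();
      this.lanes.set(req.sessionId, lane);
    }

    // High-risk/Stateful -> Serial Lane (Queue)
    if (req.isStateful) {
      await lane.add(new Task(req));
    }
    // Low-risk/Read-only -> Parallel execution
    else {
      await runParallel(new Task(req));
    }
  }
}

// Serial Lane: Prevents race conditions
class SerialLane {
  private queue: Task[] = [];
  private isRunning = false;

  // Enqueue a task; resolves once the task is queued, not once it has run
  async add(task: Task) {
    this.queue.push(task);
    if (!this.isRunning) void this.process();
  }

  private async process() {
    this.isRunning = true;
    try {
      while (this.queue.length > 0) {
        // Execute one by one; a failing task must not stall the lane
        try {
          await this.queue.shift()?.execute();
        } catch (err) {
          console.error("Task failed:", err);
        }
      }
    } finally {
      this.isRunning = false;
    }
  }
}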
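
One limitation of a queue-draining lane is that callers cannot await the result of a specific task. A common alternative, sketched here under the assumption that tasks are plain async functions, chains each task onto the previous one's promise. ChainedLane and demo are hypothetical names used only for illustration.

```typescript
// Hypothetical alternative: serialize tasks by chaining promises.
class ChainedLane {
  // Settles when the most recently added task has settled; never rejects.
  private tail: Promise<unknown> = Promise.resolve();

  // Runs `task` after all previously added tasks have settled and
  // resolves (or rejects) with that task's own outcome.
  add<T>(task: () => Promise<T>): Promise<T> {
    const run = this.tail.then(() => task());
    // Swallow the error on the chain only, so later tasks still run.
    this.tail = run.catch(() => undefined);
    return run;
  }
}

// Usage: the second task is added immediately but waits for the first.
async function demo(): Promise<string[]> {
  const lane = new ChainedLane();
  const log: string[] = [];
  const first = lane.add(async () => {
    await Promise.resolve(); // yield to the event loop
    await Promise.resolve();
    log.push("first");
    return 1;
  });
  const second = lane.add(async () => {
    log.push("second");
    return 2;
  });
  await Promise.all([first, second]);
  return log;
}
```

Because each caller receives the promise for its own task, errors propagate to the code that submitted the work instead of being handled inside the lane.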

MEMORY_SYSTEM MODULE

Hybrid Retrieval Architecture

Without memory, AI is just a goldfish. Clawdbot solves persistence via a dual-layer system:

Session Transcripts (JSONL): Records full session history for context building.
Memory Files (Markdown): Long-term memory stored in memory/*.md.
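
As a rough illustration of the transcript layer: a JSONL file stores one JSON object per line, so appending a turn never requires rewriting earlier history. The TranscriptEntry fields below are assumptions for illustration, since the actual schema is not specified here.

```typescript
// Hypothetical entry shape; the real transcript schema is not documented here.
interface TranscriptEntry {
  role: "user" | "assistant" | "tool";
  content: string;
  timestamp: number;
}

// Serialize one entry as a single JSONL line.
// JSON.stringify escapes embedded newlines, so each entry stays on one line.
function toJsonlLine(entry: TranscriptEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Parse a whole JSONL document, skipping blank lines.
function parseJsonl(text: string): TranscriptEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as TranscriptEntry);
}
```

Appending the output of toJsonlLine to the session file is then enough to persist a turn, and parseJsonl rebuilds the history when context is assembled.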

Hybrid Search

Retrieval combines vector search, stored in SQLite, with keyword search through SQLite's FTS5 full-text extension. A query for "authentication bug" therefore surfaces semantically related notes such as "auth issues" as well as exact phrase matches. The Agent manages memory directly with standard file-writing tools; there is no black-box memory API.
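
How the two result lists are merged is not stated here. One widely used option is reciprocal rank fusion (RRF), shown purely as an assumed stand-in: each document scores 1/(k + rank) in every list where it appears, and the scores are summed. The function name, the sample file names, and the constant k = 60 are illustrative choices.

```typescript
// Reciprocal rank fusion over ranked lists of document IDs.
// This is an assumed merging strategy, not Clawdbot's documented one.
function fuseRankings(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest combined score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Hypothetical result lists for the query "authentication bug".
const vectorHits = ["auth_issues.md", "login_flow.md", "tech_stack_notes.md"];
const keywordHits = ["login_flow.md", "session_bug.md"];
const merged = fuseRankings([vectorHits, keywordHits]);
```

A document that appears in both lists, here login_flow.md, outranks documents that appear in only one, which is the behavior a hybrid search is meant to reward.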

Tags: JSONL, Markdown, SQLite, FTS5

// Memory Structure
memory/
  ├── user_profile.md
  ├── project_claudia.md
  └── tech_stack_notes.md

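// Search Query (pseudocode: vec_search and text_match are placeholders for the vector and FTS5 lookups, not built-in SQLite functions)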
SELECT * FROM memory 
WHERE vec_search(embedding, query) 
OR text_match(content, query);

COMPUTER_USE MODULE

Sandboxed Execution

Clawdbot gives Agents real permission to operate the computer, within strict security limits:

Execution Environment: Runs in a Docker sandbox by default; can also be configured to operate directly on the host.
Allowlist: Commands the user has explicitly approved (e.g., npm, git).
Blocklist: Dangerous operations are blocked by default (e.g., rm -rf /, command substitution, redirection to system files).

Tags: Docker, Sandbox, Allowlist

// ~/.clawdbot/exec-approvals.json
{
  "allowlist": [
    { "pattern": "/usr/bin/npm", "lastUsed": 1706644800 },
    { "pattern": "/opt/homebrew/bin/git", "lastUsed": 1706644900 }
  ],
  "blocked": [
    "rm -rf /",
    "cat file > /etc/hosts"
  ]
}
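
A minimal sketch of how such a file might be consulted before a command runs. The matching rules used here (exact path match for allowlist entries, substring match for blocked entries, blocked entries take precedence) are assumptions for illustration; Clawdbot's actual matching logic is not described here.

```typescript
// Mirrors the JSON structure shown above.
interface ExecApprovals {
  allowlist: { pattern: string; lastUsed: number }[];
  blocked: string[];
}

type Decision = "allow" | "deny" | "ask";

// Assumed policy: blocked entries win, allowlisted executables pass,
// and anything else must be approved by the user.
function checkCommand(
  approvals: ExecApprovals,
  executable: string,
  commandLine: string
): Decision {
  if (approvals.blocked.some((entry) => commandLine.includes(entry))) {
    return "deny";
  }
  if (approvals.allowlist.some((entry) => entry.pattern === executable)) {
    return "allow";
  }
  return "ask";
}

const approvals: ExecApprovals = {
  allowlist: [
    { pattern: "/usr/bin/npm", lastUsed: 1706644800 },
    { pattern: "/opt/homebrew/bin/git", lastUsed: 1706644900 },
  ],
  blocked: ["rm -rf /", "cat file > /etc/hosts"],
};
```

Substring matching is deliberately conservative in this sketch: it also denies commands such as rm -rf /tmp/build, trading convenience for safety.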

SEMANTIC_VISION MODULE

Accessibility Tree over Pixels

Browser tools do not rely on screenshots. Instead, they use Semantic Snapshots.

A Semantic Snapshot is a text representation of the page's accessibility tree (ARIA roles and names).

IMAGE SNAPSHOT: ~5MB size, high token cost
SEMANTIC SNAPSHOT: <50KB size, structure only

Browsing the web is essentially a semantic understanding task, not just a visual task. The Agent sees a structured list of buttons, inputs, and headings.

Tags: Playwright, ARIA, Low-Token

- button "Sign In" [ref=1]
- textbox "Email" [ref=2]
- textbox "Password" [ref=3]
- link "Forgot password?" [ref=4]
- heading "Welcome back"
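
To show how a tool might consume this format, here is a small parser for lines of the form shown above. The regular expression and the SnapshotNode type are assumptions based only on this example, not on a documented grammar.

```typescript
interface SnapshotNode {
  role: string;
  name: string;
  ref?: number; // present only on lines that carry a [ref=N] marker
}

// Matches lines like: - button "Sign In" [ref=1]
const SNAPSHOT_LINE = /^-\s+(\w+)\s+"([^"]*)"(?:\s+\[ref=(\d+)\])?\s*$/;

function parseSnapshot(text: string): SnapshotNode[] {
  const nodes: SnapshotNode[] = [];
  for (const raw of text.split("\n")) {
    const match = SNAPSHOT_LINE.exec(raw.trim());
    if (!match) continue; // ignore lines that do not fit the assumed shape
    const node: SnapshotNode = { role: match[1], name: match[2] };
    if (match[3] !== undefined) node.ref = Number(match[3]);
    nodes.push(node);
  }
  return nodes;
}

// Find the ref an agent would target for a given role and accessible name.
function findRef(nodes: SnapshotNode[], role: string, name: string): number | undefined {
  return nodes.find((n) => n.role === role && n.name === name)?.ref;
}
```

With the snapshot parsed, an action such as "click Sign In" reduces to looking up a ref by role and name, with no pixel coordinates involved.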