Bubbaloop Agent Guide¶

This document teaches AI agents how to use bubbaloop via MCP (Model Context Protocol). Read this before calling any tools.

Connection¶

stdio (local agent)¶

bubbaloop mcp --stdio

Reads JSON-RPC from stdin, writes to stdout. Logs go to ~/.bubbaloop/mcp-stdio.log.

Authentication: None (process boundary provides implicit trust per MCP spec).

HTTP (remote agent)¶

The MCP server runs on http://127.0.0.1:8088/mcp when the daemon is active.

Authentication: Authorization: Bearer <token> (token at ~/.bubbaloop/mcp-token)

Rate limits: 100 request burst, ~1 req/sec sustained replenishment.

Agent model authentication resolves in order: API key (ANTHROPIC_API_KEY env var) → OAuth bearer token (from bubbaloop login). API key takes precedence when both are configured.

Creating and Managing Agents¶

Bubbaloop runs a multi-agent runtime inside the daemon. Each agent is an independent LLM reasoning loop with its own identity (Soul), memory, and capabilities. The CLI is a thin Zenoh pub/sub client — all LLM processing happens daemon-side.

How It Works¶

┌──────────────┐     Zenoh pub/sub      ┌─────────────────────────────────────┐
│  CLI client   │◄─────────────────────►│          Daemon (bubbaloop up)       │
│               │   inbox → shared      │                                     │
│ agent chat    │   outbox ← per-agent  │  ┌───────────┐  ┌───────────┐      │
│ agent list    │                       │  │ jean-clawd │  │cam-expert │ ...  │
└──────────────┘                        │  │ Soul+Mem   │  │ Soul+Mem  │      │
                                        │  └───────────┘  └───────────┘      │
                                        │       ▲ shared DaemonPlatform       │
                                        │       │ (MCP tools, node mgr)       │
                                        └───────────────────────────────────┘

CLI publishes a JSON message to the shared inbox Zenoh topic
Runtime (inside daemon) routes the message to the right agent
Agent processes the turn (LLM calls, tool use) and streams events to its outbox topic
CLI subscribes to all outbox topics, filters by correlation ID, and renders the response

Step 1: Configure Agents¶

Create ~/.bubbaloop/agents.toml:

# The default agent — receives all unaddressed messages
[agents.jean-clawd]
enabled = true
default = true

# A specialist agent — target with `bubbaloop agent chat -a camera-expert`
[agents.camera-expert]
enabled = true
capabilities = ["camera", "rtsp", "video"]

Fields:

Field	Type	Default	Description
`enabled`	bool	`true`	Whether the agent starts with the daemon
`default`	bool	`false`	Routes unaddressed messages here (exactly one should be `true`)
`capabilities`	string[]	`[]`	Keyword tags (future: capability-based routing)
`provider`	string	`"claude"`	LLM provider: `"claude"` or `"ollama"`
`model`	string	—	Model override (e.g., `"claude-haiku-4-5-20251001"`). Overrides `soul/capabilities.toml` model_name when set.

If no agents.toml exists, the runtime creates a single default agent named jean-clawd.

Interactive setup wizard:

bubbaloop agent setup              # Configure existing agent (interactive selection)
bubbaloop agent setup -a my-agent  # Create or configure a specific agent

The wizard lets you choose provider (Claude or Ollama), pick a model, and — for new agents — write an initial identity.md. No daemon required.

Step 2: Customize the Agent's Soul¶

Each agent has a per-agent directory at ~/.bubbaloop/agents/{agent_id}/. Create the soul files:

~/.bubbaloop/agents/camera-expert/soul/identity.md — system prompt:

You are CamBot, an AI agent specialized in video surveillance and RTSP cameras
through the Bubbaloop skill runtime.

Your focus: managing RTSP camera feeds, diagnosing video issues, configuring streams.

When given a task, DO it — use your tools, get results, report back.
Be concise. Report what you did and the result.

~/.bubbaloop/agents/camera-expert/soul/capabilities.toml — model and tuning:

model_name = "claude-sonnet-4-20250514"
max_turns = 15
allow_internet = true

# Heartbeat tuning (adaptive interval in seconds)
heartbeat_base_interval = 60
heartbeat_min_interval = 5
heartbeat_decay_factor = 0.7

# "auto" = execute immediately, "propose" = save for human approval
default_approval_mode = "auto"

# Retry / circuit breaker
max_retries = 3

# Memory retention: delete episodic logs older than N days (0 = keep forever)
episodic_log_retention_days = 30

# Context compaction: flush working state to episodic memory when input tokens
# exceed this threshold, enabling recovery after LLM context truncation
compaction_flush_threshold_tokens = 4000

# Temporal decay: half-life (days) for BM25 search relevance scoring.
# Older episodic entries are demoted. 0 = no decay.
episodic_decay_half_life_days = 30

If soul files don't exist, the agent falls back to the global soul at ~/.bubbaloop/soul/, then to compiled-in defaults.
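The fallback order can be pictured as a simple path lookup. This is a sketch only: the function name resolve_soul_file is hypothetical, and it assumes the fallback is evaluated per file, which the text above does not state explicitly. A return value of None stands for the compiled-in defaults.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_soul_file(home: Path, agent_id: str, filename: str) -> Optional[Path]:
    """Return the first existing soul file: per-agent first, then global."""
    candidates = [
        home / ".bubbaloop" / "agents" / agent_id / "soul" / filename,
        home / ".bubbaloop" / "soul" / filename,
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None  # caller falls back to compiled-in defaults


# Demo against a throwaway directory instead of the real home directory.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    global_soul = home / ".bubbaloop" / "soul"
    global_soul.mkdir(parents=True)
    (global_soul / "identity.md").write_text("You are Bubbaloop.")
    found = resolve_soul_file(home, "camera-expert", "identity.md")
    print(found.relative_to(home))
```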

Soul files are hot-reloaded — edit them while the daemon is running and changes take effect on the next turn.

Step 3: Start the Daemon¶

bubbaloop up

The daemon starts the agent runtime, which:

1. Reads ~/.bubbaloop/agents.toml (or uses default config)
2. For each enabled agent: creates ~/.bubbaloop/agents/{id}/ directory, loads Soul, initializes Claude provider, opens per-agent Memory (episodic NDJSON + semantic SQLite)
3. Subscribes to the shared Zenoh inbox
4. Registers per-agent manifest queryables
5. Spawns per-agent tokio tasks (event loops)

Look for these log lines to confirm:

[Runtime] Agent 'jean-clawd' ready (default=true)
[Runtime] Agent 'camera-expert' ready (default=false)
[Runtime] Agent runtime started: 2 agent(s), inbox=bubbaloop/global/{machine}/agent/inbox

Step 4: Interact via CLI¶

# Single message — plain stdout, good for scripting/piping
bubbaloop agent chat "what is the system status?"

# Interactive TUI REPL — two-panel layout, scrollable history
bubbaloop agent chat

# TUI REPL with tool debug info (see every tool call + result)
bubbaloop agent chat -v

# Target a specific agent
bubbaloop agent chat -a camera-expert "describe the video feed"

# List all running agents
bubbaloop agent list

TUI layout: The top panel shows scrollable conversation history. The bottom panel (always visible, green border) is the input line. Use ↑/↓ or PageUp/PageDown to scroll history while the agent is responding. Press Ctrl-C or type q on an empty input line to exit.

TUI colours: cyan = agent name, green = agent text, yellow = tool calls, gray = tool results, red = errors, bold white = your messages.

agent list queries all manifest Zenoh queryables and prints:

ID                   NAME                      DEFAULT    MODEL                          CAPABILITIES
-----------------------------------------------------------------------------------------------
jean-clawd           Bubbaloop                 yes        claude-sonnet-4-20250514
camera-expert        CamBot                               claude-sonnet-4-20250514       camera, rtsp, video

Per-Agent Directory Layout¶

~/.bubbaloop/agents/{agent_id}/
├── soul/
│   ├── identity.md          # System prompt (markdown)
│   └── capabilities.toml    # Model config, heartbeat tuning
├── memory/                  # Episodic logs (NDJSON, one file per day)
└── memory.db                # Semantic memory (SQLite: jobs, search index)

Each agent owns its memory exclusively — no sharing between agents.

Robustness & Error Recovery¶

The agent loop includes multiple layers of fault tolerance:

Timeouts:

- Turn timeout: 120 seconds per LLM turn (prevents stuck API calls)
- Tool-call timeout: 30 seconds per individual tool execution (prevents runaway tools)

Tool result truncation:

- Tool outputs exceeding 4096 characters are truncated with a [truncated] marker
- Prevents large outputs (e.g., verbose logs) from blowing up the LLM context window

Provider retry with exponential backoff:

- Retries on transient HTTP errors (429 rate limit, 5xx server errors)
- Exponential backoff: 1s, 2s, 4s (base * 2^attempt)
- Maximum 3 retries before propagating the error

Pre-compaction context recovery:

- When input tokens approach the context limit, the agent flushes working state to episodic memory
- On subsequent turns, the most recent flush is recovered and injected into the system prompt as "Previously Persisted Context"
- This ensures continuity across LLM context truncations

Job retry with circuit breaker:

- Failed jobs retry with exponential backoff: 10s * 2^retry_count
- Maximum retries configurable via max_retries in capabilities.toml (default: 3)
- After exhausting retries, jobs are dead-lettered (failed_requires_approval status)
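The two backoff schedules and the truncation rule can be worked through numerically. The sketch below only reproduces the arithmetic stated above; all function names are hypothetical, and two details are assumptions: that the [truncated] marker is appended after the first 4096 characters, and that a job is dead-lettered once its retry count reaches max_retries.

```python
def provider_backoff_delays(base_secs: float = 1.0, max_retries: int = 3) -> list[float]:
    """Delay before each provider retry: base * 2^attempt."""
    return [base_secs * 2 ** attempt for attempt in range(max_retries)]


def job_retry_delay(retry_count: int) -> int:
    """Delay in seconds before a failed job's next attempt: 10s * 2^retry_count."""
    return 10 * 2 ** retry_count


def next_job_status(retry_count: int, max_retries: int = 3) -> str:
    """Status of a job that just failed after `retry_count` retries (assumed boundary)."""
    return "failed" if retry_count < max_retries else "failed_requires_approval"


def truncate_tool_output(text: str, limit: int = 4096) -> str:
    """Cut oversized tool output and append the marker (marker placement is assumed)."""
    if len(text) <= limit:
        return text
    return text[:limit] + "[truncated]"


print(provider_backoff_delays())
print([job_retry_delay(n) for n in range(3)])
```

With the defaults, the provider waits 1s, 2s, and 4s, while a failing job waits 10s, 20s, and 40s before being dead-lettered.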

Zenoh Gateway Topics¶

bubbaloop/global/{machine_id}/agent/inbox                ← shared intake (all messages)
bubbaloop/global/{machine_id}/agent/{agent_id}/outbox    ← per-agent streamed response
bubbaloop/global/{machine_id}/agent/{agent_id}/manifest  ← agent capabilities (queryable)

Wire format (JSON):

Inbox — AgentMessage (CLI → Daemon):

{"id": "uuid", "text": "user message", "agent": "camera-expert", "auth_token": "bb_abc123"}

Field	Type	Required	Description
`id`	string	Yes	Unique message ID
`text`	string	Yes	User message text
`agent`	string	No	Target agent ID (omit for default agent)
`auth_token`	string	No	Bearer token from `~/.bubbaloop/mcp-token`

Outbox — AgentEvent (Daemon → CLI):

{"id": "uuid", "type": "delta", "text": "token..."}
{"id": "uuid", "type": "tool", "text": "get_node_health", "input": "{\"node_name\":\"cam1\"}"}
{"id": "uuid", "type": "tool_result", "text": "..."}
{"id": "uuid", "type": "error", "text": "API error: 429"}
{"id": "uuid", "type": "system", "text": "context: 3 world state entries, 2 memory episodes"}
{"id": "uuid", "type": "done"}

Field	Type	Required	Description
`id`	string	Yes	Matches the inbox message ID
`type`	string	Yes	Event type (see below)
`text`	string	Yes	Event payload
`input`	string	No	Tool input JSON (only on `tool` events)

Event types: delta (LLM token), tool (tool call), tool_result (tool output), error (API/runtime error), system (lifecycle context — world state, memory, turn counter), done (turn complete).
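A client needs to do two things with this wire format: build an inbox message and fold the matching outbox events back into a reply. The sketch below shows both without any Zenoh dependency; the function names are hypothetical, and it reads text with a default because the done event in the examples above carries no text field.

```python
import json
import uuid
from typing import Optional


def make_inbox_message(text: str, agent: Optional[str] = None,
                       auth_token: Optional[str] = None) -> str:
    """Serialize an AgentMessage; optional fields are omitted when unset."""
    message = {"id": str(uuid.uuid4()), "text": text}
    if agent is not None:
        message["agent"] = agent
    if auth_token is not None:
        message["auth_token"] = auth_token
    return json.dumps(message)


def collect_response(correlation_id: str, raw_events: list[str]):
    """Fold one turn's AgentEvents into (reply_text, tool_calls, errors)."""
    reply, tools, errors = [], [], []
    for raw in raw_events:
        event = json.loads(raw)
        if event.get("id") != correlation_id:
            continue  # event belongs to another message
        kind = event.get("type")
        if kind == "delta":
            reply.append(event.get("text", ""))
        elif kind == "tool":
            tools.append(event.get("text", ""))
        elif kind == "error":
            errors.append(event.get("text", ""))
        elif kind == "done":
            break  # turn complete
    return "".join(reply), tools, errors


events = [
    '{"id": "a", "type": "tool", "text": "list_nodes"}',
    '{"id": "b", "type": "delta", "text": "ignored"}',
    '{"id": "a", "type": "delta", "text": "Hel"}',
    '{"id": "a", "type": "delta", "text": "lo"}',
    '{"id": "a", "type": "done"}',
]
print(collect_response("a", events))
```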

Message Routing¶

If the inbox message has an explicit agent field → route to that agent
Otherwise → route to the default agent (default = true in config)
If the target agent's inbox is full (capacity 32) → message is dropped with a warning
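These three rules amount to a small dispatch function over bounded queues. The following is an illustrative model, not the runtime's code: the Router class is hypothetical, and dropping messages addressed to an unknown agent is an assumption the rules above do not cover.

```python
from collections import deque
from typing import Optional

INBOX_CAPACITY = 32  # per-agent inbox capacity stated above


class Router:
    def __init__(self, agent_ids: list[str], default_agent: str):
        self.default_agent = default_agent
        self.inboxes = {agent_id: deque() for agent_id in agent_ids}

    def route(self, message: dict) -> Optional[str]:
        """Queue the message and return the target agent ID, or None if dropped."""
        target = message.get("agent") or self.default_agent
        inbox = self.inboxes.get(target)
        if inbox is None:
            return None  # unknown agent: dropped (assumed behaviour)
        if len(inbox) >= INBOX_CAPACITY:
            return None  # inbox full: dropped with a warning in the real runtime
        inbox.append(message)
        return target


router = Router(["jean-clawd", "camera-expert"], default_agent="jean-clawd")
print(router.route({"id": "1", "text": "status?"}))
print(router.route({"id": "2", "text": "describe feed", "agent": "camera-expert"}))
```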

Building a Channel Adapter¶

Because the gateway is a Zenoh convention (not hardcoded to the CLI), you can build adapters for any channel by publishing to the inbox topic and subscribing to outbox topics. The wire format is the same regardless of source.

Quick Start Workflow¶

list_nodes → see all nodes with status
get_node_health → get details for a specific node
get_node_schema → understand its data format
get_stream_info → get Zenoh topic for live data
send_command → trigger actions on the node
schedule_task → automate recurring agent jobs

Architecture: Dual-Plane Model¶

MCP = control plane (tool calls, JSON responses, rate-limited to a 100-request burst and ~1 req/s sustained)
Zenoh = data plane (sensor streams, protobuf, 1000s msg/s)

Never route streaming data through MCP. Use get_stream_info to get Zenoh connection params for direct subscription.

Why Two Planes?¶

MCP is request/response: great for "give me status" or "start this node"
Zenoh is pub/sub: great for "stream all temperature readings"
Mixing them creates bottlenecks and violates MCP transport limits

Tool Reference¶

Discovery Tools¶

`list_nodes`¶

Tier: Viewer (read-only)

List all registered nodes with their status, capabilities, and topics.

Parameters: None

Returns: JSON array of node summaries:

[
  {
    "name": "rtsp-camera",
    "status": "Running",
    "health": "Healthy",
    "installed": true,
    "is_built": true,
    "node_type": "sensor"
  }
]

Example workflow:

Agent: list_nodes
→ Get overview of all nodes
→ Pick interesting ones for detailed inspection

`get_node_health`¶

Tier: Viewer

Get detailed health status of a specific node including uptime and resource usage.

Parameters:

- node_name (string, required): Name of the node (e.g., "rtsp-camera", "openmeteo")

Returns: JSON object with:

- name: Node name
- status: Current status ("Running", "Stopped", "Failed", "Building")
- health: Health state ("Healthy", "Degraded", "Unhealthy", "Unknown")
- installed: Whether node source is installed
- is_built: Whether binary is built
- uptime_seconds: How long the node has been running (if running)
- last_heartbeat: Timestamp of last heartbeat

Example:

{
  "name": "rtsp-camera",
  "status": "Running",
  "health": "Healthy",
  "uptime_seconds": 3627,
  "last_heartbeat": "2026-02-26T15:23:41Z"
}

`discover_nodes`¶

Tier: Viewer

Discover all nodes across all machines by querying manifests on bubbaloop/**/manifest. Returns self-describing nodes with their capabilities.

Parameters: None

Returns: Multi-line text with one manifest per line, formatted as:

[bubbaloop/global/machine_id/node_name/manifest] {"name":"...","version":"...","capabilities":[...]}

Use case: Fleet-wide discovery in multi-machine deployments.
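Because the result is text rather than a JSON array, a caller has to split each line into its key and its JSON payload. The sketch below assumes every line follows the bracketed-key format shown above exactly; parse_keyed_lines is a hypothetical helper name.

```python
import json


def parse_keyed_lines(output: str) -> dict[str, dict]:
    """Map each Zenoh key to its decoded JSON payload, skipping malformed lines."""
    results = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue
        # Split at the first "] ", which closes the key; the payload follows.
        key, sep, payload = line[1:].partition("] ")
        if not sep:
            continue
        results[key] = json.loads(payload)
    return results


sample = (
    '[bubbaloop/global/nvidia_orin00/rtsp-camera/manifest] '
    '{"name":"rtsp-camera","version":"0.1.0","capabilities":["video_capture"]}'
)
for key, manifest in parse_keyed_lines(sample).items():
    print(key, manifest["capabilities"])
```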

`get_node_manifest`¶

Tier: Viewer

Get the manifest (self-description) of a node including its capabilities, published topics, commands, and hardware requirements.

Parameters:

- node_name (string, required): Name of the node

Returns: JSON manifest:

{
  "name": "rtsp-camera",
  "version": "0.1.0",
  "description": "RTSP video stream capture",
  "capabilities": ["video_capture", "motion_detection"],
  "publishes": [
    {"topic": "frame", "schema": "VideoFrame", "rate_hz": 30}
  ],
  "commands": [
    {"name": "capture_frame", "params": {"resolution": "string"}}
  ],
  "hardware": {"arch": "aarch64", "min_memory_mb": 512}
}

`list_commands`¶

Tier: Viewer

List available commands for a specific node with their parameters and descriptions. Use this before send_command to discover what actions a node supports.

Parameters:

- node_name (string, required): Name of the node

Returns: JSON array of command definitions:

[
  {
    "name": "capture_frame",
    "description": "Capture a single frame",
    "parameters": [
      {"name": "resolution", "type": "string", "default": "1080p"}
    ]
  },
  {
    "name": "set_exposure",
    "parameters": [
      {"name": "value", "type": "number", "required": true}
    ]
  }
]

`get_node_schema`¶

Tier: Viewer

Get the protobuf schema of a node's data messages. Returns the schema in human-readable format (proto3 syntax) if available.

Parameters:

- node_name (string, required): Name of the node

Returns: Protobuf schema definition or error message if not available.

Example output:

syntax = "proto3";

message VideoFrame {
  uint64 timestamp_ns = 1;
  bytes image_data = 2;
  uint32 width = 3;
  uint32 height = 4;
  string encoding = 5;
}

`get_stream_info`¶

Tier: Viewer

Get Zenoh connection parameters for subscribing to a node's data stream. Returns topic pattern, encoding, and endpoint. Use this to set up streaming data access outside MCP.

Parameters:

- node_name (string, required): Name of the node

Returns: JSON with Zenoh connection info:

{
  "zenoh_topic": "bubbaloop/global/nvidia_orin00/rtsp-camera/**",
  "encoding": "protobuf",
  "endpoint": "tcp/localhost:7447",
  "note": "Subscribe to this topic via Zenoh client library for real-time data. MCP is control-plane only."
}

Usage: Pass this info to a Zenoh client library (Python: zenoh-python, Rust: zenoh) to subscribe directly to the data stream.

Lifecycle Tools¶

`start_node`¶

Tier: Operator

Start a stopped node via the daemon. The node must be installed and built.

Parameters:

- node_name (string, required): Name of the node

Returns: Success or error message.

Example: start_node(node_name="rtsp-camera")

`stop_node`¶

Tier: Operator

Stop a running node via the daemon.

Parameters:

- node_name (string, required): Name of the node

Returns: Success or error message.

`restart_node`¶

Tier: Operator

Restart a node (stop then start).

Parameters:

- node_name (string, required): Name of the node

Returns: Success or error message.

`build_node`¶

Tier: Admin

Trigger a build for a node. Builds the node's source code using its configured build command (Cargo, pixi, npm, etc.).

Parameters:

- node_name (string, required): Name of the node

Returns: Success or error message.

Note: Builds can take several minutes. Check logs with get_node_logs to monitor progress.

Data & Command Tools¶

`send_command`¶

Tier: Operator

Send a command to a node's command queryable. The node must support the command — call list_commands first to see available commands.

Parameters:

- node_name (string, required): Name of the node
- command (string, required): Command name (must be listed in the node's manifest)
- params (object, optional): JSON parameters for the command (default: {})

Returns: Command result or error message.

Example:

{
  "node_name": "rtsp-camera",
  "command": "capture_frame",
  "params": {"resolution": "1080p"}
}

Response:

{
  "status": "ok",
  "frame_path": "/tmp/frame_12345.jpg"
}
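Over MCP, the example arguments travel inside a JSON-RPC 2.0 request with the method tools/call, as defined by the MCP specification. The sketch below only builds that request body. The helper name tool_call_request is hypothetical, and the sketch leaves out the MCP initialization handshake and the HTTP headers a real client also needs.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC request IDs must be unique per session


def tool_call_request(tool: str, arguments: Optional[dict] = None) -> dict:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }


request = tool_call_request(
    "send_command",
    {
        "node_name": "rtsp-camera",
        "command": "capture_frame",
        "params": {"resolution": "1080p"},
    },
)
print(json.dumps(request, indent=2))
```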

`get_node_config`¶

Tier: Operator

Get the current configuration of a node by querying its Zenoh config queryable.

Parameters:

- node_name (string, required): Name of the node

Returns: JSON configuration object (node-specific schema).

Example response:

{
  "rtsp_url": "rtsp://192.168.1.100:554/stream",
  "framerate": 30,
  "resolution": "1920x1080"
}

`get_node_logs`¶

Tier: Operator

Get the latest logs from a node's systemd service.

Parameters:

- node_name (string, required): Name of the node

Returns: Plain text log output (last 50 lines by default).

Use case: Debug node failures or monitor startup progress.

Scheduling Tools¶

`schedule_task`¶

Tier: Operator

Schedule a task for the agent to execute later. Supports one-off and recurring tasks via cron expressions. Uses tokio::sync::Notify for immediate pickup.

Parameters:

- prompt (string, required): The instruction for the agent to execute
- cron_schedule (string, optional): Cron expression for recurring tasks (5 or 6 field)
- recurrence (boolean, optional): Whether this is a recurring task (default: false)

Returns: Job ID on success.

Example (one-off):

{
  "prompt": "Check all node health and report"
}

Example (recurring):

{
  "prompt": "Run health patrol on all sensors",
  "cron_schedule": "*/15 * * * *",
  "recurrence": true
}

`list_jobs`¶

Tier: Viewer

List scheduled jobs. Optionally filter by status.

Parameters:

- status (string, optional): Filter by status (pending, running, completed, failed, failed_requires_approval)

`delete_job`¶

Tier: Operator

Delete a scheduled job by ID.

Parameters:

- job_id (string, required): The job ID to delete

Memory Tools¶

Memory tools operate on the agent's per-agent SQLite database at ~/.bubbaloop/agents/{agent_id}/memory.db. Each agent has isolated memory — no shared global state.

`memory_search`¶

Tier: Viewer

Search episodic memory for past conversations, tool results, and agent observations. Uses BM25 full-text search with temporal decay.

Parameters:

- query (string, required): Search query (keywords or phrases)
- limit (integer, optional): Maximum results to return (default: 10)
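The effect of temporal decay on ranking can be illustrated with the usual half-life formula. This is an assumption about the shape of the decay, not the actual scoring code: the function name decayed_score is hypothetical, and the formula is the standard exponential half-life applied to the episodic_decay_half_life_days setting from capabilities.toml, where 0 disables decay.

```python
def decayed_score(bm25_score: float, age_days: float,
                  half_life_days: float = 30.0) -> float:
    """Halve the relevance score for every `half_life_days` of entry age."""
    if half_life_days <= 0:
        return bm25_score  # 0 = no decay
    return bm25_score * 0.5 ** (age_days / half_life_days)


# A strong but old match can rank below a weaker recent one.
old_match = decayed_score(8.0, age_days=60)
recent_match = decayed_score(3.0, age_days=0)
print(old_match, recent_match)
```

Under this model and the default 30-day half-life, an entry that is 60 days old keeps a quarter of its raw BM25 score.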

`memory_forget`¶

Tier: Admin

Remove matching entries from episodic memory search index. Use for PII removal, correcting false memories, or user-requested deletion. Creates an audit trail.

Parameters:

- query (string, required): Search query to match entries to forget
- reason (string, required): Reason for forgetting (logged in audit trail)

`create_proposal`¶

Tier: Operator

Create a proposal for human approval before executing a risky action. Use for destructive operations like removing nodes or changing configs.

Parameters:

- skill (string, required): The tool or action category (e.g., "restart_node", "remove_node")
- description (string, required): Human-readable description of what will happen
- actions (string, required): JSON array of tool calls to execute if approved

`list_proposals`¶

Tier: Viewer

List proposals for human-in-the-loop approval.

Parameters:

- status (string, optional): Filter by status (pending, approved, rejected, expired)

System Tools¶

`get_system_status`¶

Tier: Viewer

Get overall system status including daemon health, node count, and Zenoh connection state.

Parameters: None

Returns: JSON status summary:

{
  "machine_id": "nvidia_orin00",
  "nodes_total": 12,
  "nodes_running": 10,
  "nodes_healthy": 9,
  "mcp_server": "running",
  "agent_available": true
}

Use case: Health check before performing operations.

`get_machine_info`¶

Tier: Viewer

Get machine hardware and OS information: architecture, hostname, OS version.

Parameters: None

Returns: JSON machine info:

{
  "machine_id": "nvidia_orin00",
  "arch": "aarch64",
  "os": "linux",
  "hostname": "jetson-orin"
}

`query_zenoh`¶

Tier: Admin

Query a Zenoh key expression (admin only). Key must start with bubbaloop/. Returns up to 100 results.

Parameters:

- key_expr (string, required): Full Zenoh key expression to query (e.g., "bubbaloop/global/nvidia_orin00/openmeteo/status")

Returns: Multi-line text with one result per line:

[bubbaloop/global/nvidia_orin00/openmeteo/status] {"temperature":22.5,"pressure":1013}

Use case: Low-level debugging, custom queries not covered by other tools.

Security note: Admin-only to prevent unauthorized data access.

`read_file`¶

Tier: Operator

Read the contents of a file. Returns up to 500 lines. Use for config files, logs, scripts, or any text file on the system.

Parameters:

- path (string, required): Absolute or relative file path (supports ~/ expansion)

Returns: File contents as plain text. Long files are truncated at 500 lines.

Security: Sensitive files are blocked: SSH keys (id_rsa, id_ed25519), .pem/.key/.p12 files, .env (not .env.example), /etc/shadow, /etc/sudoers, master.key.

Example: read_file(path="/etc/hostname")

`write_file`¶

Tier: Operator

Write content to a file inside ~/.bubbaloop/workspace/. Creates parent directories if needed. Writes outside the workspace are blocked.

Parameters:

- path (string, required): File path (relative paths resolve inside workspace)
- content (string, required): File content to write

Returns: Confirmation with byte count and path.

Security: All writes are scoped to ~/.bubbaloop/workspace/. Symlink escape prevention via path canonicalization. Directory auto-created on first use.

Example:

{
  "path": "scripts/monitor.py",
  "content": "#!/usr/bin/env python3\nprint('Hello')"
}

`run_command`¶

Tier: Operator

Run a shell command and return its output. Captures both stdout and stderr. Use for diagnostics, system inspection, or any task requiring shell access.

Parameters:

- command (string, required): Shell command to execute (passed to /bin/sh -c)
- timeout_secs (integer, optional): Timeout in seconds (default: 30, max: 300)

Returns: Command output (stdout + stderr). Long output truncated at 50KB.

Security: 10-category blocklist enforced (see Security Model section below). Safe operations like ls, df, cat, pixi, cargo are allowed.

Example: run_command(command="df -h")

RBAC Tiers¶

Bubbaloop uses three authorization tiers. Each tool requires a minimum tier to execute.

Tier	Access Level	MCP Tools
Viewer (18)	Read-only monitoring	`list_nodes`, `get_node_health`, `get_node_schema`, `get_stream_info`, `get_system_status`, `get_machine_info`, `discover_nodes`, `get_node_manifest`, `list_commands`, `discover_capabilities`, `list_proposals`, `list_jobs`, `get_system_telemetry`, `get_telemetry_history`, `list_missions`, `list_constraints`, `get_belief`, `list_world_state`
Operator (15)	Day-to-day operations	`start_node`, `stop_node`, `restart_node`, `get_node_config`, `send_command`, `get_node_logs`, `enable_autostart`, `disable_autostart`, `approve_proposal`, `reject_proposal`, `delete_job`, `pause_mission`, `resume_mission`, `cancel_mission`, `update_belief`
Admin (12)	System modification	`install_node`, `remove_node`, `build_node`, `query_zenoh`, `uninstall_node`, `clean_node`, `clear_episodic_memory`, `update_telemetry_config`, `configure_context`, `register_alert`, `unregister_alert`, `register_constraint`

The tier counts (18+15+12=45) include 3 telemetry tools that are mapped in both the MCP server and the agent dispatch. The MCP server exposes 42 unique tools.

Default tier: In single-user localhost mode, all requests are granted Admin tier.

Token format: ~/.bubbaloop/mcp-token contains <token>:<tier> (e.g., bb_abc123:operator)

Permission model: Higher tiers inherit lower tier permissions (Admin can do everything, Operator can do Viewer tasks).
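The token format and the inheritance rule combine into a simple rank comparison. The sketch below is illustrative only and is not the code in mcp/rbac.rs; the names TIER_RANK, parse_token_line, and is_allowed are hypothetical, and it assumes tier names are written in lowercase as in the example token.

```python
TIER_RANK = {"viewer": 0, "operator": 1, "admin": 2}


def parse_token_line(line: str) -> tuple[str, str]:
    """Split a `<token>:<tier>` line into (token, tier)."""
    token, sep, tier = line.strip().rpartition(":")
    tier = tier.lower()
    if not sep or not token or tier not in TIER_RANK:
        raise ValueError("expected <token>:<tier>")
    return token, tier


def is_allowed(caller_tier: str, required_tier: str) -> bool:
    """Higher tiers inherit the permissions of lower tiers."""
    return TIER_RANK[caller_tier.lower()] >= TIER_RANK[required_tier.lower()]


token, tier = parse_token_line("bb_abc123:operator")
print(tier, is_allowed(tier, "viewer"), is_allowed(tier, "admin"))
```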

RBAC enforcement is in mcp/rbac.rs and mcp/mod.rs — all MCP tool calls pass through tier validation. Path and command validation for agent-internal tools (read_file, write_file, run_command) is in dispatch_security.rs.

Security Model¶

The agent's system tools (read_file, write_file, run_command) enforce defence-in-depth security to prevent damage to existing platforms.

Read Access¶

Can read any file on the filesystem
Blocked: SSH keys (id_rsa, id_ed25519, id_ecdsa, id_dsa), .pem/.key/.p12/.pfx/.jks files, .env (not .env.example/.env.template/.env.sample), /etc/shadow, /etc/sudoers, master.key

Write Access¶

Scoped to ~/.bubbaloop/workspace/ — all writes outside are rejected
Symlink escape prevention via path canonicalization
Directory auto-created on first use
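Path canonicalization means resolving the requested path fully before comparing it to the workspace root, so that neither ".." segments nor symlinks can point a write elsewhere. The sketch below illustrates the idea in Python (3.9+ for Path.is_relative_to); it is not the code in dispatch_security.rs, and resolve_workspace_path is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def resolve_workspace_path(workspace: Path, requested: str) -> Path:
    """Return the canonical target path, or raise if it escapes the workspace."""
    workspace = workspace.resolve()
    candidate = Path(requested).expanduser()
    if not candidate.is_absolute():
        candidate = workspace / candidate  # relative paths resolve inside workspace
    resolved = candidate.resolve()  # collapses ".." and follows symlinks
    if not resolved.is_relative_to(workspace):
        raise PermissionError(f"write outside workspace blocked: {requested}")
    return resolved


# Demo in a throwaway directory: a symlink inside the workspace points outside it.
with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp) / "workspace"
    outside = Path(tmp) / "outside"
    ws.mkdir()
    outside.mkdir()
    os.symlink(outside, ws / "link")
    print(resolve_workspace_path(ws, "scripts/monitor.py").name)
    try:
        resolve_workspace_path(ws, "link/evil.sh")
    except PermissionError as err:
        print("blocked:", err)
```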

Shell Commands¶

10-category blocklist:

Privilege escalation — sudo, su (requires manual execution)
Destructive filesystem — rm -rf /, mkfs, dd if=, fork bombs
System control — shutdown, reboot, killall, pkill, iptables, mount (kill <numeric_pids> is allowed for agent cleanup; kill 0/1/-1 blocked)
Non-bubbaloop service management — systemctl stop/disable/mask <non-bubbaloop> blocked; bubbaloop services allowed
System package managers — apt, apt-get, dpkg, yum, dnf, pacman, snap, flatpak (use pixi/pip for project deps)
Network mutation — ifconfig up/down, ip link set, ip route, ip addr
Remote code execution — curl | sh, wget | bash (plain curl/wget for data is fine)
Container destruction — docker rm/stop/kill, podman rm/stop/kill (docker ps/logs/inspect allowed)
Git destructive ops — push --force, reset --hard, clean -f (normal git operations allowed)
rm scoped — only files in ~/.bubbaloop/workspace/ and /tmp can be removed

Nodes¶

A node is a self-describing sensor/actuator capability. Each node:

Has a manifest (JSON) describing its capabilities, published topics, commands, hardware requirements
Publishes data on Zenoh topics (protobuf-encoded for efficiency)
Accepts commands via its command queryable (JSON request/response)
Reports health via periodic heartbeats on bubbaloop/global/{machine_id}/{node_name}/health
Serves its schema for runtime introspection on bubbaloop/global/{machine_id}/{node_name}/schema

Node Lifecycle States¶

Installed: Source code cloned to ~/.bubbaloop/nodes/{node_name}/
Built: Binary compiled to ~/.bubbaloop/nodes/{node_name}/target/release/{node_name}
Running: systemd service active, publishing to Zenoh
Healthy: Receiving heartbeats within expected interval

Node Discovery Pattern¶

1. list_nodes                    # Get names and status
2. get_node_manifest             # Understand capabilities
3. list_commands                 # See available actions
4. get_node_schema               # Decode data format
5. get_stream_info               # Get Zenoh topic for streaming

Task Scheduling¶

The scheduling system lets agents execute tasks autonomously. Use schedule_task for one-off or recurring jobs.

Scheduling Pattern¶

schedule_task(prompt="Check all node health", cron_schedule="*/15 * * * *", recurrence=true)
→ Creates a recurring job that runs every 15 minutes
→ Agent processes the prompt autonomously on each trigger

Job Lifecycle¶

pending — waiting for next scheduled run
running — agent is processing the job
completed — job finished successfully
failed — job failed, will retry with exponential backoff
failed_requires_approval — exhausted retries, needs human intervention

Best Practices¶

Use list_jobs to monitor job status
Use delete_job to cancel recurring jobs
Jobs survive daemon restarts (persisted in SQLite)
Failed jobs retry automatically — check list_jobs(status="failed") for stuck jobs

Error Handling¶

Tool Error Format¶

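Tool errors are returned as successful responses whose content carries the error text (MCP pattern). The exception is a permission failure, which is returned as ErrorData (see Common Error Patterns). A typical error response: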

{
  "content": [
    {"type": "text", "text": "Error: Node not found: nonexistent-node"}
  ]
}

Common Error Patterns¶

Validation error: "Validation error: Node name must be 1-64 characters, alphanumeric + -_"
Node not found: "Error: Node not found: <name>"
Permission denied: ErrorData with INVALID_REQUEST code
Zenoh timeout: "Error: No response from node (is it running?)"
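Since most failures arrive inside an otherwise successful response, a client has to inspect the text content to detect them. A minimal sketch, assuming error texts always start with one of the prefixes listed above; the helper name tool_error is hypothetical.

```python
from typing import Optional

ERROR_PREFIXES = ("Error:", "Validation error:")


def tool_error(result: dict) -> Optional[str]:
    """Return the error text carried by a tool result, or None if it succeeded."""
    for item in result.get("content", []):
        if item.get("type") != "text":
            continue
        text = item.get("text", "")
        if text.startswith(ERROR_PREFIXES):
            return text
    return None


failed = {"content": [{"type": "text", "text": "Error: Node not found: nonexistent-node"}]}
succeeded = {"content": [{"type": "text", "text": "Started rtsp-camera"}]}
print(tool_error(failed))
print(tool_error(succeeded))
```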

Validation Rules¶

Node names: 1-64 characters, [a-zA-Z0-9_-] only
Key expressions: Must start with bubbaloop/
Commands: Must be listed in node's manifest
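Checking arguments client-side avoids spending a rate-limited request on a call that will be rejected. The sketch below encodes the first two rules as stated; the function names are hypothetical, and the command rule is left out because it needs the node's manifest.

```python
import re

NODE_NAME_RE = re.compile(r"[a-zA-Z0-9_-]{1,64}")


def is_valid_node_name(name: str) -> bool:
    """1-64 characters drawn from letters, digits, underscore, and hyphen."""
    return NODE_NAME_RE.fullmatch(name) is not None


def is_valid_key_expr(key_expr: str) -> bool:
    """Key expressions must live under the bubbaloop/ prefix."""
    return key_expr.startswith("bubbaloop/")


for candidate in ("rtsp-camera", "bad name", "a" * 65):
    print(repr(candidate[:12]), is_valid_node_name(candidate))
```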

Best Practices¶

Discovery Workflow¶

Always discover before acting:

1. list_nodes → Get overview
2. get_node_health → Check specific node status
3. list_commands → See available actions
4. send_command → Execute action

Streaming Data¶

Never poll with repeated tool calls (violates MCP rate limits)
Always use get_stream_info → subscribe to Zenoh topic directly
MCP is for control plane, Zenoh is for data plane

Automation¶

Use schedule_task with cron for recurring monitoring (runs in daemon, not via MCP calls)
Scheduled jobs are more efficient than polling loops
Check list_jobs to monitor scheduled job status

System Health¶

Check get_system_status before bulk operations
Use get_node_logs to diagnose failures
Use list_jobs(status="failed") to find stuck jobs

Performance¶

Batch independent operations where possible (MCP rate limit: 100 burst, 1/sec sustained)
Use Zenoh direct subscription for high-frequency data (1000s msg/sec supported)
Keep scheduled job prompts focused and specific
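A client can stay under the limit by pacing itself with a token bucket that mirrors the stated numbers (100-request burst, 1 request per second refill). This is a client-side sketch under the assumption that the server's limiter behaves like a token bucket; the TokenBucket class is hypothetical and not part of bubbaloop.

```python
import time


class TokenBucket:
    def __init__(self, capacity: int = 100, refill_per_sec: float = 1.0,
                 clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = float(capacity)  # start with a full burst allowance
        self.updated = clock()

    def _refill(self) -> None:
        now = self.clock()
        gained = (now - self.updated) * self.refill_per_sec
        self.tokens = min(self.capacity, self.tokens + gained)
        self.updated = now

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the call should wait."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next token becomes available."""
        self._refill()
        return max(0.0, (1.0 - self.tokens) / self.refill_per_sec)


bucket = TokenBucket()
sent = sum(bucket.try_acquire() for _ in range(120))
print(sent, "requests allowed immediately")
```

Call try_acquire before each MCP request and sleep for wait_time() when it returns False.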

Advanced Topics¶

Multi-Machine Deployments¶

Use discover_nodes to find all nodes across the fleet. Wildcard key expressions such as bubbaloop/**/sensor/temperature match across all machines in the Zenoh network.

Zenoh Key Structure¶

bubbaloop/global/{machine_id}/{node_name}/{topic}

machine_id: Unique machine identifier (hostname-based)
node_name: Node instance name
topic: Published topic (manifest, status, schema, command, etc.)
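Splitting a key back into these parts is useful when handling results from discover_nodes or query_zenoh. A minimal sketch, assuming concrete keys without wildcards; NodeKey and parse_node_key are hypothetical names. Note that agent gateway keys such as bubbaloop/global/{machine_id}/agent/inbox share the same prefix and would parse with node_name "agent".

```python
from typing import NamedTuple


class NodeKey(NamedTuple):
    machine_id: str
    node_name: str
    topic: str


def parse_node_key(key: str) -> NodeKey:
    """Split bubbaloop/global/{machine_id}/{node_name}/{topic} into its parts."""
    # Limit the split so a topic containing slashes stays in one piece.
    parts = key.split("/", 4)
    if len(parts) != 5 or parts[:2] != ["bubbaloop", "global"] or not all(parts[2:]):
        raise ValueError(f"not a bubbaloop node key: {key}")
    return NodeKey(parts[2], parts[3], parts[4])


print(parse_node_key("bubbaloop/global/nvidia_orin00/openmeteo/status"))
```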

Protobuf Decoding¶

Get schema: get_node_schema(node_name="...")
Subscribe to Zenoh topic (from get_stream_info)
Decode bytes with protobuf library (Python: protobuf, Rust: prost)

Troubleshooting¶

"No response from node (is it running?)"¶

Check node status with get_node_health. If stopped, use start_node. If unhealthy, check get_node_logs.

"Permission denied: tool requires admin tier"¶

Your token has insufficient permissions. Check ~/.bubbaloop/mcp-token tier setting.

"Validation error: ..."¶

Parameter format is invalid. Check the Tool Reference section for correct parameter schemas.

"Agent not available"¶

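The daemon was started without the agent runtime. Ensure bubbaloop up starts with agent support. If ~/.bubbaloop/agents.toml exists, check that at least one agent has enabled = true (when the file is absent, the runtime creates a default jean-clawd agent).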

Rate limit exceeded¶

HTTP transport limits: 100 burst, 1/sec sustained. Space out requests or use Zenoh direct subscription for data.

Quick Reference¶

Essential Command Sequence¶

# Discovery
list_nodes → get_node_health → get_node_manifest → list_commands

# Control
start_node / stop_node / restart_node / send_command

# Scheduling
schedule_task → list_jobs → delete_job

# Data
get_stream_info → (subscribe to Zenoh topic externally)

# Health
get_system_status → get_node_logs

Tool Count by Tier¶

MCP tools (exposed to external clients via MCP server): 42 tools across Viewer/Operator/Admin tiers.

Agent-internal tools (available only to the LLM agent via dispatch, not exposed via MCP):

memory_search, memory_forget, schedule_task, create_proposal, read_file, write_file, run_command, publish_to_topic

Note: get_system_telemetry, get_telemetry_history, and update_telemetry_config are available via both the MCP server and the agent dispatch.

Key Paths¶

Token: ~/.bubbaloop/mcp-token
Agents: ~/.bubbaloop/agents/{agent_id}/
Nodes: ~/.bubbaloop/nodes/{node_name}/
Logs: ~/.bubbaloop/mcp-stdio.log (stdio mode)

Example: Temperature Monitoring Flow¶

1. list_nodes
   → Find "temperature-sensor" node

2. get_node_health(node_name="temperature-sensor")
   → Verify it's running and healthy

3. get_node_schema(node_name="temperature-sensor")
   → Understand data format (protobuf schema)

4. get_stream_info(node_name="temperature-sensor")
   → Get Zenoh topic: "bubbaloop/global/nvidia_orin00/temperature-sensor/reading"

5. schedule_task(
     prompt="Check temperature-sensor health and report any anomalies",
     cron_schedule="*/15 * * * *",
     recurrence=true
   )
   → Automate monitoring every 15 minutes

6. list_jobs()
   → Verify job is scheduled and check results

Now the agent autonomously monitors temperature every 15 minutes without further MCP calls.

Summary¶

42 MCP tools + agent-internal tools across 3 RBAC tiers (Viewer, Operator, Admin)
Dual-plane architecture: MCP for control, Zenoh for data
Task scheduling for autonomous behavior (cron jobs, retry with circuit breaker)
Robustness — turn/tool timeouts, provider retry, context recovery, result truncation
Self-describing nodes with manifests, schemas, commands
Multi-machine support via Zenoh network discovery

Read the tool reference, understand the dual-plane model, use scheduled tasks for automation, and always prefer Zenoh direct subscription over MCP polling for data streams.
