
Persistent Intelligence Architecture × DIMOS-Unitree Integration

Last updated: 2025-10-14

Purpose

This document bridges the Persistent Intelligence Architecture (multi-brain, day/night learning) with the DIMOS-Unitree framework currently being developed for ShadowHound. It provides a concrete implementation roadmap showing how the abstract architecture maps to actual DIMOS components, skills, and the mission agent.

Key Question: How do we layer persistent intelligence capabilities onto the existing DIMOS/Go2/Mission Agent stack without breaking what works?


1. Current System Architecture (As-Built)

1.1 Four-Layer Stack (Current Implementation)

┌─────────────────────────────────────────────────────────────┐
│ APPLICATION LAYER                                            │
│ • shadowhound_bringup (launch files)                        │
│ • Mission orchestration via web UI                          │
│ • Configuration: .env files (not YAML)                      │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ AGENT LAYER (Mission Intelligence)                           │
│ • shadowhound_mission_agent (~2,100 LOC)                    │
│ • LLM reasoning: OpenAI cloud OR vLLM Thor                  │
│ • Tool calling: Skills registry integration                  │
│ • FastAPI web UI (479 LOC)                                  │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ SKILLS LAYER (DIMOS Execution Engine)                       │
│ • MyUnitreeSkills (~30 behaviors in DIMOS)                  │
│ • ⚠️ WebRTC API blocks majority of skills                   │
│ • Working: Non-WebRTC skills only                           │
│ • Perception: YOLO, tracking (untested)                     │
└─────────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────────┐
│ ROBOT LAYER (Hardware Interface)                             │
│ • go2_ros2_sdk (ROS2 bridge)                                │
│ • Sensors: Camera, 2D LiDAR, IMU, joint states, odom       │
│ • Nav2 + SLAM Toolbox (tested on hardware)                 │
└─────────────────────────────────────────────────────────────┘

1.2 Hardware Configuration (Current)

| Component   | Hardware         | Purpose                        |
|-------------|------------------|--------------------------------|
| Development | Laptop           | ROS2, DIMOS, mission agent, UI |
| Compute     | Thor AGX (128GB) | LLM/VLM inference (vLLM)       |
| Robot       | Unitree Go2 Pro  | Actuators, sensors, locomotion |
| Simulation  | Tower (RTX 4070) | Available but unused           |

Network: Laptop ↔ Thor ↔ Go2 over Wi-Fi (laptop primary for dev)

1.3 Key Observations

What Works ✅:
  • Mission agent LLM reasoning and tool calling
  • Skill registry and execution framework
  • Nav2 + SLAM validated on hardware
  • Web UI for monitoring and control
  • Dual LLM backend support (cloud + local)

What's Missing ❌:
  • Trajectory/decision logging
  • Data persistence across sessions
  • Learning/adaptation loop
  • Simulation integration
  • Skill evolution tracking

Constraints ⚠️:
  • WebRTC API blocks most DIMOS skills
  • Thor GPU performance degraded (37→5 tok/s)
  • No MockRobot for development testing
  • Limited skill set available (non-WebRTC only)


2. Persistent Intelligence Architecture (Target State)

2.1 Multi-Brain Topology (Future Hardware)

[Go2 Body] ⇄ [Thor: Mobile Brainstem]  ~~Wi-Fi~~  ⇄  [Spark: Cortex]  ⇄  [Tower: Avatar/Sim]
                     |                                        |                    |
            Real-time control                    Reasoning/Training         Isaac Sim testing
            (50-200 Hz)                          (nightly fine-tunes)       (domain: sim)

2.2 Day/Night Learning Cycle

Day (Operations):
  • Thor runs mission agent + DIMOS skills
  • Logs decision trajectories locally (WAL)
  • Offloads heavy reasoning to Spark (optional)

Night (Learning):
  • Spark curates interesting trajectories
  • Fine-tunes skill adapters (LoRA)
  • Tests in Isaac Sim (Tower)
  • Deploys approved adapters back to Thor


3. Integration Mapping: Architecture → DIMOS

3.1 Where Components Live

| Architecture Concept | Current DIMOS/ShadowHound Component   | Notes                                 |
|----------------------|---------------------------------------|---------------------------------------|
| Mobile Brainstem     | Thor (future) OR Laptop (current)     | Mission agent + skill executor        |
| Cortex               | Not present (future: Spark)           | Would handle heavy reasoning/training |
| Avatar               | Not present (future: Tower/Isaac Sim) | RTX 4070 available but unused         |
| Body                 | Go2 Pro + go2_ros2_sdk                | Already working                       |
| Deliberation RPC     | Mission agent → DIMOS skill calls     | Already exists (tool calling)         |
| Trajectory Log       | Not implemented                       | Need to add                           |
| Skill Adapters       | Not implemented                       | Future: LoRA weights for skills       |
| Replay Buffer        | Not implemented                       | Need to add (WAL pattern)             |
| Domain Tags          | Not implemented                       | Easy to add                           |

3.2 Local Planning Discovery (Critical Update)

Recent Discovery: DIMOS includes a complete VFH (Vector Field Histogram) + Pure Pursuit local planner, which eliminates the SLAM dependency for the MVP.

Impact: This changes both the current MVP path AND the trajectory logging architecture:

Current MVP Path (Revised)

Week 1: Local Planning First (NEW)
  - Test VFH local planner (no map required)
  - Add YOLO detection → navigation pipeline
  - Frame transforms: base_link → odom
  - End-to-end mission: "Find the ball"

Week 2+: Persistent Intelligence Foundation
  - Log reactive navigation decisions
  - Capture perception → action sequences
  - Build trajectory database

Why This Matters for Learning:
  • Reactive decisions are learnable: VFH parameter choices, when to re-plan, recovery behaviors
  • No localization failures: simpler failure modes to analyze
  • Richer data: more reactive decisions per mission (vs. few waypoints in global planning)

Trajectory Logging for Reactive Navigation

What to log (local planning decisions):

{
    "trajectory_type": "reactive_navigation",
    "mission": "find_red_ball",
    "steps": [
        {
            "timestamp": 1234567890.123,
            "domain": "real",
            "perception": {
                "detected_objects": [
                    {"label": "ball", "position": [2.0, 0.5], "confidence": 0.8}
                ],
                "camera_embedding": [...],  # CLIP/etc for later queries
            },
            "decision": {
                "type": "set_goal",
                "goal_xy": [2.0, 0.5],
                "frame": "odom",
                "reason": "yolo_detection"
            },
            "vfh_state": {
                "histogram": [...],  # 144 bins
                "selected_direction": 0.35,  # radians
                "obstacle_density": 0.2,
                "safety_threshold": 0.8
            },
            "action": {
                "linear_vel": 0.3,
                "angular_vel": 0.15
            },
            "outcome": {
                "distance_to_goal": 1.2,  # After action
                "collision": false,
                "stuck": false
            }
        },
        # ... more steps until goal reached
    ],
    "mission_result": {
        "success": true,
        "duration_seconds": 12.3,
        "distance_traveled": 2.8,
        "goal_accuracy": 0.15  # meters
    }
}

Learning opportunities:
  1. Parameter adaptation: tune safety_threshold and velocities based on outcomes (see the sketch below)
  2. Recovery strategies: learn when recovery behaviors work
  3. Perception reliability: correlate YOLO confidence with navigation success
  4. VLM verification value: compare missions with and without VLM verification
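
To make the first opportunity concrete, here is a minimal offline sketch, assuming the trajectory JSON format above; the function name tune_safety_threshold, the thresholds, and the update rule are illustrative, not existing DIMOS code.

import json

def tune_safety_threshold(trajectory_path, current=0.8, step=0.05):
    """Illustrative offline tuning: raise the VFH safety threshold when logged
    steps show collisions, relax it slightly when the robot kept getting stuck."""
    with open(trajectory_path) as f:
        trajectory = json.load(f)
    steps = trajectory.get("steps", [])
    if not steps:
        return current
    collision_rate = sum(s["outcome"]["collision"] for s in steps) / len(steps)
    stuck_rate = sum(s["outcome"]["stuck"] for s in steps) / len(steps)
    if collision_rate > 0.05:          # Too many near-misses: be more conservative
        return min(current + step, 1.0)
    if stuck_rate > 0.20:              # Overly cautious: relax slightly
        return max(current - step, 0.5)
    return current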

Frame Transformations in Logged Data

Critical: All positions logged in odom frame (VFH's working frame)

# Detection starts in base_link (camera frame)
detection_base_link = yolo.detect(frame)  # (x=2.0, y=0.0)

# Transform to odom (for navigation + logging)
detection_odom = robot.transform(detection_base_link, "base_link", "odom")  # base_link → odom

# Log in odom (consistent frame for replay)
trajectory_logger.log({
    "perception": {"position": detection_odom, "frame": "odom"},
    "decision": {"goal_xy": detection_odom, "frame": "odom"}
})

Why odom for learning:
  • Consistent coordinate system across missions
  • Replay in the simulator uses the same frame
  • Adapter fine-tuning needs a consistent input representation

See local_planning_architecture.md for complete frame handling details.
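
For reference, a minimal sketch of the base_link → odom transform using standard ROS2 tf2 from inside a node; the helper name detection_to_odom and the way the detection is packed into a PointStamped are assumptions, not the actual ShadowHound implementation.

from rclpy.duration import Duration
from geometry_msgs.msg import PointStamped
from tf2_ros import Buffer, TransformListener
import tf2_geometry_msgs  # noqa: F401  (registers geometry_msgs types with tf2)

# Inside the node's __init__, set up the listener once:
#   self.tf_buffer = Buffer()
#   self.tf_listener = TransformListener(self.tf_buffer, self)

def detection_to_odom(self, x, y, stamp):
    """Transform a detection from base_link into odom for navigation + logging."""
    pt = PointStamped()
    pt.header.frame_id = "base_link"
    pt.header.stamp = stamp
    pt.point.x, pt.point.y = float(x), float(y)
    # Raises if the transform is not available within the timeout
    pt_odom = self.tf_buffer.transform(pt, "odom", timeout=Duration(seconds=0.2))
    return pt_odom.point.x, pt_odom.point.y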


3.3 Message Flow Analysis

Current: Mission Agent → DIMOS Skills

# Mission Agent (shadowhound_mission_agent)
user_input = "Find the red ball"
plan = llm.generate_plan(user_input)  # Tool calls generated

# Execution
for step in plan:
    skill_name = step["name"]  # e.g., "nav.goto"
    args = step["args"]         # e.g., {"x": 5.0, "y": 2.0}
    result = skill_registry.execute(skill_name, **args)

Target: With Trajectory Logging

# Enhanced execution with logging
for step in plan:
    # Capture context BEFORE execution
    context = {
        "camera_embedding": self.get_camera_embedding(),
        "robot_pose": self.get_pose(),
        "detected_objects": self.perception.get_objects(),
        "mission_state": self.state_machine.current_state
    }

    # Execute skill
    result = skill_registry.execute(skill_name, **args)

    # Log trajectory record
    trajectory_logger.log_step(
        domain="real",
        skill=skill_name,
        args=args,
        context=context,
        result=result,
        outcome_score=self.assess_outcome(result)
    )

3.4 DIMOS Skills → Trajectory Actions Mapping

DIMOS MyUnitreeSkills (subset that works):

| DIMOS Skill       | Trajectory Action Type                             | Logging Priority |
|-------------------|----------------------------------------------------|------------------|
| Navigation skills | {"type": "nav", "skill": "goto/rotate/stop"}       | High             |
| Perception skills | {"type": "perception", "skill": "snapshot/detect"} | High             |
| Voice skills      | {"type": "voice", "skill": "speak/listen"}         | Medium           |
| Utility skills    | {"type": "util", "skill": "wait/report"}           | Low              |

WebRTC-blocked skills: Log attempts and failures for future analysis


4. Implementation Phases

Phase 0: Foundation (No Hardware Changes) - Current Sprint

Goal: Add architectural patterns to current codebase without blocking MVP

Tasks:
  1. ✅ Domain tagging: Add domain: "real" to all logs
  2. ✅ Session IDs: Generate robot_id and session_id at startup
  3. ✅ Monotonic timestamps: Use CLOCK_MONOTONIC for ordering
  4. ✅ Skill call logging: Log every skill execution with context (simple JSON append; see the sketch below)
  5. ✅ Network profiling: Document laptop↔Thor performance
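
A minimal sketch of the "simple JSON append" logging hook for Phase 0, showing the domain tag, session ID, and monotonic timestamp; the file layout matches the deliverables below, while log_skill_call, ROBOT_ID, and the context fields are illustrative.

import json
import time
import uuid
from datetime import datetime
from pathlib import Path

SESSION_ID = uuid.uuid4().hex
ROBOT_ID = "go2_pro_01"  # Assumed identifier; set once at startup
LOG_PATH = Path("logs/trajectories") / f"session_{datetime.now():%Y%m%d_%H%M%S}.jsonl"
LOG_PATH.parent.mkdir(parents=True, exist_ok=True)

def log_skill_call(skill, args, result, context=None):
    """Append one skill execution as a single JSONL record (Phase 0: no WAL yet)."""
    record = {
        "domain": "real",
        "robot_id": ROBOT_ID,
        "session_id": SESSION_ID,
        "timestamp_ns": time.clock_gettime_ns(time.CLOCK_MONOTONIC),
        "skill": skill,
        "args": args,
        "context": context or {},
        "result": result,
    }
    with LOG_PATH.open("a") as f:
        f.write(json.dumps(record, default=str) + "\n")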

Deliverables:
  • budgets.yaml - Network and timing constraints
  • logs/trajectories/session_YYYYMMDD_HHMMSS.jsonl - Simple skill logs
  • Updated mission agent with logging hooks

Effort: ~1-2 days
Risk: Very low (additive only)


Phase 1: Data Durability (Thor or Laptop) - MVP + 1 week

Goal: Power-loss safe trajectory logging

Tasks:
  1. Implement WAL (Write-Ahead Logging) pattern
  2. Segment files: replay/segments/seg_YYYYMMDD_NNN.wal
  3. Manifest with checksums: replay/MANIFEST.json
  4. Recovery tool: scripts/recover_trajectories.py

Implementation Details:

# trajectory_logger.py
import time
import uuid


class TrajectoryLogger:
    def __init__(self, data_dir="/data/replay", robot_id="go2"):
        self.robot_id = robot_id              # Stable per-robot identifier
        self.session_id = uuid.uuid4().hex    # New session at every startup
        self._seq_id = 0
        self.segment_writer = SegmentWriter(data_dir)  # WAL segment writer (see sketch below)
        self.manifest = Manifest(data_dir)             # Checksummed segment index

    def next_seq_id(self):
        self._seq_id += 1
        return self._seq_id

    def log_step(self, domain, skill, args, context, result, outcome_score):
        record = {
            "domain": domain,
            "robot_id": self.robot_id,
            "session_id": self.session_id,
            "seq_id": self.next_seq_id(),
            "timestamp_ns": time.clock_gettime_ns(time.CLOCK_MONOTONIC),
            "skill": skill,
            "args": args,
            "context": context,
            "result": result,
            "outcome_score": outcome_score
        }

        self.segment_writer.append(record)  # Double-buffered, fsync every N

        if self.segment_writer.should_rotate():
            self.segment_writer.rotate()
            self.manifest.update()  # Atomic write
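
The SegmentWriter and Manifest used above do not exist yet; below is a simplified sketch of the segment writer's append/fsync/rotate behavior (fsync interval, segment size, and the double-buffering details are placeholders to be tuned).

import json
import os
import time
from pathlib import Path


class SegmentWriter:
    """Append-only WAL segments: fsync every N records, rotate by size."""

    def __init__(self, data_dir, fsync_every=20, max_bytes=16 * 1024 * 1024):
        self.seg_dir = Path(data_dir) / "segments"
        self.seg_dir.mkdir(parents=True, exist_ok=True)
        self.fsync_every = fsync_every
        self.max_bytes = max_bytes
        self._pending = 0
        self._counter = 0
        self._open_new_segment()

    def _open_new_segment(self):
        self._counter += 1
        name = f"seg_{time.strftime('%Y%m%d')}_{self._counter:03d}.wal"
        self._fh = (self.seg_dir / name).open("ab")

    def append(self, record):
        self._fh.write(json.dumps(record, default=str).encode() + b"\n")
        self._pending += 1
        if self._pending >= self.fsync_every:
            self._fh.flush()
            os.fsync(self._fh.fileno())  # Durable up to this point
            self._pending = 0

    def should_rotate(self):
        return self._fh.tell() >= self.max_bytes

    def rotate(self):
        self._fh.flush()
        os.fsync(self._fh.fileno())
        self._fh.close()
        self._open_new_segment()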

Deliverables:
  • shadowhound_trajectory_logger package
  • WAL implementation with recovery
  • Integration with mission agent

Effort: ~2-3 days
Risk: Medium (need to test power-loss scenarios)


Phase 2: Message Contracts (Prepare for Offload) - MVP + 2 weeks

Goal: Define stable interfaces for future Thor↔Spark communication

Tasks:
  1. Define Deliberation RPC schema (JSON or protobuf)
  2. Define Trajectory Log schema (standardize format)
  3. Implement schema validation
  4. Document network budgets

Schemas:

# schemas/deliberation_rpc.py
from pydantic import BaseModel

class RobotState(BaseModel):
    pose: dict  # {x, y, yaw}
    goal: str
    detected_objects: list[str]
    mission_state: str

class DeliberationRequest(BaseModel):
    embedding: bytes  # Camera embedding (4-8KB)
    state: RobotState
    deadline_ms: int

class DeliberationResponse(BaseModel):
    subgoal: dict  # {skill, args}
    constraints: dict
    valid: bool
    reasoning: str  # LLM explanation


# schemas/trajectory_record.py
from pydantic import BaseModel

class TrajectoryRecord(BaseModel):
    domain: str  # real | sim | synthetic
    robot_id: str
    session_id: str
    seq_id: int
    timestamp_ns: int

    task_id: str
    context: dict  # State before action
    action: dict   # {skill, args}
    result: dict   # Skill execution result
    outcome_score: float  # 0.0-1.0
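
A short example of how schema validation could be wired into the logging path; it assumes Pydantic v2 (model_dump), a standard-library logger, and a hypothetical import path for the schemas package.

import logging

from pydantic import ValidationError

# from shadowhound_interfaces.schemas import TrajectoryRecord  # hypothetical import path

log = logging.getLogger(__name__)

def validated_record(record_dict):
    """Validate a raw dict against the TrajectoryRecord schema before it hits the WAL."""
    try:
        record = TrajectoryRecord(**record_dict)  # Raises on missing or mistyped fields
    except ValidationError as exc:
        log.warning("Dropping malformed trajectory record: %s", exc)
        return None
    return record.model_dump()  # Plain dict, safe to serialize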

Deliverables:
  • shadowhound_interfaces/schemas/ - Pydantic models
  • Schema validation in mission agent
  • Documentation in docs/architecture/message_contracts.md

Effort: ~1-2 days
Risk: Low (can start simple, evolve)


Phase 3: Simulation Avatar (Tower/Isaac Sim) - Parallel Track

Goal: Test policies in simulation before hardware deployment

Tasks:
  1. Set up Isaac Sim on Tower (RTX 4070)
  2. Import Go2 URDF/USD model
  3. Create basic scene (empty room, obstacles)
  4. Implement policy server (accepts RPC calls)
  5. Tag all sim data with domain: "sim"

Architecture:

[Isaac Sim Scene] → Camera/LiDAR → [Embedding Extractor] → [Policy Server] → Skills
                                                                    ↓
                                                           Trajectory Logger
                                                           (domain: sim)

Integration with DIMOS:
  • Policy server uses the same skill registry as the real robot (see the skeleton below)
  • Skills execute against Isaac Sim physics
  • Trajectories logged identically (except domain: "sim")
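
A rough skeleton for the policy server described above; FastAPI is chosen only because the mission agent already uses it, and the endpoint name, payload shape, and the skill_registry/trajectory_logger placeholders are assumptions.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

skill_registry = None      # Placeholder: the shared DIMOS skill registry instance
trajectory_logger = None   # Placeholder: TrajectoryLogger writing domain="sim" records

class SkillCall(BaseModel):
    skill: str   # e.g. "nav.goto"
    args: dict   # e.g. {"x": 2.0, "y": 0.5}

@app.post("/execute")
def execute(call: SkillCall):
    """Execute a skill against Isaac Sim and log it exactly like on the real robot."""
    result = skill_registry.execute(call.skill, **call.args)  # Same registry as hardware
    trajectory_logger.log_step(
        domain="sim",   # Only difference from real-robot logging
        skill=call.skill,
        args=call.args,
        context={},     # Sim perception context would go here
        result=result,
        outcome_score=1.0 if result.get("success") else 0.0,
    )
    return {"result": result}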

Deliverables:
  • Isaac Sim workspace on Tower
  • Policy server container
  • Basic Go2 scene with nav challenges
  • Sim trajectory logs for validation

Effort: ~1-2 weeks (learning curve)
Risk: Medium (new tooling, but doesn't block MVP)


Phase 4: Cortex Integration (When Spark Arrives) - Post-MVP

Goal: Offload heavy reasoning to Spark, enable nightly training

Tasks:
  1. Deploy mission agent reasoning to Spark
  2. Implement Thor↔Spark RPC (gRPC)
  3. Network latency testing (Wi-Fi constraints)
  4. Fallback strategy (local reasoning if offload fails)

Day Operations:

[Thor] Mission agent (lightweight) → RPC → [Spark] LLM reasoning → response
  ↓ (if latency OK)                                ↑
  ↓ (if timeout)                                   ↑
[Thor] Local fallback reasoning ──────────────────┘

Night Operations (Spark-only):

[Spark] Read trajectory logs from Thor (Ethernet sync)
  ↓
[Spark] Curate interesting trajectories (failures, novelties)
  ↓
[Spark] Fine-tune skill adapters (LoRA)
  ↓
[Tower] Test adapters in Isaac Sim (regression suite)
  ↓
[Spark] Sign and package approved adapters
  ↓
[Thor] Deploy adapters (next day startup)
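
One plausible shape for the "curate interesting trajectories" step, assuming the JSONL record format from Phases 0-1; the selection rules (low outcome scores, rarely exercised skills) are illustrative, not a fixed policy.

import json
from collections import Counter
from pathlib import Path

def curate(session_dir, score_threshold=0.5, rare_skill_count=5):
    """Select records worth training on: failures, poor outcomes, and rarely seen skills."""
    records = []
    for path in Path(session_dir).glob("*.jsonl"):
        with path.open() as f:
            records.extend(json.loads(line) for line in f if line.strip())

    skill_counts = Counter(r["skill"] for r in records)
    return [
        r for r in records
        if r.get("outcome_score", 1.0) < score_threshold   # Failures / poor outcomes
        or skill_counts[r["skill"]] < rare_skill_count     # Novelty: rarely exercised skills
    ]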

Deliverables:
  • gRPC service definitions
  • Spark deployment containers
  • Nightly training pipeline
  • Adapter deployment system

Effort: ~2-3 weeks
Risk: High (multi-machine coordination, new hardware)


5. DIMOS-Specific Considerations

5.1 WebRTC API Blocker

Problem: Most DIMOS MyUnitreeSkills call the WebRTC API directly, which is currently non-functional.

Impact on Persistent Intelligence:
  • Limits skill repertoire for learning
  • Trajectories will show many failed skill attempts
  • Need to build custom Nav2-based skills

Mitigation Strategies:
  1. Short-term: focus learning on working skills (Nav2, perception)
  2. Medium-term: implement custom skills using ROS2 topics (not WebRTC)
  3. Long-term: fix the WebRTC API or contribute upstream to DIMOS

Logging Strategy:
  • Log all skill attempts (including WebRTC failures); see the sketch below
  • Tag with skill_available: false for blocked skills
  • Use failure logs to inform skill implementation priorities
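
A small sketch of how a blocked-skill attempt could be recorded rather than silently skipped; skill_available and failure_reason mirror the manifest fields in section 5.4, and the wrapper name execute_and_log is illustrative.

def execute_and_log(skill_name, args, skill_registry, trajectory_logger, manifest):
    """Attempt any skill; WebRTC-blocked skills still produce a trajectory record."""
    info = manifest.get(skill_name, {})
    if info.get("status") == "blocked":
        trajectory_logger.log_step(
            domain="real",
            skill=skill_name,
            args=args,
            context={"skill_available": False},
            result={"error": info.get("failure_reason", "unavailable")},
            outcome_score=0.0,
        )
        return None
    result = skill_registry.execute(skill_name, **args)
    trajectory_logger.log_step(
        domain="real",
        skill=skill_name,
        args=args,
        context={"skill_available": True},
        result=result,
        outcome_score=1.0 if result.get("success") else 0.0,
    )
    return result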

5.2 Thor GPU Performance

Problem: Thor GPU performance has degraded (37→5 tok/s for LLM inference)

Impact on Day/Night Cycle:
  • Can't run heavy LLM/VLM models on Thor during day operations
  • Need to offload to Spark or cloud for complex reasoning
  • Limits onboard autonomy

Architecture Adjustment:
  • Day: Thor runs lightweight VLM/LLM (or offloads to Spark/cloud)
  • Night: Spark handles all training/fine-tuning
  • Fallback: cloud LLM if Thor and Spark are both unavailable

Testing Needed:
  • Profile Thor with realistic mission loads
  • Measure latency for Thor→Spark offload
  • Identify minimum viable model size for Thor

5.3 MockRobot for Development

Problem: No MockRobot implemented (hardware required for testing)

Impact on Learning Loop:
  • Can't test trajectory logging without hardware
  • Sim avatar becomes critical for safe testing
  • Development velocity limited

Solution Path:
  1. Phase 3: Isaac Sim becomes the "MockRobot"
  2. Skills execute in simulation with identical interfaces
  3. Trajectories logged with domain: "sim" for validation

5.4 Skills Inventory Integration

Current State: DIMOS has ~30 MyUnitreeSkills (mostly blocked)

Trajectory Logging Strategy: maintain a skills_manifest.json that records each skill's status, implementation, and logging priority:

{
  "nav.goto": {
    "status": "working",
    "implementation": "nav2",
    "logging_priority": "high"
  },
  "nav.rotate": {
    "status": "working",
    "implementation": "nav2",
    "logging_priority": "high"
  },
  "webrtc.stand": {
    "status": "blocked",
    "implementation": "webrtc_api",
    "logging_priority": "medium",
    "failure_reason": "webrtc_api_unavailable"
  }
}
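
A small helper, assuming this manifest format, for deciding which skills to prioritize in logging and adapter training; load_skills_manifest and working_skills are illustrative names.

import json
from pathlib import Path

def load_skills_manifest(path="skills_manifest.json"):
    return json.loads(Path(path).read_text())

def working_skills(manifest, priority="high"):
    """Skills that are currently usable and logged at the given priority."""
    return [
        name for name, info in manifest.items()
        if info["status"] == "working" and info["logging_priority"] == priority
    ]

# Example (with the manifest above): working_skills(load_skills_manifest())
# -> ["nav.goto", "nav.rotate"]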

Learning Focus:
  • Train adapters on working skills first
  • Use failure logs to prioritize skill development
  • Track skill availability over time (metrics)


6. Data Flow: Current → Target

6.1 Current Data Flow

User Input → Mission Agent → LLM Planning → Skill Calls → DIMOS → Go2
                                                              ↓
                                                    (no logging)

6.2 Phase 0 Data Flow

User Input → Mission Agent → LLM Planning → Skill Calls → DIMOS → Go2
                                    ↓                        ↓
                              (tool calls)        (execution results)
                                    ↓                        ↓
                              Trajectory Logger (simple JSON)
                                    ↓
                              logs/trajectories/session_XXX.jsonl

6.3 Phase 1 Data Flow (WAL)

User Input → Mission Agent → LLM Planning → Skill Calls → DIMOS → Go2
                                    ↓                        ↓
                              Trajectory Logger (WAL)
                                    ↓
                    replay/segments/seg_XXX.wal (double-buffered, fsync)
                                    ↓
                              replay/MANIFEST.json (atomic update)

6.4 Phase 4 Data Flow (Full Learning Loop)

                    ┌─────── Day Operations ───────┐
                    │                              │
User → [Thor] Mission Agent → Skills → Go2 → Sensors
              ↓                ↓
        Trajectory Logger   Results
              ↓
      /data/replay (WAL)
              ↓
    (docked at night: Ethernet sync)
              ↓
        [Spark] Curator → Training → Testing → Adapters
                              ↓           ↓
                          LoRA    [Tower] Isaac Sim
                              ↓
                    (approved adapters)
                              ↓
        [Thor] Load adapters (next day)
              └──────────────────────────┘

7. Practical Implementation Checklist

Phase 0: Quick Wins (This Week)

  • [ ] Add domain, robot_id, session_id to mission agent
  • [ ] Switch to monotonic timestamps
  • [ ] Create budgets.yaml with network measurements
  • [ ] Implement simple skill call logging (JSON append)
  • [ ] Test logging on sample missions

Phase 1: Data Durability (Next 2 Weeks)

  • [ ] Implement WAL segment writer
  • [ ] Implement manifest with checksums
  • [ ] Add recovery tool
  • [ ] Test power-loss scenarios (graceful shutdown + abrupt)
  • [ ] Document logging API

Phase 2: Message Contracts (Parallel)

  • [ ] Define Pydantic schemas for RPC and trajectories
  • [ ] Add schema validation to mission agent
  • [ ] Document in docs/architecture/message_contracts.md
  • [ ] Create validation tests

Phase 3: Simulation (Parallel, Non-Blocking)

  • [ ] Research Isaac Sim requirements
  • [ ] Set up workspace on Tower (RTX 4070)
  • [ ] Import Go2 model
  • [ ] Create basic test scene
  • [ ] Implement policy server skeleton
  • [ ] Test skill execution in sim

Phase 4: Cortex (Post-MVP, When Spark Arrives)

  • [ ] Design Thor↔Spark network topology
  • [ ] Implement gRPC deliberation service
  • [ ] Test latency budgets
  • [ ] Implement nightly training pipeline
  • [ ] Create adapter deployment system

8. Open Questions & Design Decisions

8.1 Architecture Questions

Q1: Where should the trajectory logger run?
  • Option A: On laptop (current dev setup)
  • Option B: On Thor (future deployment)
  • Decision: Start on laptop, migrate to Thor when ready

Q2: JSON or Protobuf for the trajectory format?
  • Option A: JSON (human-readable, flexible)
  • Option B: Protobuf (compact, typed, faster)
  • Decision: Start with JSON, consider protobuf when performance matters

Q3: When to implement the embedding pipeline?
  • Option A: Phase 0 (now)
  • Option B: Phase 3 (when sim ready)
  • Decision: Phase 3 - wait until the VLM/encoder choice is clear

Q4: Isaac Sim priority?
  • Option A: High (start immediately)
  • Option B: Medium (after Phase 1 complete)
  • Decision: TBD - depends on MVP timeline pressure

8.2 DIMOS Integration Questions

Q5: How to handle WebRTC-blocked skills in logging?
  • Log attempts with failure tags
  • Use to prioritize skill development
  • Track skill availability metrics over time

Q6: Should we log low-level ROS2 data (joint states, IMU)?
  • Recommendation: No, focus on decision-level trajectories
  • Low-level data is for VLA (future work)
  • Keep trajectory logs lightweight (skill-level)

Q7: How to assess outcome_score automatically?
  • Start with binary (success/failure from the skill result); see the sketch below
  • Evolve to LLM-based assessment ("did this help?")
  • Eventually: reward model from human feedback
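
A starting point for the binary version mentioned in Q7; the success/collision/stuck keys are assumptions about the skill result dict, and the scoring rule is illustrative.

def assess_outcome(result) -> float:
    """Binary outcome score from a skill result; later replaced by LLM or reward-model scoring."""
    if not isinstance(result, dict):
        return 0.0
    if result.get("collision") or result.get("stuck"):
        return 0.0
    return 1.0 if result.get("success") else 0.0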


9. Success Metrics

Phase 0 Success

  • [ ] 100% of skill calls logged with context
  • [ ] Domain tags on all data
  • [ ] Network budgets documented
  • [ ] No performance regression (logging overhead <5%)

Phase 1 Success

  • [ ] Trajectories survive power-loss (tested)
  • [ ] Recovery tool can reconstruct from segments
  • [ ] Manifest integrity maintained
  • [ ] Logging overhead <10%

Phase 3 Success

  • [ ] Go2 model working in Isaac Sim
  • [ ] Same skills run in sim and real
  • [ ] Trajectory logs tagged with domain: sim
  • [ ] Basic policy can navigate test scene

Phase 4 Success

  • [ ] Thor↔Spark RPC working within latency budget
  • [ ] Nightly training pipeline runs automatically
  • [ ] Adapters deployed and loaded successfully
  • [ ] Measurable improvement in mission success rate

10. References


"The best time to plant a tree was 20 years ago. The second best time is now."
— Integration begins with the first log entry