# DIMOS Integration

## Context

ShadowHound needed a cognitive architecture for mission planning and execution. DIMOS (Distributed Intelligent Mobile Operating System) provides:

- Mission planning and execution framework
- Skill registry and execution engine
- Memory systems (RAG, episodic, semantic)
- Tool-calling integration with LLMs

Goal: Integrate DIMOS as the agent layer and validate it on a physical Unitree Go2 robot.
## Hypothesis

The DIMOS framework can provide the cognitive layer for ShadowHound with:

- Mission planning from natural language
- Skill-based execution (leveraging the existing MyUnitreeSkills)
- A memory-augmented agent (RAG for context)
- ROS2 integration for robot control

Approach: Build a custom mission agent as a ROS2 node wrapping DIMOS.
## Experiments

### Experiment 1: Architecture Design (2025-10-05)

Commit: 7159e5a

What We Tried:

- Designed a four-layer architecture: Application → Agent → Skills → Robot
- Separated the mission agent (ROS2 wrapper) from the mission executor (DIMOS cognitive layer)
- Documented the architectural patterns

Results:

- ✅ Clear separation of concerns
- ✅ ROS2 integration path identified
- ✅ Skill registry approach validated

Decision: Proceed with a custom ROS2 wrapper around DIMOS.
### Experiment 2: Web Interface Options (2025-10-05)

Commit: 26d25fa

What We Tried: Evaluated three options:

1. Use the DIMOS built-in web interface
2. Build a custom FastAPI interface
3. Use a ROS2 web bridge

Results:

- DIMOS UI: functional but tightly coupled, hard to customize
- Custom FastAPI: full control, can be tailored to the robot's needs
- ROS2 bridge: too generic, not mission-focused

Decision: Build a custom web UI (Option 2), which provides full control and understanding.
### Experiment 3: Custom Web UI Implementation (2025-10-05 to 2025-10-06)

Commit: eeaa03a

What We Tried:

- Built a FastAPI web interface from scratch (479 LOC)
- WebSocket for real-time status updates
- REST API for mission submission
- Static HTML/CSS/JS frontend

Results:

- ✅ 479 lines of custom UI code
- ✅ Real-time status updates working
- ✅ Mission submission functional
- ✅ Full control over UI/UX
- ✅ Deep understanding of all components

Decision: Success. The custom UI provides exactly what we need.
### Experiment 4: Mission Agent Implementation (2025-10-06)

Commit: 32ec845

What We Tried:

- Implemented mission_agent.py as a ROS2 node (713 LOC)
- ROS2 action server for mission execution
- Service interface for status queries
- Integration with the mission executor (DIMOS layer)

Results:

- ✅ ROS2 integration working
- ✅ Action-based mission execution
- ✅ Clean interface to the DIMOS layer
- ✅ Status publishing to topics

Decision: Validated. The ROS2 wrapper pattern works.
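The wrapper pattern can be illustrated without ROS2: a thin agent class that forwards goals to the executor and tracks status. All names here (`MissionAgent`, `MissionExecutor`, `handle_goal`) are illustrative stand-ins, not the real `mission_agent.py` API; in the actual node, `handle_goal` would be wired into an rclpy action-server callback.

```python
# Sketch of the ROS2-wrapper pattern: the agent owns transport concerns
# (goals, status) and delegates all cognition to the executor layer.
from dataclasses import dataclass, field


@dataclass
class MissionResult:
    success: bool
    steps: list = field(default_factory=list)


class MissionExecutor:
    """Stand-in for the DIMOS cognitive layer: plans and runs skills."""

    def execute(self, mission: str) -> MissionResult:
        # A real executor would invoke the LLM planner and skill registry.
        return MissionResult(success=True, steps=[f"plan({mission})"])


class MissionAgent:
    """Stand-in for the ROS2 node: forwards goals, publishes status."""

    def __init__(self, executor: MissionExecutor):
        self.executor = executor
        self.status = "idle"

    def handle_goal(self, mission: str) -> MissionResult:
        self.status = "executing"
        result = self.executor.execute(mission)
        self.status = "done" if result.success else "failed"
        return result
```

Because the agent holds only a reference to the executor, the ROS2 surface can be exercised with a fake executor and the cognitive layer with no ROS2 at all.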
### Experiment 5: Mission Executor (DIMOS Layer) (2025-10-06)

Commit: 54ea06a

What We Tried:

- Implemented mission_executor.py (517 LOC)
- DIMOS cognitive layer integration
- Skill registry access
- Memory system integration (RAG)

Results:

- ✅ Clean separation from ROS2 concerns
- ✅ DIMOS functionality encapsulated
- ✅ Testable without ROS2
- ✅ Memory augmentation working

Decision: Success. The architecture pattern is validated.
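Skill-registry access, one of the executor's responsibilities, amounts to name-to-callable dispatch. The registry shape and skill names below are assumptions for illustration, not the DIMOS API.

```python
# Toy skill registry: register named callables, dispatch by name.
# In DIMOS the registry is richer (schemas for LLM tool calling), but the
# dispatch idea the executor relies on is the same.
from typing import Callable, Dict


class SkillRegistry:
    def __init__(self):
        self._skills: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._skills[name] = fn

    def run(self, name: str, **kwargs) -> str:
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name](**kwargs)


registry = SkillRegistry()
registry.register("move", lambda distance=0.0: f"moved {distance} m")
registry.register("rotate", lambda angle=0.0: f"rotated {angle} deg")
```

Keeping the registry behind a plain-Python interface is what makes the "testable without ROS2" result above possible: tests can register fake skills and never touch the robot.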
### Experiment 6: Configuration System (2025-10-06 to 2025-10-07)

Commits: 10a1b2a, bfa27bf

What We Tried:

- Environment-based configuration (.env files)
- Multiple backend support (OpenAI cloud, local Ollama)
- Thor vs. laptop configurations
- Configuration validation in the start script

Results:

- ✅ Clean configuration management
- ✅ Easy backend switching
- ✅ Environment-specific settings working
- ✅ Validation catches misconfigurations

Decision: Adopted. The .env pattern is used for all configuration.
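A minimal sketch of the .env pattern, stdlib only: parse `KEY=VALUE` lines and let the start script check for required keys. The variable names (`LLM_BACKEND`, `OLLAMA_HOST`) are hypothetical examples, not the project's real keys; in practice a library such as python-dotenv would do the parsing.

```python
# Minimal .env-style loader and validator (illustrative, stdlib only).
def load_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping comments and blank lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env


def validate(env: dict, required: list) -> list:
    """Return the required keys that are missing or empty."""
    return [k for k in required if not env.get(k)]


sample = """
# backend: openai | ollama  (example keys, not the real ones)
LLM_BACKEND=ollama
OLLAMA_HOST="http://thor.local:11434"
"""
```

The `validate` step is the piece the start script relies on: failing fast on a missing key is what lets misconfigurations surface before the robot is involved.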
### Experiment 7: Hardware Validation (2025-10-07 to 2025-10-09)

Commit: f16bda8 (merge)

What We Tried:

- Deployed to the laptop for real-robot testing
- Connected to the Unitree Go2 via go2_ros2_sdk
- Tested with the OpenAI cloud backend
- Tested with the vLLM local backend on Thor

Results:

- ✅ End-to-end system working on the physical robot!
- ✅ OpenAI backend validated
- ✅ vLLM backend validated
- ⚠️ WebRTC API issues discovered (blocks the majority of DIMOS skills)
- ✅ Basic navigation skills functional

Decision: Merge the feature branch. The core system is validated and the known limitations are documented.
## Final Results

### What Worked

Custom mission agent architecture (~2,100 LOC total):

- mission_agent.py (713 LOC): ROS2 wrapper node
- mission_executor.py (517 LOC): DIMOS cognitive layer
- web_interface.py (479 LOC): custom FastAPI UI
- rag_memory_example.py (392 LOC): memory integration
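The idea behind rag_memory_example.py can be illustrated with a toy retriever: score stored memories against a query and return the best matches to augment the agent's context. This bag-of-words sketch is purely illustrative; the real example uses DIMOS memory APIs (typically embedding-based similarity), not this code.

```python
# Toy retrieval-augmented-memory sketch: rank stored snippets by
# word overlap with the query (real RAG would use vector embeddings).
from collections import Counter


def score(query: str, doc: str) -> int:
    """Count shared words between query and document (bag-of-words overlap)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())


def retrieve(query: str, docs: list, k: int = 2) -> list:
    """Return the k highest-scoring documents for the query."""
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)[:k]


memory = [
    "the charging dock is near the kitchen door",
    "hallway patrol route passes three doorways",
    "the go2 battery lasts about two hours",
]
```

Whatever the scoring function, the retrieved snippets are what get prepended to the LLM prompt, which is what "memory-augmented agent" means in the results above.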
Key Achievements:

1. ✅ End-to-end system working on the physical robot
2. ✅ Dual backend support (cloud + local)
3. ✅ Custom web UI with real-time updates
4. ✅ Clean architectural separation (ROS ↔ Agent ↔ Skills ↔ Robot)
5. ✅ Memory-augmented agent (RAG integration)
6. ✅ Configuration system supporting multiple environments

Validation Highlights:

- Tested on Unitree Go2 hardware ✅
- OpenAI cloud backend working ✅
- vLLM local backend working ✅
- Skills execution functional ✅
### What Didn't Work / Constraints Discovered

WebRTC API issues:

- The majority of DIMOS MyUnitreeSkills use the WebRTC API
- The WebRTC connection is unstable/unreliable
- Blocks: vision skills, advanced locomotion, complex behaviors
- Impact: MVP scope initially limited to basic navigation

DIMOS web interface:

- Tightly coupled to DIMOS internals
- Hard to customize for robot-specific needs
- The decision to build a custom UI was correct
## Key Decisions

1. Feature branch approach
   - Rationale: Validate the architecture before merging to main
   - Outcome: De-risked a major architectural change
2. Custom web UI over the DIMOS built-in interface
   - Rationale: Full control, deep understanding
   - Trade-off: More code to maintain, but exactly what we need
3. Separate mission agent and executor
   - Rationale: Clean separation of ROS concerns from the cognitive layer
   - Benefit: The mission executor is testable without ROS2
4. Dual backend support
   - Rationale: Cloud for development, local for deployment
   - Flexibility: Can switch based on task complexity
5. Validate on hardware before merging
   - Rationale: Catch integration issues early
   - Outcome: The WebRTC constraint was discovered before merge
## Architecture Pattern Established

```text
Application Layer: start.sh, launch files, configuration
        ↓
Agent Layer: mission_agent.py (ROS2) + mission_executor.py (DIMOS)
        ↓
Skills Layer: DIMOS MyUnitreeSkills (~30 behaviors)
        ↓
Robot Layer: go2_ros2_sdk → Unitree Go2 hardware
```

Benefits:

- Clear separation of concerns
- Testable layers
- The ROS2 wrapper provides a standard interface
- DIMOS provides the cognitive capabilities
## Implementation Status
- [x] Architecture designed
- [x] Mission agent implemented (713 LOC)
- [x] Mission executor implemented (517 LOC)
- [x] Custom web UI built (479 LOC)
- [x] Configuration system working
- [x] Hardware validation complete
- [x] Feature branch merged
- [x] Documentation complete
## Follow-Up Work

### Immediate (Post-Merge)

- [ ] Implement MockRobot for development without hardware
- [ ] Address the WebRTC API issues (or work around them)
- [ ] Expand skill coverage (non-WebRTC skills)
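The planned MockRobot could look something like the sketch below: an object that records commands instead of driving hardware, so skills can run in unit tests. The interface (`move`, `pose`, `log`) is an assumption about what the skills layer needs, not an implemented API.

```python
# Hypothetical MockRobot: same call surface a skill would use, but every
# command is logged instead of sent to the Go2, enabling hardware-free tests.
from dataclasses import dataclass


@dataclass
class Pose:
    x: float = 0.0
    y: float = 0.0


class MockRobot:
    def __init__(self):
        self.pose = Pose()
        self.log = []  # history of issued commands, for assertions in tests

    def move(self, dx: float, dy: float) -> None:
        """Apply a relative translation and record the command."""
        self.pose = Pose(self.pose.x + dx, self.pose.y + dy)
        self.log.append(("move", dx, dy))
```

A test can then assert on `log` and `pose` rather than observing a physical robot, which is the first rung of the testing pyramid listed under Future work.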
### Future

- [ ] Rename MissionExecutor → RobotAgent (clearer semantics)
- [ ] Implement the testing pyramid (mock skills → mock robot → Gazebo → hardware)
- [ ] Enhanced memory systems (episodic, semantic)
## Lessons Learned

- Feature branches work: large architectural changes benefit from validation before merge.
- Hardware testing is critical: the WebRTC constraint surfaced only on the real robot.
- Custom > integration: building a custom UI gave deep understanding and full control.
- Separation of concerns pays off: keeping the ROS wrapper distinct from the cognitive layer enables testing.
- Configuration is critical: the .env pattern allows easy environment switching.
## References

- Mission Agent vs Executor Architecture
- Mission Agent Package Docs
- DIMOS Framework (submodule)
- Feature branch: feature/dimos-integration
- Merge commit: f16bda8

Total Time: 5 days (Oct 5-9, 2025)
Commits: ~50 commits on the feature branch
Lines of Code: ~2,100 LOC (mission agent architecture)
Outcome: ✅ Working end-to-end system validated on the physical robot