
DIMOS Integration

Context

ShadowHound needed a cognitive architecture for mission planning and execution. DIMOS (Distributed Intelligent Mobile Operating System) provides:

- Mission planning and execution framework
- Skill registry and execution engine
- Memory systems (RAG, episodic, semantic)
- Tool-calling integration with LLMs

Goal: Integrate DIMOS as the agent layer, validate on physical Unitree Go2 robot.

Hypothesis

The DIMOS framework can provide the cognitive layer for ShadowHound with:

- Mission planning from natural language
- Skill-based execution (leveraging the existing MyUnitreeSkills)
- A memory-augmented agent (RAG for context)
- ROS2 integration for robot control

Approach: Build custom mission agent as ROS2 node wrapping DIMOS.
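The skill-registry pattern at the core of this approach can be sketched in a few lines. This is a hypothetical illustration, not the DIMOS API: the names `SkillRegistry`, `register`, and `execute` are assumptions, and a real registry would also expose skill schemas for LLM tool calling.

```python
# Hypothetical sketch of the skill-registry pattern; names are
# illustrative, not the actual DIMOS API.
from typing import Callable, Dict

class SkillRegistry:
    """Maps skill names to callables so an LLM tool call can be dispatched."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._skills[name] = fn

    def execute(self, name: str, **kwargs) -> str:
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name](**kwargs)

registry = SkillRegistry()
registry.register("move_forward", lambda distance=0.5: f"moved {distance} m")
print(registry.execute("move_forward", distance=1.0))  # moved 1.0 m
```

The agent layer only ever dispatches by name, which is what lets skills be swapped (mocked, or replaced when WebRTC-backed ones fail) without touching the planner.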

Experiments

Experiment 1: Architecture Design (2025-10-05)

Commit: 7159e5a

What We Tried:

- Designed a four-layer architecture: Application → Agent → Skills → Robot
- Separated the mission agent (ROS2 wrapper) from the mission executor (DIMOS cognitive layer)
- Documented architectural patterns

Results:

- ✅ Clear separation of concerns
- ✅ ROS2 integration path identified
- ✅ Skill registry approach validated

Decision: Proceed with a custom ROS2 wrapper around DIMOS


Experiment 2: Web Interface Options (2025-10-05)

Commit: 26d25fa

What We Tried: Evaluated three options:

1. Use the DIMOS built-in web interface
2. Build a custom FastAPI interface
3. Use a ROS2 web bridge

Results:

- DIMOS UI: functional but tightly coupled, hard to customize
- Custom FastAPI: full control, can be tailored to robot needs
- ROS2 bridge: too generic, not mission-focused

Decision: Build custom web UI (Option 2) - Provides full control and understanding


Experiment 3: Custom Web UI Implementation (2025-10-05-06)

Commit: eeaa03a

What We Tried:

- Built a FastAPI web interface from scratch (479 LOC)
- WebSocket for real-time status updates
- REST API for mission submission
- Static HTML/CSS/JS frontend

Results:

- ✅ 479 lines of custom UI code
- ✅ Real-time status updates working
- ✅ Mission submission functional
- ✅ Full control over UI/UX
- ✅ Deep understanding of all components

Decision: Success - Custom UI provides exactly what we need
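The mission-submission endpoint pattern can be sketched without any framework. The real `web_interface.py` uses FastAPI; this stdlib-only version is just an illustration of the request/response shape, and the `/mission` path and payload fields are assumptions:

```python
# Dependency-free sketch of a mission-submission REST endpoint.
# The real web_interface.py uses FastAPI; the /mission path and the
# {"task": ...} payload shape are illustrative assumptions.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

missions = []  # submitted missions, newest last

class MissionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/mission":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        mission = json.loads(self.rfile.read(length))
        missions.append(mission)
        body = json.dumps({"status": "accepted", "id": len(missions)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), MissionHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/mission",
    data=json.dumps({"task": "patrol the lab"}).encode(),
    headers={"Content-Type": "application/json"},
)
resp = json.loads(urllib.request.urlopen(req).read())
print(resp)  # {'status': 'accepted', 'id': 1}
server.shutdown()
```

FastAPI adds the pieces this sketch lacks: request validation, async handlers, and the WebSocket channel used for the real-time status stream.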


Experiment 4: Mission Agent Implementation (2025-10-06)

Commit: 32ec845

What We Tried:

- Implemented mission_agent.py as a ROS2 node (713 LOC)
- ROS2 action server for mission execution
- Service interface for status queries
- Integration with the mission executor (DIMOS layer)

Results:

- ✅ ROS2 integration working
- ✅ Action-based mission execution
- ✅ Clean interface to the DIMOS layer
- ✅ Status publishing to topics

Decision: Validated - ROS2 wrapper pattern works


Experiment 5: Mission Executor (DIMOS Layer) (2025-10-06)

Commit: 54ea06a

What We Tried:

- Implemented mission_executor.py (517 LOC)
- DIMOS cognitive layer integration
- Skill registry access
- Memory system integration (RAG)

Results:

- ✅ Clean separation from ROS2 concerns
- ✅ DIMOS functionality encapsulated
- ✅ Testable without ROS2
- ✅ Memory augmentation working

Decision: Success - Architecture pattern validated
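The agent/executor split can be sketched as two plain classes: the executor holds the cognitive logic and imports nothing from ROS2, so it can be unit-tested directly, while the agent is the ROS2-facing shell. All class and method names here are hypothetical stand-ins for the actual ShadowHound interfaces, and the "plan" is hard-coded where a real executor would call the LLM:

```python
# Illustrative sketch of the agent/executor split. MissionExecutor has no
# ROS2 imports, so it is testable without ROS2; names are hypothetical.
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class MissionExecutor:
    """Cognitive layer: turns a mission string into skill invocations."""
    skills: Dict[str, Callable[[], str]]
    log: List[str] = field(default_factory=list)

    def run(self, mission: str) -> List[str]:
        # A real executor would plan with an LLM; a fixed plan stands in here.
        plan = ["stand_up", "move_forward"] if "patrol" in mission else []
        for step in plan:
            self.log.append(self.skills[step]())
        return self.log

class MissionAgent:
    """ROS2-facing wrapper: in production an rclpy node with an action
    server; here the ROS2 surface is stubbed out entirely."""

    def __init__(self, executor: MissionExecutor) -> None:
        self.executor = executor

    def handle_goal(self, mission: str) -> List[str]:
        return self.executor.run(mission)

skills = {"stand_up": lambda: "stood up", "move_forward": lambda: "moved 0.5 m"}
agent = MissionAgent(MissionExecutor(skills))
print(agent.handle_goal("patrol the hallway"))  # ['stood up', 'moved 0.5 m']
```

Because the executor depends only on the skills dict, swapping in mock skills (or later a MockRobot) requires no ROS2 environment at all.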


Experiment 6: Configuration System (2025-10-06-07)

Commits: 10a1b2a, bfa27bf

What We Tried:

- Environment-based configuration (.env files)
- Multiple backend support (OpenAI cloud, local Ollama)
- Thor vs laptop configurations
- Configuration validation in the start script

Results:

- ✅ Clean configuration management
- ✅ Easy backend switching
- ✅ Environment-specific settings working
- ✅ Validation catches misconfigurations

Decision: Adopted - .env pattern for all configuration
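A minimal sketch of the .env pattern, stdlib only (the real system may use a library such as python-dotenv). The key names shown are examples, not the actual ShadowHound schema:

```python
# Minimal .env parse-and-validate sketch (stdlib only). The REQUIRED_KEYS
# shown are hypothetical examples, not the real configuration schema.

REQUIRED_KEYS = {"LLM_BACKEND", "ROBOT_IP"}

def load_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config

def validate(config: dict) -> None:
    """Fail fast on missing keys, as the start script does."""
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing config keys: {sorted(missing)}")

env_text = """
# backend: 'openai' (cloud) or 'ollama' (local)
LLM_BACKEND=openai
ROBOT_IP=192.168.1.120
"""
config = load_env(env_text)
validate(config)
print(config["LLM_BACKEND"])  # openai
```

Validating at startup rather than at first use is what lets the start script catch misconfigurations before the robot is involved.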


Experiment 7: Hardware Validation (2025-10-07-09)

Commit: f16bda8 (merge)

What We Tried:

- Deployed to the laptop for real robot testing
- Connected to the Unitree Go2 via go2_ros2_sdk
- Tested with the OpenAI cloud backend
- Tested with the vLLM local backend on Thor

Results:

- ✅ End-to-end system working on the physical robot!
- ✅ OpenAI backend validated
- ✅ vLLM backend validated
- ⚠️ WebRTC API issues discovered (blocks the majority of DIMOS skills)
- ✅ Basic navigation skills functional

Decision: Merge feature branch - Core system validated, known limitations documented
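One reason dual-backend testing was cheap: both OpenAI and local servers like vLLM and Ollama expose an OpenAI-compatible chat API, so switching is essentially a base-URL change. A hedged sketch of that selection logic, with illustrative env keys and the default vLLM port as an assumption:

```python
# Sketch of dual-backend selection. vLLM and Ollama serve OpenAI-compatible
# endpoints, so only the base URL and key change. Env key names and the
# default local URL are illustrative assumptions.

def backend_config(env: dict) -> dict:
    """Pick chat-API settings from environment-style config."""
    if env.get("LLM_BACKEND") == "local":
        return {
            "base_url": env.get("LOCAL_LLM_URL", "http://localhost:8000/v1"),
            "api_key": "not-needed",  # local servers usually ignore the key
        }
    return {
        "base_url": "https://api.openai.com/v1",
        "api_key": env.get("OPENAI_API_KEY", ""),
    }

print(backend_config({"LLM_BACKEND": "local"})["base_url"])
```

The mission executor then constructs its client from this dict, so "cloud for development, local for deployment" is a one-variable switch.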


Final Results

What Worked

Custom Mission Agent Architecture (~2,100 LOC total):

- mission_agent.py (713 LOC): ROS2 wrapper node
- mission_executor.py (517 LOC): DIMOS cognitive layer
- web_interface.py (479 LOC): custom FastAPI UI
- rag_memory_example.py (392 LOC): memory integration

Key Achievements:

1. ✅ End-to-end system working on the physical robot
2. ✅ Dual backend support (cloud + local)
3. ✅ Custom web UI with real-time updates
4. ✅ Clean architectural separation (ROS ↔ Agent ↔ Skills ↔ Robot)
5. ✅ Memory-augmented agent (RAG integration)
6. ✅ Configuration system supporting multiple environments

Validation Highlights:

- Tested on Unitree Go2 hardware ✅
- OpenAI cloud backend working ✅
- vLLM local backend working ✅
- Skills execution functional ✅
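The memory-augmented retrieval behind the RAG integration can be sketched in miniature. This is only an illustration in the spirit of rag_memory_example.py: real RAG uses learned embeddings and a vector store, whereas this stand-in uses bag-of-words cosine similarity over an in-memory list.

```python
# Hypothetical RAG-retrieval sketch. Real systems use learned embeddings
# and a vector store; bag-of-words cosine similarity stands in here.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Bag-of-words 'embedding' (stand-in for an embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

memory = [
    "the charging dock is next to the lab door",
    "the robot stumbled on the ramp yesterday",
    "patrol route covers hallway and kitchen",
]

def retrieve(query: str, k: int = 1) -> list:
    """Return the k memories most similar to the query."""
    q = embed(query)
    ranked = sorted(memory, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]

print(retrieve("where is the charging dock"))
```

Retrieved memories are prepended to the agent's prompt, which is what "RAG for context" amounts to at mission time.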

What Didn't Work / Constraints Discovered

WebRTC API Issues:

- The majority of DIMOS MyUnitreeSkills use the WebRTC API
- The WebRTC connection proved unstable/unreliable
- Blocks: vision skills, advanced locomotion, complex behaviors
- Impact: MVP scope initially limited to basic navigation

DIMOS Web Interface:

- Tightly coupled to DIMOS internals
- Hard to customize for robot-specific needs
- The decision to build a custom UI was correct

Key Decisions

  1. Feature branch approach
     - Rationale: Validate architecture before merging to main
     - Outcome: De-risked a major architectural change

  2. Custom web UI over DIMOS built-in
     - Rationale: Full control, deep understanding
     - Trade-off: More code to maintain, but exactly what we need

  3. Separate mission agent and executor
     - Rationale: Clean separation of ROS concerns from the cognitive layer
     - Benefit: Mission executor testable without ROS2

  4. Dual backend support
     - Rationale: Cloud for development, local for deployment
     - Flexibility: Can switch based on task complexity

  5. Validate on hardware before merge
     - Rationale: Catch integration issues early
     - Outcome: WebRTC constraint discovered before merge

Architecture Pattern Established

Application Layer:   start.sh, launch files, configuration
       ↓
Agent Layer:        mission_agent.py (ROS2) + mission_executor.py (DIMOS)
       ↓
Skills Layer:       DIMOS MyUnitreeSkills (~30 behaviors)
       ↓
Robot Layer:        go2_ros2_sdk → Unitree Go2 hardware

Benefits:

- Clear separation of concerns
- Testable layers
- The ROS2 wrapper provides a standard interface
- DIMOS provides the cognitive capabilities
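The layer chain above can be sketched with structural typing: each layer depends only on the interface of the layer below, so any implementation that fits the shape (mock, simulator, or hardware) can be substituted. The interfaces here are illustrative; in production the layers talk over ROS2, not direct calls.

```python
# Sketch of the four-layer call chain via typing.Protocol. Interfaces are
# illustrative; in production the layers communicate over ROS2.
from typing import Callable, List, Protocol

class Robot(Protocol):
    """Robot layer: go2_ros2_sdk → Go2 hardware in production."""
    def send(self, command: str) -> str: ...

class FakeGo2:
    """Stand-in Robot implementation (a MockRobot would look like this)."""
    def send(self, command: str) -> str:
        return f"go2 executed: {command}"

def sit(robot: Robot) -> str:
    """Skills layer: a minimal skill, analogous to one MyUnitreeSkills entry."""
    return robot.send("sit")

def run_mission(robot: Robot, skills: List[Callable[[Robot], str]]) -> List[str]:
    """Agent layer: execute a planned sequence of skills against the robot."""
    return [skill(robot) for skill in skills]

print(run_mission(FakeGo2(), [sit]))  # ['go2 executed: sit']
```

This substitutability is exactly what the planned testing pyramid (mock skills → mock robot → Gazebo → hardware) relies on.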

Implementation Status

  • [x] Architecture designed
  • [x] Mission agent implemented (713 LOC)
  • [x] Mission executor implemented (517 LOC)
  • [x] Custom web UI built (479 LOC)
  • [x] Configuration system working
  • [x] Hardware validation complete
  • [x] Feature branch merged
  • [x] Documentation complete

Follow-Up Work

Immediate (Post-Merge)

  • [ ] Implement MockRobot for development without hardware
  • [ ] Address WebRTC API issues (or work around them)
  • [ ] Expand skill coverage (non-WebRTC skills)

Future

  • [ ] Rename MissionExecutor → RobotAgent (clearer semantics)
  • [ ] Implement testing pyramid (mock skills → mock robot → Gazebo → hardware)
  • [ ] Enhanced memory systems (episodic, semantic)

Lessons Learned

  1. Feature branches work: Large architectural changes benefit from validation before merge
  2. Hardware testing critical: Discovered WebRTC constraint only on real robot
  3. Custom > Integration: Building custom UI gave deep understanding and control
  4. Separation of concerns pays off: ROS wrapper distinct from cognitive layer enables testing
  5. Configuration is critical: .env pattern allows easy environment switching

Total Time: 5 days (Oct 5-9, 2025)
Commits: ~50 commits on the feature branch
Lines of Code: ~2,100 LOC (mission agent architecture)
Outcome: ✅ Working end-to-end system validated on the physical robot