# DIMOS Integration

## Context

ShadowHound needed a cognitive architecture for mission planning and execution. DIMOS (Distributed Intelligent Mobile Operating System) provides:

- Mission planning and execution framework
- Skill registry and execution engine
- Memory systems (RAG, episodic, semantic)
- Tool-calling integration with LLMs

Goal: Integrate DIMOS as the agent layer and validate it on a physical Unitree Go2 robot.
## Hypothesis

The DIMOS framework can provide the cognitive layer for ShadowHound with:

- Mission planning from natural language
- Skill-based execution (leveraging the existing MyUnitreeSkills)
- A memory-augmented agent (RAG for context)
- ROS2 integration for robot control

Approach: Build a custom mission agent as a ROS2 node wrapping DIMOS.
## Experiments

### Experiment 1: Architecture Design (2025-10-05)

Commit: 7159e5a

What We Tried:

- Designed a four-layer architecture: Application → Agent → Skills → Robot
- Separated the mission agent (ROS2 wrapper) from the mission executor (DIMOS cognitive layer)
- Documented the architectural patterns

Results:

- ✅ Clear separation of concerns
- ✅ ROS2 integration path identified
- ✅ Skill registry approach validated

Decision: Proceed with a custom ROS2 wrapper around DIMOS.
### Experiment 2: Web Interface Options (2025-10-05)

Commit: 26d25fa

What We Tried: Evaluated three options:

1. Use the DIMOS built-in web interface
2. Build a custom FastAPI interface
3. Use a ROS2 web bridge

Results:

- DIMOS UI: functional but tightly coupled, hard to customize
- Custom FastAPI: full control, can be tailored to the robot's needs
- ROS2 bridge: too generic, not mission-focused

Decision: Build a custom web UI (Option 2), which provides full control and understanding.
### Experiment 3: Custom Web UI Implementation (2025-10-05 to 2025-10-06)

Commit: eeaa03a

What We Tried:

- Built a FastAPI web interface from scratch (479 LOC)
- WebSocket for real-time status updates
- REST API for mission submission
- Static HTML/CSS/JS frontend

Results:

- ✅ 479 lines of custom UI code
- ✅ Real-time status updates working
- ✅ Mission submission functional
- ✅ Full control over UI/UX
- ✅ Deep understanding of all components

Decision: Success. The custom UI provides exactly what we need.
### Experiment 4: Mission Agent Implementation (2025-10-06)

Commit: 32ec845

What We Tried:

- Implemented mission_agent.py as a ROS2 node (713 LOC)
- ROS2 action server for mission execution
- Service interface for status queries
- Integration with the mission executor (DIMOS layer)

Results:

- ✅ ROS2 integration working
- ✅ Action-based mission execution
- ✅ Clean interface to the DIMOS layer
- ✅ Status publishing to topics

Decision: Validated. The ROS2 wrapper pattern works.
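The wrapper pattern can be illustrated without ROS2: a thin agent class that forwards goals to the executor and tracks status. All names here (`MissionAgent`, `MissionExecutor`, `handle_goal`) are illustrative stand-ins, not the real `mission_agent.py` API; in the actual node, `handle_goal` would be wired into an rclpy action-server callback.

```python
# Sketch of the ROS2-wrapper pattern: the agent owns transport concerns
# (goals, status) and delegates all cognition to the executor layer.
from dataclasses import dataclass, field


@dataclass
class MissionResult:
    success: bool
    steps: list = field(default_factory=list)


class MissionExecutor:
    """Stand-in for the DIMOS cognitive layer: plans and runs skills."""

    def execute(self, mission: str) -> MissionResult:
        # A real executor would invoke the LLM planner and skill registry.
        return MissionResult(success=True, steps=[f"plan({mission})"])


class MissionAgent:
    """Stand-in for the ROS2 node: forwards goals, publishes status."""

    def __init__(self, executor: MissionExecutor):
        self.executor = executor
        self.status = "idle"

    def handle_goal(self, mission: str) -> MissionResult:
        self.status = "executing"
        result = self.executor.execute(mission)
        self.status = "done" if result.success else "failed"
        return result
```

Because the agent holds only a reference to the executor, the ROS2 surface can be exercised with a fake executor and the cognitive layer with no ROS2 at all.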
### Experiment 5: Mission Executor (DIMOS Layer) (2025-10-06)

Commit: 54ea06a

What We Tried:

- Implemented mission_executor.py (517 LOC)
- DIMOS cognitive layer integration
- Skill registry access
- Memory system integration (RAG)

Results:

- ✅ Clean separation from ROS2 concerns
- ✅ DIMOS functionality encapsulated
- ✅ Testable without ROS2
- ✅ Memory augmentation working

Decision: Success. The architecture pattern is validated.
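Skill-registry access, one of the executor's responsibilities, amounts to name-to-callable dispatch. The registry shape and skill names below are assumptions for illustration, not the DIMOS API.

```python
# Toy skill registry: register named callables, dispatch by name.
# In DIMOS the registry is richer (schemas for LLM tool calling), but the
# dispatch idea the executor relies on is the same.
from typing import Callable, Dict


class SkillRegistry:
    def __init__(self):
        self._skills: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._skills[name] = fn

    def run(self, name: str, **kwargs) -> str:
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name](**kwargs)


registry = SkillRegistry()
registry.register("move", lambda distance=0.0: f"moved {distance} m")
registry.register("rotate", lambda angle=0.0: f"rotated {angle} deg")
```

Keeping the registry behind a plain-Python interface is what makes the "testable without ROS2" result above possible: tests can register fake skills and never touch the robot.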
### Experiment 6: Configuration System (2025-10-06 to 2025-10-07)

Commits: 10a1b2a, bfa27bf

What We Tried:

- Environment-based configuration (.env files)
- Multiple backend support (OpenAI cloud, local Ollama)
- Thor vs. laptop configurations
- Configuration validation in the start script

Results:

- ✅ Clean configuration management
- ✅ Easy backend switching
- ✅ Environment-specific settings working
- ✅ Validation catches misconfigurations

Decision: Adopted. The .env pattern is used for all configuration.
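A minimal sketch of the .env pattern, stdlib only: parse `KEY=VALUE` lines and let the start script check for required keys. The variable names (`LLM_BACKEND`, `OLLAMA_HOST`) are hypothetical examples, not the project's real keys; in practice a library such as python-dotenv would do the parsing.

```python
# Minimal .env-style loader and validator (illustrative, stdlib only).
def load_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping comments and blank lines."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env


def validate(env: dict, required: list) -> list:
    """Return the required keys that are missing or empty."""
    return [k for k in required if not env.get(k)]


sample = """
# backend: openai | ollama  (example keys, not the real ones)
LLM_BACKEND=ollama
OLLAMA_HOST="http://thor.local:11434"
"""
```

The `validate` step is the piece the start script relies on: failing fast on a missing key is what lets misconfigurations surface before the robot is involved.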
### Experiment 7: Hardware Validation (2025-10-07 to 2025-10-09)

Commit: f16bda8 (merge)

What We Tried:

- Deployed to the laptop for real-robot testing
- Connected to the Unitree Go2 via go2_ros2_sdk
- Tested with the OpenAI cloud backend
- Tested with the vLLM local backend on Thor

Results:

- ✅ End-to-end system working on the physical robot!
- ✅ OpenAI backend validated
- ✅ vLLM backend validated
- ⚠️ WebRTC API issues discovered (blocks the majority of DIMOS skills)
- ✅ Basic navigation skills functional

Decision: Merge the feature branch. The core system is validated and the known limitations are documented.
## Final Results

### What Worked

Custom mission agent architecture (~2,100 LOC total):

- mission_agent.py (713 LOC): ROS2 wrapper node
- mission_executor.py (517 LOC): DIMOS cognitive layer
- web_interface.py (479 LOC): custom FastAPI UI
- rag_memory_example.py (392 LOC): memory integration
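The idea behind rag_memory_example.py can be illustrated with a toy retriever: score stored memories against a query and return the best matches to augment the agent's context. This bag-of-words sketch is purely illustrative; the real example uses DIMOS memory APIs (typically embedding-based similarity), not this code.

```python
# Toy retrieval-augmented-memory sketch: rank stored snippets by
# word overlap with the query (real RAG would use vector embeddings).
from collections import Counter


def score(query: str, doc: str) -> int:
    """Count shared words between query and document (bag-of-words overlap)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())


def retrieve(query: str, docs: list, k: int = 2) -> list:
    """Return the k highest-scoring documents for the query."""
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)[:k]


memory = [
    "the charging dock is near the kitchen door",
    "hallway patrol route passes three doorways",
    "the go2 battery lasts about two hours",
]
```

Whatever the scoring function, the retrieved snippets are what get prepended to the LLM prompt, which is what "memory-augmented agent" means in the results above.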
Key Achievements:

1. ✅ End-to-end system working on the physical robot
2. ✅ Dual backend support (cloud + local)
3. ✅ Custom web UI with real-time updates
4. ✅ Clean architectural separation (ROS ↔ Agent ↔ Skills ↔ Robot)
5. ✅ Memory-augmented agent (RAG integration)
6. ✅ Configuration system supporting multiple environments

Validation Highlights:

- Tested on Unitree Go2 hardware ✅
- OpenAI cloud backend working ✅
- vLLM local backend working ✅
- Skills execution functional ✅
### What Didn't Work / Constraints Discovered

WebRTC API issues:

- The majority of DIMOS MyUnitreeSkills use the WebRTC API
- The WebRTC connection is unstable/unreliable
- Blocks: vision skills, advanced locomotion, complex behaviors
- Impact: MVP scope initially limited to basic navigation

DIMOS web interface:

- Tightly coupled to DIMOS internals
- Hard to customize for robot-specific needs
- The decision to build a custom UI was correct
## Key Decisions

1. Feature branch approach
   - Rationale: Validate the architecture before merging to main
   - Outcome: De-risked a major architectural change
2. Custom web UI over the DIMOS built-in interface
   - Rationale: Full control, deep understanding
   - Trade-off: More code to maintain, but exactly what we need
3. Separate mission agent and executor
   - Rationale: Clean separation of ROS concerns from the cognitive layer
   - Benefit: The mission executor is testable without ROS2
4. Dual backend support
   - Rationale: Cloud for development, local for deployment
   - Flexibility: Can switch based on task complexity
5. Validate on hardware before merging
   - Rationale: Catch integration issues early
   - Outcome: The WebRTC constraint was discovered before merge
## Architecture Pattern Established

```text
Application Layer: start.sh, launch files, configuration
        ↓
Agent Layer: mission_agent.py (ROS2) + mission_executor.py (DIMOS)
        ↓
Skills Layer: DIMOS MyUnitreeSkills (~30 behaviors)
        ↓
Robot Layer: go2_ros2_sdk → Unitree Go2 hardware
```

Benefits:

- Clear separation of concerns
- Testable layers
- The ROS2 wrapper provides a standard interface
- DIMOS provides the cognitive capabilities
## Implementation Status
- [x] Architecture designed
- [x] Mission agent implemented (713 LOC)
- [x] Mission executor implemented (517 LOC)
- [x] Custom web UI built (479 LOC)
- [x] Configuration system working
- [x] Hardware validation complete
- [x] Feature branch merged
- [x] Documentation complete
## Follow-Up Work

### Immediate (Post-Merge)

- [ ] Implement MockRobot for development without hardware
- [ ] Address the WebRTC API issues (or work around them)
- [ ] Expand skill coverage (non-WebRTC skills)
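The planned MockRobot could look something like the sketch below: an object that records commands instead of driving hardware, so skills can run in unit tests. The interface (`move`, `pose`, `log`) is an assumption about what the skills layer needs, not an implemented API.

```python
# Hypothetical MockRobot: same call surface a skill would use, but every
# command is logged instead of sent to the Go2, enabling hardware-free tests.
from dataclasses import dataclass


@dataclass
class Pose:
    x: float = 0.0
    y: float = 0.0


class MockRobot:
    def __init__(self):
        self.pose = Pose()
        self.log = []  # history of issued commands, for assertions in tests

    def move(self, dx: float, dy: float) -> None:
        """Apply a relative translation and record the command."""
        self.pose = Pose(self.pose.x + dx, self.pose.y + dy)
        self.log.append(("move", dx, dy))
```

A test can then assert on `log` and `pose` rather than observing a physical robot, which is the first rung of the testing pyramid listed under Future work.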
### Future

- [ ] Rename MissionExecutor → RobotAgent (clearer semantics)
- [ ] Implement the testing pyramid (mock skills → mock robot → Gazebo → hardware)
- [ ] Enhanced memory systems (episodic, semantic)
## Lessons Learned

- Feature branches work: large architectural changes benefit from validation before merge.
- Hardware testing is critical: the WebRTC constraint surfaced only on the real robot.
- Custom > integration: building a custom UI gave deep understanding and full control.
- Separation of concerns pays off: keeping the ROS wrapper distinct from the cognitive layer enables testing.
- Configuration is critical: the .env pattern allows easy environment switching.
## References

- Mission Agent vs Executor Architecture
- Mission Agent Package Docs
- DIMOS Framework (submodule)
- Feature branch: feature/dimos-integration
- Merge commit: f16bda8

Total Time: 5 days (Oct 5-9, 2025)
Commits: ~50 commits on the feature branch
Lines of Code: ~2,100 LOC (mission agent architecture)
Outcome: ✅ Working end-to-end system validated on the physical robot