ShadowHound Local AI Implementation - Status Report¶
Date: October 12, 2025
Branch: feature/local-llm-support
Status: 🟡 Blocked - Single 1-line fix needed
Executive Summary¶
Goal: Fully local AI stack (LLM + embeddings + vector memory) with robot control via skills
Current Status:
- ✅ Architecture validated and documented
- ✅ vLLM running on Thor with Qwen2.5-Coder-7B
- ✅ Dependencies installed (chromadb, sentence-transformers)
- ✅ Robot skills (MyUnitreeSkills) integrated
- ⚠️ BLOCKED: DIMOS has a 1-line import bug preventing LocalSemanticMemory initialization
Blocker: src/dimos-unitree/dimos/agents/memory/chroma_impl.py line 147
Fix: Add from sentence_transformers import SentenceTransformer to imports
Architecture Decision: Why OpenAIAgent?¶
Requirements Analysis¶
| Requirement | Priority | Status |
|---|---|---|
| Local LLM (no cloud costs) | MUST | ✅ vLLM on Thor |
| Local embeddings (no cloud costs) | MUST | ✅ sentence-transformers ready |
| Skills/function calling (robot control) | CRITICAL | ✅ Only OpenAIAgent/ClaudeAgent/PlanningAgent |
| Vector memory (RAG) | HIGH | ⚠️ Blocked by import bug |
Agent Type Evaluation¶
Detailed comparison in docs/dimos_agent_architecture.md:
| Agent Type | Local LLM | Local Memory | Skills | Robot Control |
|---|---|---|---|---|
| OpenAIAgent | ✅ vLLM | ✅ Explicit | ✅ YES | ✅ YES |
| ClaudeAgent | ❌ Cloud only | ✅ Explicit | ✅ YES | ✅ YES |
| PlanningAgent | ✅ vLLM | ✅ Explicit | ✅ YES | ✅ YES |
| HuggingFaceLocalAgent | ✅ Yes | ✅ Default | ❌ NO | ❌ NO |
| HuggingFaceRemoteAgent | ❌ HF API | ✅ Default | ❌ NO | ❌ NO |
| CTransformersGGUFAgent | ✅ GGUF | ✅ Default | ❌ NO | ❌ NO |
Decision: OpenAIAgent + vLLM + LocalSemanticMemory
Why:
- ✅ Only combination supporting ALL requirements
- ✅ Works with vLLM's OpenAI-compatible API
- ✅ Supports MyUnitreeSkills (Move, Reverse, SpinLeft, SpinRight, Wait)
- ✅ Can explicitly pass LocalSemanticMemory to override the default
Why NOT HuggingFaceLocalAgent:
- ❌ NO skills/function calling support (grep found 0 matches)
- ❌ Cannot control robot via MyUnitreeSkills
- ❌ Would require complete rewrite of robot control architecture
- ✅ Only suitable for simple text generation tasks
Technical Stack¶
Final Architecture¶
┌─────────────────────────────────────────────────────┐
│ ShadowHound Mission Agent (Laptop) │
│ │
│ OpenAIAgent │
│ ├─ LLM: vLLM API (Thor) │ ← Qwen/Qwen2.5-Coder-7B-Instruct
│ │ └─ http://192.168.10.116:8000/v1 │ 8000:8000
│ │ │
│ ├─ Memory: LocalSemanticMemory │ ← sentence-transformers/all-MiniLM-L6-v2
│ │ ├─ Embeddings: sentence-transformers │ ChromaDB: ~/.chroma/
│ │ └─ Vector DB: ChromaDB │ 384 dimensions
│ │ │
│ └─ Skills: MyUnitreeSkills │ ← Robot control functions
│ ├─ Move(x, y, yaw, duration) │
│ ├─ Reverse(x, y, yaw, duration) │
│ ├─ SpinLeft(degrees) │
│ ├─ SpinRight(degrees) │
│ └─ Wait(seconds) │
│ │
└──────────────┬──────────────────────────────────────┘
│
▼ WebRTC (go2_ros2_sdk)
Unitree Go2 Robot
192.168.1.103
Environment Configuration¶
File: .env
# Agent Backend
AGENT_BACKEND=openai
OPENAI_BASE_URL=http://192.168.10.116:8000/v1
OPENAI_MODEL=Qwen/Qwen2.5-Coder-7B-Instruct
# OPENAI_API_KEY not needed for vLLM
# Embeddings (auto-detected as local due to non-OpenAI URL)
# USE_LOCAL_EMBEDDINGS=true # Optional explicit override
# Robot
MOCK_ROBOT=false
CONN_TYPE=webrtc
GO2_IP=192.168.1.103
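For reference, the wire format this configuration implies is the standard OpenAI chat-completions request. A dependency-free sketch of the JSON body POSTed to the vLLM endpoint (values mirror the .env above; no network call is made here):

```python
import json

# Build the request body an OpenAI-compatible client sends to
# POST {BASE_URL}/chat/completions. Endpoint and model come from .env.
BASE_URL = "http://192.168.10.116:8000/v1"
MODEL = "Qwen/Qwen2.5-Coder-7B-Instruct"

def chat_request(prompt: str) -> dict:
    """Build a chat-completions request body for the local vLLM server."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }

body = chat_request("Move forward 2 meters")
print(json.dumps(body, indent=2))
```

Because vLLM speaks this protocol, any OpenAI client library works unmodified once `base_url` points at Thor; no API key is validated.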
Dependencies Status¶
Core dependencies (✅ installed):
# Python packages
chromadb>=0.4.22
langchain-chroma
langchain-openai
sentence-transformers
openai # Client library for OpenAI-compatible APIs
# System dependencies
torch # For sentence-transformers
transformers # For tokenization
Verification:
# Test embeddings stack
python3 scripts/test_local_embeddings.py
# Expected: ✅ All tests pass (after DIMOS fix)
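What the embeddings stack provides, stripped to its core: semantic search is nearest-neighbor lookup by cosine similarity over vectors. A dependency-free toy illustration (the 3-dim hand-made vectors and phrases here are invented stand-ins for the 384-dim all-MiniLM-L6-v2 output):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product normalized by vector magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "memory": phrase → pretend embedding
store = {
    "move forward": [0.9, 0.1, 0.0],
    "spin left":    [0.1, 0.9, 0.0],
    "wait":         [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend embedding of "go straight ahead"
best = max(store, key=lambda k: cosine(query, store[k]))
print(best)  # → move forward
```

ChromaDB does exactly this at scale (with indexing), and sentence-transformers supplies the real vectors.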
Known Issues¶
Issue 1: Missing Import in DIMOS ⚠️ CRITICAL BLOCKER¶
File: src/dimos-unitree/dimos/agents/memory/chroma_impl.py
Line: 147
Severity: CRITICAL - Blocks all local memory functionality
Error:
NameError: name 'SentenceTransformer' is not defined
Root Cause:
# Line 147:
self.model = SentenceTransformer(self.model_name, device=device)
# But import section (lines 1-40) is MISSING:
from sentence_transformers import SentenceTransformer
Fix (1 line):
# Add to imports section at top of file:
from sentence_transformers import SentenceTransformer
Impact:
- ❌ LocalSemanticMemory.__init__() fails immediately
- ❌ Agent falls back to no memory (no RAG)
- ❌ Cannot test local embeddings end-to-end
- ❌ Blocks full local AI operation
Resolution Plan:
1. ✅ Consolidate DIMOS branches (dev + fix/webrtc)
2. ⏳ Apply 1-line import fix to consolidated dev branch
3. ⏳ Test with ShadowHound (./start.sh)
4. ⏳ Verify local embeddings work (python3 scripts/test_local_embeddings.py)
5. ⏳ Submit PR to upstream DIMOS
6. ⏳ Update ShadowHound submodule SHA
Issue 2: OpenAIAgent Default Memory¶
File: dimos/agents/agent.py line 95
Issue:
self.agent_memory = agent_memory or OpenAISemanticMemory()
If agent_memory=None, it creates OpenAISemanticMemory() which:
- Requires OpenAI API key
- Calls /v1/embeddings endpoint
- Fails with vLLM (no embeddings endpoint)
Workaround:
- Always pass explicit agent_memory=LocalSemanticMemory()
- Never pass None
- Our code uses "skip" sentinel to prevent None
Status: Working as designed, documented in our code
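The trap and the workaround in miniature (the class body here is a stand-in we wrote; only the `or`-default line mirrors DIMOS):

```python
class OpenAISemanticMemory:
    """Stand-in for the cloud-backed default memory (requires API key)."""

def resolve_memory(agent_memory):
    # Mirrors DIMOS: self.agent_memory = agent_memory or OpenAISemanticMemory()
    return agent_memory or OpenAISemanticMemory()

# Passing None silently selects the cloud default:
assert isinstance(resolve_memory(None), OpenAISemanticMemory)
# The "skip" sentinel is truthy, so it survives the `or` untouched:
assert resolve_memory("skip") == "skip"
```

This is why our fallback code assigns `agent_memory = "skip"` rather than `None`: any falsy value would silently re-enable the cloud default.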
Issue 3: DIMOS README Misleading¶
File: dimos-unitree/README.md
Claim:
OpenAI API key (required for all LLMAgents due to OpenAIEmbeddings)
Reality:
- ❌ NOT required for HuggingFaceLocalAgent (uses LocalSemanticMemory by default)
- ❌ NOT required for any agent if you pass agent_memory=LocalSemanticMemory() explicitly
- ✅ Only required if using OpenAISemanticMemory (which is a default, not a requirement)
Status: Documented in our architecture guide, consider PR to DIMOS
Implementation in ShadowHound¶
Auto-Detection Logic¶
File: src/shadowhound_mission_agent/shadowhound_mission_agent/mission_executor.py
Lines: 260-285
# Determine embeddings strategy
use_local_env = os.getenv("USE_LOCAL_EMBEDDINGS", "").lower()
if use_local_env in ("true", "false"):
    # Explicit override
    use_local_embeddings = use_local_env == "true"
else:
    # Auto-detect based on OPENAI_BASE_URL
    base_url = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1")
    use_local_embeddings = "api.openai.com" not in base_url
Benefits:
- ✅ Works out of the box for typical configs
- ✅ Supports hybrid setups (local LLM + cloud embeddings)
- ✅ Explicit override available for edge cases
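The detection logic, factored into a pure function so the cases can be unit-tested (the function name is ours; the real code runs inline in mission_executor.py and reads os.environ directly):

```python
def detect_local_embeddings(env: dict) -> bool:
    """True → use LocalSemanticMemory; False → cloud embeddings."""
    override = env.get("USE_LOCAL_EMBEDDINGS", "").lower()
    if override in ("true", "false"):
        return override == "true"  # explicit override wins
    base_url = env.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    return "api.openai.com" not in base_url  # non-OpenAI URL → local

# vLLM endpoint → local embeddings
assert detect_local_embeddings({"OPENAI_BASE_URL": "http://192.168.10.116:8000/v1"})
# No config at all → stock OpenAI URL → cloud embeddings
assert not detect_local_embeddings({})
# Hybrid setup: local LLM but cloud embeddings via explicit override
assert not detect_local_embeddings({
    "OPENAI_BASE_URL": "http://192.168.10.116:8000/v1",
    "USE_LOCAL_EMBEDDINGS": "false",
})
```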
Graceful Fallback¶
Lines: 288-323
if use_local_embeddings:
    try:
        agent_memory = LocalSemanticMemory(
            collection_name="shadowhound_memory",
            model_name="sentence-transformers/all-MiniLM-L6-v2",
        )
        self.logger.info("✅ LocalSemanticMemory initialized")
    except ImportError as e:
        self.logger.warning("⚠ Dependencies missing: chromadb langchain-chroma sentence-transformers")
        agent_memory = "skip"
    except Exception as e:
        self.logger.warning(f"⚠ LocalSemanticMemory failed: {str(e)}")
        if "SentenceTransformer" in str(e):
            self.logger.error("🐛 DIMOS Bug: Missing import in chroma_impl.py line 147")
        agent_memory = "skip"
else:
    agent_memory = None  # Will use OpenAISemanticMemory
Handles:
- Missing dependencies (clear error message)
- DIMOS bugs (identifies specific issue)
- Falls back gracefully (no crash)
Skills Integration¶
Lines: 175-215
from dimos.robot.unitree.unitree_skills import MyUnitreeSkills
self.skills = MyUnitreeSkills(robot=self.robot)
self.logger.info(f"✅ Initialized {len(list(self.skills))} robot skills")
Skills Available:
1. Move - Forward/lateral velocity commands
2. Reverse - Backward movement
3. SpinLeft - Rotate left by degrees
4. SpinRight - Rotate right by degrees
5. Wait - Pause execution
Critical: These skills call self._robot.move_vel() and self._robot.spin() to directly control the Go2 hardware via WebRTC.
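A minimal sketch of that dispatch path, using a mock in place of the hardware (MockRobot, the dispatch table, and the sign convention for spins are all ours for illustration; the real skills live in DIMOS's MyUnitreeSkills and drive the Go2 over WebRTC):

```python
class MockRobot:
    """Records calls instead of driving hardware."""
    def __init__(self):
        self.log = []

    def move_vel(self, x=0.0, y=0.0, yaw=0.0, duration=0.0):
        self.log.append(("move_vel", x, y, yaw, duration))

    def spin(self, degrees=0.0):
        self.log.append(("spin", degrees))

def execute_skill(robot, name, args):
    """Dispatch an LLM-selected skill name to the robot API (illustrative)."""
    if name == "Move":
        robot.move_vel(**args)
    elif name == "Reverse":
        robot.move_vel(x=-abs(args.get("x", 0.0)), duration=args.get("duration", 0.0))
    elif name in ("SpinLeft", "SpinRight"):
        sign = 1 if name == "SpinLeft" else -1  # assumed convention: left = positive
        robot.spin(degrees=sign * args["degrees"])
    else:
        raise ValueError(f"Unknown skill: {name}")

robot = MockRobot()
execute_skill(robot, "SpinLeft", {"degrees": 90.0})
print(robot.log)  # → [('spin', 90.0)]
```

The same mock pattern is how MOCK_ROBOT=true can exercise the full LLM-to-skill pipeline without hardware.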
Next Steps¶
Phase 1: DIMOS Consolidation (In Progress)¶
Tasks:
1. ✅ Document all agent types and capabilities
2. ✅ Validate OpenAIAgent is correct choice
3. ⏳ Consolidate DIMOS branches (dev + fix/webrtc)
4. ⏳ Apply 1-line import fix
5. ⏳ Test consolidated branch with ShadowHound
Commands:
# See: docs/dimos_branch_consolidation.md
cd /workspaces/shadowhound/src/dimos-unitree
# Rebase fix/webrtc onto dev
git fetch origin
git checkout fix/webrtc-instant-commands-and-progress
git rebase origin/dev
# Apply import fix
# Edit: dimos/agents/memory/chroma_impl.py
# Add: from sentence_transformers import SentenceTransformer
# Test
cd /workspaces/shadowhound
rm -rf build/ install/ log/
./start.sh
Phase 2: Verification (Next)¶
Tasks:
1. Test embeddings stack independently
2. Test agent initialization
3. Test skills execution
4. Test end-to-end mission with memory
Commands:
# Test embeddings
python3 scripts/test_local_embeddings.py
# Expected: ✅ All tests pass
# Test agent
./start.sh
# Check logs for:
# - ✅ LocalSemanticMemory initialized
# - ✅ Initialized 5 robot skills
# - ✅ DIMOS OpenAI-compatible agent initialized
# Test mission
# Give command: "Move forward 2 meters"
# Expected:
# - LLM generates: Move(x=0.5, duration=4.0)
# - Robot executes movement
# - Result stored in memory
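The "LLM generates: Move(x=0.5, duration=4.0)" step arrives as an OpenAI-style function/tool call. A sketch of parsing one into a skill invocation (the payload is hand-built here; live, it sits inside the /v1/chat/completions response from vLLM):

```python
import json

# Hand-built stand-in for one entry of response.choices[0].message.tool_calls
tool_call = {
    "function": {
        "name": "Move",
        # Per the OpenAI schema, arguments arrive as a JSON-encoded string
        "arguments": json.dumps({"x": 0.5, "duration": 4.0}),
    }
}

name = tool_call["function"]["name"]
args = json.loads(tool_call["function"]["arguments"])
rendered = f"{name}({', '.join(f'{k}={v}' for k, v in args.items())})"
print(rendered)  # → Move(x=0.5, duration=4.0)
```

Note the double decode: the response body is JSON, and `arguments` is itself a JSON string that must be parsed a second time before the skill can be called.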
Phase 3: Upstream Contribution (After Validation)¶
Tasks:
1. Submit PR to DIMOS with import fix
2. Consider additional PRs:
   - Update README to clarify local options
   - Add examples of OpenAIAgent + vLLM
   - Document skills requirement clearly
Phase 4: Production Deployment (Final)¶
Tasks:
1. Sync laptop host with latest changes
2. Rebuild on laptop (all Go2 SDK packages)
3. Deploy to production
4. Monitor performance and quality
5. Document any issues
Testing Strategy¶
Unit Tests¶
Test embeddings stack:
python3 scripts/test_local_embeddings.py
Expected output:
✅ sentence-transformers installed
✅ chromadb installed
✅ langchain-chroma installed
✅ Model loaded: sentence-transformers/all-MiniLM-L6-v2
✅ Test embeddings: 384 dimensions
✅ ChromaDB collection created
✅ Documents stored successfully
✅ Semantic search working
✅ All tests passed!
Integration Tests¶
Test agent initialization:
cd /workspaces/shadowhound
./start.sh 2>&1 | grep -E "(LocalSemanticMemory|robot skills|OpenAI-compatible)"
Expected:
✅ LocalSemanticMemory initialized
✅ Initialized 5 robot skills:
- Move
- Reverse
- SpinLeft
- SpinRight
- Wait
✅ DIMOS OpenAI-compatible agent initialized
End-to-End Tests¶
Test robot control:
1. Start system: ./start.sh
2. Give mission: "Spin left 90 degrees"
3. Verify:
- LLM generates: SpinLeft(degrees=90.0)
- Skill executes: self._robot.spin(degrees=90.0)
- Robot physically rotates 90° left
- Result stored in ChromaDB memory
Test memory/RAG:
1. Give mission: "Move forward 1 meter"
2. Execute and complete
3. Give mission: "Do the same thing again"
4. Verify:
- Agent queries memory for "Move forward"
- Finds previous execution in context
- Generates same skill call: Move(x=0.5, duration=2.0)
Risk Assessment¶
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| DIMOS rebase conflicts | MEDIUM | MEDIUM | Backup branch, manual merge |
| Import fix breaks other code | LOW | LOW | Only adds import, no logic change |
| vLLM quality insufficient | LOW | HIGH | Can switch to cloud (OpenAI API key) |
| Local embeddings quality low | LOW | MEDIUM | Can switch to OpenAI embeddings |
| Skills not working with vLLM | LOW | CRITICAL | Test thoroughly, have cloud fallback |
Overall Risk: LOW - Architecture validated, fix is trivial, fallbacks available
Success Criteria¶
Definition of Done:
- ✅ vLLM serving on Thor (Qwen2.5-Coder-7B)
- ✅ OpenAIAgent initialized with vLLM endpoint
- ✅ LocalSemanticMemory initialized successfully
- ✅ MyUnitreeSkills registered with agent
- ✅ End-to-end mission execution works
- ✅ Memory/RAG provides context in multi-turn missions
- ✅ Robot responds to natural language commands
- ✅ No cloud API calls (verified in logs)
Performance Targets:
- LLM latency: < 2s for simple commands
- Embeddings latency: < 100ms per query
- Memory search: < 50ms
- Skills execution: Depends on command (1-10s typical)
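One simple way to verify these budgets in practice: wrap the calls we care about (LLM request, embedding, memory search) in a timing decorator and log wall-clock time. A sketch, with a stand-in function in place of the real ChromaDB query:

```python
import time
from functools import wraps

def timed(budget_s):
    """Log each call's latency and flag it against a per-call budget."""
    def deco(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            t0 = time.perf_counter()
            result = fn(*args, **kwargs)
            elapsed = time.perf_counter() - t0
            status = "OK" if elapsed <= budget_s else "SLOW"
            print(f"{fn.__name__}: {elapsed * 1000:.1f} ms [{status}]")
            return result
        return wrapper
    return deco

@timed(budget_s=0.05)  # memory-search budget from the targets above: 50 ms
def memory_search(query):
    return [query.upper()]  # stand-in for the real ChromaDB query

memory_search("move forward")
```

Pointing the same decorator at the embedding call (100 ms budget) and the LLM request (2 s budget) gives a quick pass/fail readout in the logs without extra tooling.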
Documentation¶
Created/Updated:
- ✅ docs/dimos_agent_architecture.md - Comprehensive agent comparison
- ✅ docs/local_ai_implementation_status.md - This document
- ✅ docs/dimos_branch_consolidation.md - Branch merge strategy
- ✅ docs/dimos_development_policy.md - Submodule workflow
- ✅ docs/local_llm_memory_roadmap.md - Original roadmap
- ✅ scripts/test_local_embeddings.py - Test harness
References:
- Architecture: docs/dimos_agent_architecture.md
- Implementation: src/shadowhound_mission_agent/shadowhound_mission_agent/mission_executor.py
- Testing: scripts/test_local_embeddings.py
Timeline¶
Estimated Timeline:
- ✅ Week 1: Architecture investigation and validation (COMPLETE)
- ⏳ Week 2: DIMOS consolidation and fix (IN PROGRESS)
- Days 1-2: Branch consolidation
- Day 3: Apply import fix
- Days 4-5: Testing and verification
- 📅 Week 3: Production deployment
- Days 1-2: Laptop sync and rebuild
- Days 3-4: End-to-end testing
- Day 5: Production deployment
Current Status: End of Week 1, beginning Week 2
Contact and Support¶
Primary Developer: Daniel Martinez
Repository: https://github.com/danmartinez78/shadowhound
Branch: feature/local-llm-support
DIMOS Fork: https://github.com/danmartinez78/dimos-unitree
DIMOS Branches: dev (clean) + fix/webrtc-instant-commands-and-progress (ShadowHound)
Key Documents:
1. This status report
2. docs/dimos_agent_architecture.md - Agent type details
3. docs/dimos_branch_consolidation.md - Consolidation plan
4. docs/dimos_development_policy.md - Submodule policy
Last Updated: October 12, 2025
Next Review: After DIMOS branch consolidation