ShadowHound Local AI Implementation - Status Report

Date: October 12, 2025
Branch: feature/local-llm-support
Status: 🟡 Blocked - one-line fix needed


Executive Summary

Goal: Fully local AI stack (LLM + embeddings + vector memory) with robot control via skills

Current Status:

  • ✅ Architecture validated and documented
  • ✅ vLLM running on Thor with Qwen2.5-Coder-7B
  • ✅ Dependencies installed (chromadb, sentence-transformers)
  • ✅ Robot skills (MyUnitreeSkills) integrated
  • ⚠️ BLOCKED: DIMOS has a one-line import bug preventing LocalSemanticMemory initialization

Blocker: src/dimos-unitree/dimos/agents/memory/chroma_impl.py line 147
Fix: Add from sentence_transformers import SentenceTransformer to imports


Architecture Decision: Why OpenAIAgent?

Requirements Analysis

Requirement                               Priority   Status
Local LLM (no cloud costs)                MUST       ✅ vLLM on Thor
Local embeddings (no cloud costs)         MUST       ✅ sentence-transformers ready
Skills/function calling (robot control)   CRITICAL   ✅ Only OpenAIAgent/ClaudeAgent/PlanningAgent
Vector memory (RAG)                       HIGH       ⚠️ Blocked by import bug

Agent Type Evaluation

Detailed comparison in docs/dimos_agent_architecture.md:

Agent Type               Local LLM      Local Memory   Skills   Robot Control
OpenAIAgent              ✅ vLLM        ✅ Explicit    ✅ YES   ✅ YES
ClaudeAgent              ❌ Cloud only  ✅ Explicit    ✅ YES   ✅ YES
PlanningAgent            ✅ vLLM        ✅ Explicit    ✅ YES   ✅ YES
HuggingFaceLocalAgent    ✅ Yes         ✅ Default     ❌ NO    ❌ NO
HuggingFaceRemoteAgent   ❌ HF API      ✅ Default     ❌ NO    ❌ NO
CTransformersGGUFAgent   ✅ GGUF        ✅ Default     ❌ NO    ❌ NO

Decision: OpenAIAgent + vLLM + LocalSemanticMemory

Why:

  • ✅ Only combination supporting ALL requirements
  • ✅ Works with vLLM's OpenAI-compatible API
  • ✅ Supports MyUnitreeSkills (Move, Reverse, SpinLeft, SpinRight, Wait)
  • ✅ Can explicitly pass LocalSemanticMemory to override the default

Why NOT HuggingFaceLocalAgent:

  • ❌ NO skills/function calling support (grep found 0 matches)
  • ❌ Cannot control robot via MyUnitreeSkills
  • ❌ Would require complete rewrite of robot control architecture
  • ✅ Only suitable for simple text generation tasks
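
Putting the pieces together, a minimal wiring sketch. The import paths for OpenAIAgent and MyUnitreeSkills match files referenced elsewhere in this report; the LocalSemanticMemory path and the constructor keywords other than agent_memory, collection_name, and model_name are assumptions based on our usage, not confirmed DIMOS API:

from dimos.agents.agent import OpenAIAgent
from dimos.agents.memory.chroma_impl import LocalSemanticMemory
from dimos.robot.unitree.unitree_skills import MyUnitreeSkills

robot = ...  # the Go2 interface from go2_ros2_sdk, constructed elsewhere

# Explicit local memory so the agent never falls back to the
# cloud-backed default (see Issue 2 below).
memory = LocalSemanticMemory(
    collection_name="shadowhound_memory",
    model_name="sentence-transformers/all-MiniLM-L6-v2",
)

agent = OpenAIAgent(
    model_name="Qwen/Qwen2.5-Coder-7B-Instruct",  # served by vLLM on Thor
    base_url="http://192.168.10.116:8000/v1",     # vLLM's OpenAI-compatible endpoint
    agent_memory=memory,
    skills=MyUnitreeSkills(robot=robot),
)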


Technical Stack

Final Architecture

┌─────────────────────────────────────────────────────┐
│ ShadowHound Mission Agent (Laptop)                  │
│                                                     │
│ OpenAIAgent                                         │
│   ├─ LLM: vLLM API (Thor)                          │ ← Qwen/Qwen2.5-Coder-7B-Instruct
│   │   └─ http://192.168.10.116:8000/v1             │    8000:8000
│   │                                                 │
│   ├─ Memory: LocalSemanticMemory                   │ ← sentence-transformers/all-MiniLM-L6-v2
│   │   ├─ Embeddings: sentence-transformers         │    ChromaDB: ~/.chroma/
│   │   └─ Vector DB: ChromaDB                       │    384 dimensions
│   │                                                 │
│   └─ Skills: MyUnitreeSkills                       │ ← Robot control functions
│       ├─ Move(x, y, yaw, duration)                 │
│       ├─ Reverse(x, y, yaw, duration)              │
│       ├─ SpinLeft(degrees)                         │
│       ├─ SpinRight(degrees)                        │
│       └─ Wait(seconds)                             │
│                                                     │
└──────────────┬──────────────────────────────────────┘
               │
               ▼ WebRTC (go2_ros2_sdk)
         Unitree Go2 Robot
         192.168.1.103

Environment Configuration

File: .env

# Agent Backend
AGENT_BACKEND=openai
OPENAI_BASE_URL=http://192.168.10.116:8000/v1
OPENAI_MODEL=Qwen/Qwen2.5-Coder-7B-Instruct
# OPENAI_API_KEY not needed for vLLM

# Embeddings (auto-detected as local due to non-OpenAI URL)
# USE_LOCAL_EMBEDDINGS=true  # Optional explicit override

# Robot
MOCK_ROBOT=false
CONN_TYPE=webrtc
GO2_IP=192.168.1.103
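
As a sanity check that these values point at a live endpoint, a minimal sketch using the openai client directly (vLLM ignores the API key, but the client library requires a non-empty string):

from openai import OpenAI

# base_url and model name taken from the .env above
client = OpenAI(base_url="http://192.168.10.116:8000/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-Coder-7B-Instruct",
    messages=[{"role": "user", "content": "Reply with OK"}],
)
print(resp.choices[0].message.content)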

Dependencies Status

Core dependencies (✅ installed):

# Python packages
chromadb>=0.4.22
langchain-chroma
langchain-openai
sentence-transformers
openai  # Client library for OpenAI-compatible APIs

# System dependencies
torch  # For sentence-transformers
transformers  # For tokenization

Verification:

# Test embeddings stack
python3 scripts/test_local_embeddings.py

# Expected: ✅ All tests pass (after DIMOS fix)

Known Issues

Issue 1: Missing Import in DIMOS ⚠️ CRITICAL BLOCKER

File: src/dimos-unitree/dimos/agents/memory/chroma_impl.py
Line: 147
Severity: CRITICAL - Blocks all local memory functionality

Error:

NameError: name 'SentenceTransformer' is not defined

Root Cause:

# Line 147 uses SentenceTransformer:
self.model = SentenceTransformer(self.model_name, device=device)

# But the import section (lines 1-40) is missing:
# from sentence_transformers import SentenceTransformer

Fix (1 line):

# Add to imports section at top of file:
from sentence_transformers import SentenceTransformer

Impact:

  • ❌ LocalSemanticMemory.__init__() fails immediately
  • ❌ Agent falls back to no memory (no RAG)
  • ❌ Cannot test local embeddings end-to-end
  • ❌ Blocks fully local AI operation

Resolution Plan:

  1. ✅ Consolidate DIMOS branches (dev + fix/webrtc)
  2. ⏳ Apply the one-line import fix to the consolidated dev branch
  3. ⏳ Test with ShadowHound (./start.sh)
  4. ⏳ Verify local embeddings work (python3 scripts/test_local_embeddings.py)
  5. ⏳ Submit PR to upstream DIMOS
  6. ⏳ Update ShadowHound submodule SHA

Issue 2: OpenAIAgent Default Memory

File: dimos/agents/agent.py line 95

Issue:

self.agent_memory = agent_memory or OpenAISemanticMemory()

If agent_memory=None, it creates OpenAISemanticMemory(), which:

  • Requires OpenAI API key
  • Calls /v1/embeddings endpoint
  • Fails with vLLM (no embeddings endpoint)

Workaround:

  • Always pass explicit agent_memory=LocalSemanticMemory()
  • Never pass None
  • Our code uses a "skip" sentinel to prevent None

Status: Working as designed, documented in our code
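
For illustration, a self-contained sketch of the `or`-default pitfall (stub classes, not DIMOS code):

# `or` replaces ANY falsy argument with the default, so passing None
# silently selects the cloud-backed memory.
class OpenAISemanticMemoryStub:
    """Stands in for the cloud default (the real class needs an API key)."""

class LocalSemanticMemoryStub:
    """Stands in for the local ChromaDB-backed memory."""

def build_memory(agent_memory=None):
    return agent_memory or OpenAISemanticMemoryStub()

print(type(build_memory(None)).__name__)                       # OpenAISemanticMemoryStub
print(type(build_memory(LocalSemanticMemoryStub())).__name__)  # LocalSemanticMemoryStub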

Issue 3: DIMOS README Misleading

File: dimos-unitree/README.md

Claim:

OpenAI API key (required for all LLMAgents due to OpenAIEmbeddings)

Reality:

  • ❌ NOT required for HuggingFaceLocalAgent (uses LocalSemanticMemory by default)
  • ❌ NOT required for any agent if you pass agent_memory=LocalSemanticMemory() explicitly
  • ✅ Only required if using OpenAISemanticMemory (which is a default, not a requirement)

Status: Documented in our architecture guide; consider submitting a PR to DIMOS


Implementation in ShadowHound

Auto-Detection Logic

File: src/shadowhound_mission_agent/shadowhound_mission_agent/mission_executor.py
Lines: 260-285

# Determine embeddings strategy
use_local_env = os.getenv("USE_LOCAL_EMBEDDINGS", "").lower()

if use_local_env in ("true", "false"):
    # Explicit override
    use_local_embeddings = use_local_env == "true"
else:
    # Auto-detect based on OPENAI_BASE_URL
    base_url = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1")
    use_local_embeddings = "api.openai.com" not in base_url

Benefits:

  • ✅ Works out of the box for typical configs
  • ✅ Supports hybrid setups (local LLM + cloud embeddings)
  • ✅ Explicit override available for edge cases
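
The same logic extracted into a standalone function, with the three scenarios spelled out (an illustrative refactor, not the shipped code):

def use_local_embeddings(base_url: str, override: str = "") -> bool:
    """Mirror of the detection logic above, for illustration."""
    if override.lower() in ("true", "false"):
        return override.lower() == "true"
    return "api.openai.com" not in base_url

assert use_local_embeddings("http://192.168.10.116:8000/v1")               # vLLM -> local
assert not use_local_embeddings("https://api.openai.com/v1")               # cloud -> OpenAI embeddings
assert use_local_embeddings("https://api.openai.com/v1", override="true")  # explicit override wins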

Graceful Fallback

Lines: 288-323

if use_local_embeddings:
    try:
        agent_memory = LocalSemanticMemory(
            collection_name="shadowhound_memory",
            model_name="sentence-transformers/all-MiniLM-L6-v2",
        )
        self.logger.info("✅ LocalSemanticMemory initialized")
    except ImportError as e:
        self.logger.warning(f"⚠ Missing dependencies ({e}): install chromadb langchain-chroma sentence-transformers")
        agent_memory = "skip"
    except Exception as e:
        self.logger.warning(f"⚠ LocalSemanticMemory failed: {str(e)}")
        if "SentenceTransformer" in str(e):
            self.logger.error("🐛 DIMOS Bug: Missing import in chroma_impl.py line 147")
        agent_memory = "skip"
else:
    agent_memory = None  # Will use OpenAISemanticMemory

Handles:

  • Missing dependencies (clear error message)
  • DIMOS bugs (identifies the specific issue)
  • Falls back gracefully (no crash)

Skills Integration

Lines: 175-215

from dimos.robot.unitree.unitree_skills import MyUnitreeSkills

self.skills = MyUnitreeSkills(robot=self.robot)

self.logger.info(f"✅ Initialized {len(list(self.skills))} robot skills")

Skills Available:

  1. Move - forward/lateral velocity commands
  2. Reverse - backward movement
  3. SpinLeft - rotate left by degrees
  4. SpinRight - rotate right by degrees
  5. Wait - pause execution

Critical: These skills call self._robot.move_vel() and self._robot.spin() to directly control the Go2 hardware via WebRTC.
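
For intuition, a hypothetical sketch of the dispatch step that DIMOS performs internally. The tool-call shape is OpenAI-style; only move_vel and spin come from our code, and the spin sign convention is an assumption:

import json

def execute_tool_call(tool_call: dict, robot) -> None:
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    if name == "Move":
        # Skill body ultimately calls the hardware interface:
        robot.move_vel(x=args["x"], y=args.get("y", 0.0),
                       yaw=args.get("yaw", 0.0), duration=args["duration"])
    elif name in ("SpinLeft", "SpinRight"):
        sign = 1.0 if name == "SpinLeft" else -1.0  # assumed: left = positive
        robot.spin(degrees=sign * args["degrees"])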


Next Steps

Phase 1: DIMOS Consolidation (In Progress)

Tasks:

  1. ✅ Document all agent types and capabilities
  2. ✅ Validate OpenAIAgent is the correct choice
  3. ⏳ Consolidate DIMOS branches (dev + fix/webrtc)
  4. ⏳ Apply the one-line import fix
  5. ⏳ Test consolidated branch with ShadowHound

Commands:

# See: docs/dimos_branch_consolidation.md
cd /workspaces/shadowhound/src/dimos-unitree

# Rebase fix/webrtc onto dev
git fetch origin
git checkout fix/webrtc-instant-commands-and-progress
git rebase origin/dev

# Apply import fix
# Edit: dimos/agents/memory/chroma_impl.py
# Add: from sentence_transformers import SentenceTransformer

# Test
cd /workspaces/shadowhound
rm -rf build/ install/ log/
./start.sh

Phase 2: Verification (Next)

Tasks:

  1. Test embeddings stack independently
  2. Test agent initialization
  3. Test skills execution
  4. Test end-to-end mission with memory

Commands:

# Test embeddings
python3 scripts/test_local_embeddings.py
# Expected: ✅ All tests pass

# Test agent
./start.sh
# Check logs for:
# - ✅ LocalSemanticMemory initialized
# - ✅ Initialized 5 robot skills
# - ✅ DIMOS OpenAI-compatible agent initialized

# Test mission
# Give command: "Move forward 2 meters"
# Expected:
# - LLM generates: Move(x=0.5, duration=4.0)
# - Robot executes movement
# - Result stored in memory

Phase 3: Upstream Contribution (After Validation)

Tasks:

  1. Submit PR to DIMOS with import fix
  2. Consider additional PRs:
       • Update README to clarify local options
       • Add examples of OpenAIAgent + vLLM
       • Document skills requirement clearly

Phase 4: Production Deployment (Final)

Tasks:

  1. Sync laptop host with latest changes
  2. Rebuild on laptop (all Go2 SDK packages)
  3. Deploy to production
  4. Monitor performance and quality
  5. Document any issues


Testing Strategy

Unit Tests

Test embeddings stack:

python3 scripts/test_local_embeddings.py

Expected output:

✅ sentence-transformers installed
✅ chromadb installed
✅ langchain-chroma installed
✅ Model loaded: sentence-transformers/all-MiniLM-L6-v2
✅ Test embeddings: 384 dimensions
✅ ChromaDB collection created
✅ Documents stored successfully
✅ Semantic search working
✅ All tests passed!
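
For reference, a condensed sketch of what the test exercises (the actual script may differ in detail): embed locally, store in ChromaDB, and search semantically.

import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
vec = model.encode("move forward two meters")
assert vec.shape == (384,)  # all-MiniLM-L6-v2 produces 384-dim embeddings

client = chromadb.Client()  # in-memory; the agent persists to ~/.chroma/
col = client.create_collection("smoke_test")
col.add(ids=["m1"], embeddings=[vec.tolist()],
        documents=["Executed: Move(x=0.5, duration=4.0)"])

hits = col.query(query_embeddings=[model.encode("go forward").tolist()],
                 n_results=1)
print(hits["documents"][0][0])  # semantic search returns the stored mission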

Integration Tests

Test agent initialization:

cd /workspaces/shadowhound
./start.sh 2>&1 | grep -E "(LocalSemanticMemory|robot skills|OpenAI-compatible)"

Expected:

✅ LocalSemanticMemory initialized
✅ Initialized 5 robot skills:
   - Move
   - Reverse
   - SpinLeft
   - SpinRight
   - Wait
✅ DIMOS OpenAI-compatible agent initialized

End-to-End Tests

Test robot control:

  1. Start system: ./start.sh
  2. Give mission: "Spin left 90 degrees"
  3. Verify:
       • LLM generates: SpinLeft(degrees=90.0)
       • Skill executes: self._robot.spin(degrees=90.0)
       • Robot physically rotates 90° left
       • Result stored in ChromaDB memory

Test memory/RAG:

  1. Give mission: "Move forward 1 meter"
  2. Execute and complete
  3. Give mission: "Do the same thing again"
  4. Verify:
       • Agent queries memory for "Move forward"
       • Finds previous execution in context
       • Generates same skill call: Move(x=0.5, duration=2.0)


Risk Assessment

Risk                           Probability   Impact     Mitigation
DIMOS rebase conflicts         MEDIUM        MEDIUM     Backup branch, manual merge
Import fix breaks other code   LOW           LOW        Only adds import, no logic change
vLLM quality insufficient      LOW           HIGH       Can switch to cloud (OpenAI API key)
Local embeddings quality low   LOW           MEDIUM     Can switch to OpenAI embeddings
Skills not working with vLLM   LOW           CRITICAL   Test thoroughly, have cloud fallback

Overall Risk: LOW - Architecture validated, fix is trivial, fallbacks available


Success Criteria

Definition of Done:

  • ✅ vLLM serving on Thor (Qwen2.5-Coder-7B)
  • ✅ OpenAIAgent initialized with vLLM endpoint
  • ✅ LocalSemanticMemory initialized successfully
  • ✅ MyUnitreeSkills registered with agent
  • ✅ End-to-end mission execution works
  • ✅ Memory/RAG provides context in multi-turn missions
  • ✅ Robot responds to natural language commands
  • ✅ No cloud API calls (verified in logs)

Performance Targets:

  • LLM latency: < 2s for simple commands
  • Embeddings latency: < 100ms per query
  • Memory search: < 50ms
  • Skills execution: Depends on command (1-10s typical)
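
A quick way to spot-check the embeddings target on the actual hardware (a sketch; numbers depend on CPU/GPU and warm-up):

import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
model.encode("warm-up")  # first call pays model-load overhead

t0 = time.perf_counter()
model.encode("spin left ninety degrees")
print(f"embedding latency: {(time.perf_counter() - t0) * 1000:.1f} ms")  # target: < 100 ms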

Documentation

Created/Updated:

  • ✅ docs/dimos_agent_architecture.md - comprehensive agent comparison
  • ✅ docs/local_ai_implementation_status.md - this document
  • ✅ docs/dimos_branch_consolidation.md - branch merge strategy
  • ✅ docs/dimos_development_policy.md - submodule workflow
  • ✅ docs/local_llm_memory_roadmap.md - original roadmap
  • ✅ scripts/test_local_embeddings.py - test harness

References:

  • Architecture: docs/dimos_agent_architecture.md
  • Implementation: src/shadowhound_mission_agent/shadowhound_mission_agent/mission_executor.py
  • Testing: scripts/test_local_embeddings.py


Timeline

Estimated Timeline:

  • ✅ Week 1: Architecture investigation and validation (COMPLETE)
  • ⏳ Week 2: DIMOS consolidation and fix (IN PROGRESS)
      - Days 1-2: Branch consolidation
      - Day 3: Apply import fix
      - Days 4-5: Testing and verification
  • 📅 Week 3: Production deployment
      - Days 1-2: Laptop sync and rebuild
      - Days 3-4: End-to-end testing
      - Day 5: Production deployment

Current Status: End of Week 1, beginning of Week 2


Contact and Support

Primary Developer: Daniel Martinez
Repository: https://github.com/danmartinez78/shadowhound
Branch: feature/local-llm-support
DIMOS Fork: https://github.com/danmartinez78/dimos-unitree
DIMOS Branches: dev (clean) + fix/webrtc-instant-commands-and-progress (ShadowHound)

Key Documents:

  1. This status report
  2. docs/dimos_agent_architecture.md - agent type details
  3. docs/dimos_branch_consolidation.md - consolidation plan
  4. docs/dimos_development_policy.md - submodule policy


Last Updated: October 12, 2025
Next Review: After DIMOS branch consolidation