ShadowHound Local AI Implementation - Status Report

Date: October 12, 2025
Branch: feature/local-llm-support
Status: 🟡 Blocked - one-line fix needed


Executive Summary

Goal: Fully local AI stack (LLM + embeddings + vector memory) with robot control via skills

Current Status:

  • ✅ Architecture validated and documented
  • ✅ vLLM running on Thor with Qwen2.5-Coder-7B
  • ✅ Dependencies installed (chromadb, sentence-transformers)
  • ✅ Robot skills (MyUnitreeSkills) integrated
  • ⚠️ BLOCKED: DIMOS has a one-line import bug preventing LocalSemanticMemory initialization

Blocker: src/dimos-unitree/dimos/agents/memory/chroma_impl.py line 147
Fix: Add from sentence_transformers import SentenceTransformer to imports


Architecture Decision: Why OpenAIAgent?

Requirements Analysis

Requirement                               Priority   Status
Local LLM (no cloud costs)                MUST       ✅ vLLM on Thor
Local embeddings (no cloud costs)         MUST       ✅ sentence-transformers ready
Skills/function calling (robot control)   CRITICAL   ✅ Only OpenAIAgent/ClaudeAgent/PlanningAgent
Vector memory (RAG)                       HIGH       ⚠️ Blocked by import bug

Agent Type Evaluation

Detailed comparison in docs/dimos_agent_architecture.md:

Agent Type               Local LLM      Local Memory   Skills   Robot Control
OpenAIAgent              ✅ vLLM        ✅ Explicit    ✅ YES   ✅ YES
ClaudeAgent              ❌ Cloud only  ✅ Explicit    ✅ YES   ✅ YES
PlanningAgent            ✅ vLLM        ✅ Explicit    ✅ YES   ✅ YES
HuggingFaceLocalAgent    ✅ Yes         ✅ Default     ❌ NO    ❌ NO
HuggingFaceRemoteAgent   ❌ HF API      ✅ Default     ❌ NO    ❌ NO
CTransformersGGUFAgent   ✅ GGUF        ✅ Default     ❌ NO    ❌ NO

Decision: OpenAIAgent + vLLM + LocalSemanticMemory

Why:

  • ✅ Only combination supporting ALL requirements
  • ✅ Works with vLLM's OpenAI-compatible API
  • ✅ Supports MyUnitreeSkills (Move, Reverse, SpinLeft, SpinRight, Wait)
  • ✅ Can explicitly pass LocalSemanticMemory to override the default

Why NOT HuggingFaceLocalAgent:

  • ❌ NO skills/function calling support (grep found 0 matches)
  • ❌ Cannot control robot via MyUnitreeSkills
  • ❌ Would require complete rewrite of robot control architecture
  • ✅ Only suitable for simple text generation tasks
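
Putting the pieces together, a minimal wiring sketch. The import paths for OpenAIAgent and MyUnitreeSkills match files referenced elsewhere in this report; the LocalSemanticMemory path and the constructor keywords other than agent_memory, collection_name, and model_name are assumptions based on our usage, not confirmed DIMOS API:

from dimos.agents.agent import OpenAIAgent
from dimos.agents.memory.chroma_impl import LocalSemanticMemory
from dimos.robot.unitree.unitree_skills import MyUnitreeSkills

robot = ...  # the Go2 interface from go2_ros2_sdk, constructed elsewhere

# Explicit local memory so the agent never falls back to the
# cloud-backed default (see Issue 2 below).
memory = LocalSemanticMemory(
    collection_name="shadowhound_memory",
    model_name="sentence-transformers/all-MiniLM-L6-v2",
)

agent = OpenAIAgent(
    model_name="Qwen/Qwen2.5-Coder-7B-Instruct",  # served by vLLM on Thor
    base_url="http://192.168.10.116:8000/v1",     # vLLM's OpenAI-compatible endpoint
    agent_memory=memory,
    skills=MyUnitreeSkills(robot=robot),
)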


Technical Stack

Final Architecture

┌─────────────────────────────────────────────────────┐
│ ShadowHound Mission Agent (Laptop)                  │
│                                                     │
│ OpenAIAgent                                         │
│   ├─ LLM: vLLM API (Thor)                          │ ← Qwen/Qwen2.5-Coder-7B-Instruct
│   │   └─ http://192.168.10.116:8000/v1             │    8000:8000
│   │                                                 │
│   ├─ Memory: LocalSemanticMemory                   │ ← sentence-transformers/all-MiniLM-L6-v2
│   │   ├─ Embeddings: sentence-transformers         │    ChromaDB: ~/.chroma/
│   │   └─ Vector DB: ChromaDB                       │    384 dimensions
│   │                                                 │
│   └─ Skills: MyUnitreeSkills                       │ ← Robot control functions
│       ├─ Move(x, y, yaw, duration)                 │
│       ├─ Reverse(x, y, yaw, duration)              │
│       ├─ SpinLeft(degrees)                         │
│       ├─ SpinRight(degrees)                        │
│       └─ Wait(seconds)                             │
│                                                     │
└──────────────┬──────────────────────────────────────┘
               │
               ▼ WebRTC (go2_ros2_sdk)
         Unitree Go2 Robot
         192.168.1.103

Environment Configuration

File: .env

# Agent Backend
AGENT_BACKEND=openai
OPENAI_BASE_URL=http://192.168.10.116:8000/v1
OPENAI_MODEL=Qwen/Qwen2.5-Coder-7B-Instruct
# OPENAI_API_KEY not needed for vLLM

# Embeddings (auto-detected as local due to non-OpenAI URL)
# USE_LOCAL_EMBEDDINGS=true  # Optional explicit override

# Robot
MOCK_ROBOT=false
CONN_TYPE=webrtc
GO2_IP=192.168.1.103
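
As a sanity check that these values point at a live endpoint, a minimal sketch using the openai client directly (vLLM ignores the API key, but the client library requires a non-empty string):

from openai import OpenAI

# base_url and model name taken from the .env above
client = OpenAI(base_url="http://192.168.10.116:8000/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-Coder-7B-Instruct",
    messages=[{"role": "user", "content": "Reply with OK"}],
)
print(resp.choices[0].message.content)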

Dependencies Status

Core dependencies (✅ installed):

# Python packages
chromadb>=0.4.22
langchain-chroma
langchain-openai
sentence-transformers
openai  # Client library for OpenAI-compatible APIs

# System dependencies
torch  # For sentence-transformers
transformers  # For tokenization

Verification:

# Test embeddings stack
python3 scripts/test_local_embeddings.py

# Expected: ✅ All tests pass (after DIMOS fix)

Known Issues

Issue 1: Missing Import in DIMOS ⚠️ CRITICAL BLOCKER

File: src/dimos-unitree/dimos/agents/memory/chroma_impl.py
Line: 147
Severity: CRITICAL - Blocks all local memory functionality

Error:

NameError: name 'SentenceTransformer' is not defined

Root Cause:

# Line 147 uses SentenceTransformer:
self.model = SentenceTransformer(self.model_name, device=device)

# But the import section (lines 1-40) is missing:
# from sentence_transformers import SentenceTransformer

Fix (1 line):

# Add to imports section at top of file:
from sentence_transformers import SentenceTransformer

Impact:

  • ❌ LocalSemanticMemory.__init__() fails immediately
  • ❌ Agent falls back to no memory (no RAG)
  • ❌ Cannot test local embeddings end-to-end
  • ❌ Blocks fully local AI operation

Resolution Plan:

  1. ✅ Consolidate DIMOS branches (dev + fix/webrtc)
  2. ⏳ Apply the one-line import fix to the consolidated dev branch
  3. ⏳ Test with ShadowHound (./start.sh)
  4. ⏳ Verify local embeddings work (python3 scripts/test_local_embeddings.py)
  5. ⏳ Submit PR to upstream DIMOS
  6. ⏳ Update ShadowHound submodule SHA

Issue 2: OpenAIAgent Default Memory

File: dimos/agents/agent.py line 95

Issue:

self.agent_memory = agent_memory or OpenAISemanticMemory()

If agent_memory=None, it creates OpenAISemanticMemory(), which:

  • Requires OpenAI API key
  • Calls /v1/embeddings endpoint
  • Fails with vLLM (no embeddings endpoint)

Workaround:

  • Always pass explicit agent_memory=LocalSemanticMemory()
  • Never pass None
  • Our code uses a "skip" sentinel to prevent None

Status: Working as designed, documented in our code
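
For illustration, a self-contained sketch of the `or`-default pitfall (stub classes, not DIMOS code):

# `or` replaces ANY falsy argument with the default, so passing None
# silently selects the cloud-backed memory.
class OpenAISemanticMemoryStub:
    """Stands in for the cloud default (the real class needs an API key)."""

class LocalSemanticMemoryStub:
    """Stands in for the local ChromaDB-backed memory."""

def build_memory(agent_memory=None):
    return agent_memory or OpenAISemanticMemoryStub()

print(type(build_memory(None)).__name__)                       # OpenAISemanticMemoryStub
print(type(build_memory(LocalSemanticMemoryStub())).__name__)  # LocalSemanticMemoryStub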

Issue 3: DIMOS README Misleading

File: dimos-unitree/README.md

Claim:

OpenAI API key (required for all LLMAgents due to OpenAIEmbeddings)

Reality:

  • ❌ NOT required for HuggingFaceLocalAgent (uses LocalSemanticMemory by default)
  • ❌ NOT required for any agent if you pass agent_memory=LocalSemanticMemory() explicitly
  • ✅ Only required if using OpenAISemanticMemory (which is a default, not a requirement)

Status: Documented in our architecture guide; consider submitting a PR to DIMOS


Implementation in ShadowHound

Auto-Detection Logic

File: src/shadowhound_mission_agent/shadowhound_mission_agent/mission_executor.py
Lines: 260-285

# Determine embeddings strategy
use_local_env = os.getenv("USE_LOCAL_EMBEDDINGS", "").lower()

if use_local_env in ("true", "false"):
    # Explicit override
    use_local_embeddings = use_local_env == "true"
else:
    # Auto-detect based on OPENAI_BASE_URL
    base_url = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1")
    use_local_embeddings = "api.openai.com" not in base_url

Benefits:

  • ✅ Works out of the box for typical configs
  • ✅ Supports hybrid setups (local LLM + cloud embeddings)
  • ✅ Explicit override available for edge cases
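
The same logic extracted into a standalone function, with the three scenarios spelled out (an illustrative refactor, not the shipped code):

def use_local_embeddings(base_url: str, override: str = "") -> bool:
    """Mirror of the detection logic above, for illustration."""
    if override.lower() in ("true", "false"):
        return override.lower() == "true"
    return "api.openai.com" not in base_url

assert use_local_embeddings("http://192.168.10.116:8000/v1")               # vLLM -> local
assert not use_local_embeddings("https://api.openai.com/v1")               # cloud -> OpenAI embeddings
assert use_local_embeddings("https://api.openai.com/v1", override="true")  # explicit override wins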

Graceful Fallback

Lines: 288-323

if use_local_embeddings:
    try:
        agent_memory = LocalSemanticMemory(
            collection_name="shadowhound_memory",
            model_name="sentence-transformers/all-MiniLM-L6-v2",
        )
        self.logger.info("✅ LocalSemanticMemory initialized")
    except ImportError as e:
        self.logger.warning(f"⚠ Missing dependencies ({e}): install chromadb langchain-chroma sentence-transformers")
        agent_memory = "skip"
    except Exception as e:
        self.logger.warning(f"⚠ LocalSemanticMemory failed: {str(e)}")
        if "SentenceTransformer" in str(e):
            self.logger.error("🐛 DIMOS Bug: Missing import in chroma_impl.py line 147")
        agent_memory = "skip"
else:
    agent_memory = None  # Will use OpenAISemanticMemory

Handles:

  • Missing dependencies (clear error message)
  • DIMOS bugs (identifies the specific issue)
  • Falls back gracefully (no crash)

Skills Integration

Lines: 175-215

from dimos.robot.unitree.unitree_skills import MyUnitreeSkills

self.skills = MyUnitreeSkills(robot=self.robot)

self.logger.info(f"✅ Initialized {len(list(self.skills))} robot skills")

Skills Available:

  1. Move - forward/lateral velocity commands
  2. Reverse - backward movement
  3. SpinLeft - rotate left by degrees
  4. SpinRight - rotate right by degrees
  5. Wait - pause execution

Critical: These skills call self._robot.move_vel() and self._robot.spin() to directly control the Go2 hardware via WebRTC.
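
For intuition, a hypothetical sketch of the dispatch step that DIMOS performs internally. The tool-call shape is OpenAI-style; only move_vel and spin come from our code, and the spin sign convention is an assumption:

import json

def execute_tool_call(tool_call: dict, robot) -> None:
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    if name == "Move":
        # Skill body ultimately calls the hardware interface:
        robot.move_vel(x=args["x"], y=args.get("y", 0.0),
                       yaw=args.get("yaw", 0.0), duration=args["duration"])
    elif name in ("SpinLeft", "SpinRight"):
        sign = 1.0 if name == "SpinLeft" else -1.0  # assumed: left = positive
        robot.spin(degrees=sign * args["degrees"])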


Next Steps

Phase 1: DIMOS Consolidation (In Progress)

Tasks:

  1. ✅ Document all agent types and capabilities
  2. ✅ Validate OpenAIAgent is the correct choice
  3. ⏳ Consolidate DIMOS branches (dev + fix/webrtc)
  4. ⏳ Apply the one-line import fix
  5. ⏳ Test consolidated branch with ShadowHound

Commands:

# See: docs/dimos_branch_consolidation.md
cd /workspaces/shadowhound/src/dimos-unitree

# Rebase fix/webrtc onto dev
git fetch origin
git checkout fix/webrtc-instant-commands-and-progress
git rebase origin/dev

# Apply import fix
# Edit: dimos/agents/memory/chroma_impl.py
# Add: from sentence_transformers import SentenceTransformer

# Test
cd /workspaces/shadowhound
rm -rf build/ install/ log/
./start.sh

Phase 2: Verification (Next)

Tasks:

  1. Test embeddings stack independently
  2. Test agent initialization
  3. Test skills execution
  4. Test end-to-end mission with memory

Commands:

# Test embeddings
python3 scripts/test_local_embeddings.py
# Expected: ✅ All tests pass

# Test agent
./start.sh
# Check logs for:
# - ✅ LocalSemanticMemory initialized
# - ✅ Initialized 5 robot skills
# - ✅ DIMOS OpenAI-compatible agent initialized

# Test mission
# Give command: "Move forward 2 meters"
# Expected:
# - LLM generates: Move(x=0.5, duration=4.0)
# - Robot executes movement
# - Result stored in memory

Phase 3: Upstream Contribution (After Validation)

Tasks:

  1. Submit PR to DIMOS with import fix
  2. Consider additional PRs:
       • Update README to clarify local options
       • Add examples of OpenAIAgent + vLLM
       • Document skills requirement clearly

Phase 4: Production Deployment (Final)

Tasks:

  1. Sync laptop host with latest changes
  2. Rebuild on laptop (all Go2 SDK packages)
  3. Deploy to production
  4. Monitor performance and quality
  5. Document any issues


Testing Strategy

Unit Tests

Test embeddings stack:

python3 scripts/test_local_embeddings.py

Expected output:

✅ sentence-transformers installed
✅ chromadb installed
✅ langchain-chroma installed
✅ Model loaded: sentence-transformers/all-MiniLM-L6-v2
✅ Test embeddings: 384 dimensions
✅ ChromaDB collection created
✅ Documents stored successfully
✅ Semantic search working
✅ All tests passed!
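
For reference, a condensed sketch of what the test exercises (the actual script may differ in detail): embed locally, store in ChromaDB, and search semantically.

import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
vec = model.encode("move forward two meters")
assert vec.shape == (384,)  # all-MiniLM-L6-v2 produces 384-dim embeddings

client = chromadb.Client()  # in-memory; the agent persists to ~/.chroma/
col = client.create_collection("smoke_test")
col.add(ids=["m1"], embeddings=[vec.tolist()],
        documents=["Executed: Move(x=0.5, duration=4.0)"])

hits = col.query(query_embeddings=[model.encode("go forward").tolist()],
                 n_results=1)
print(hits["documents"][0][0])  # semantic search returns the stored mission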

Integration Tests

Test agent initialization:

cd /workspaces/shadowhound
./start.sh 2>&1 | grep -E "(LocalSemanticMemory|robot skills|OpenAI-compatible)"

Expected:

✅ LocalSemanticMemory initialized
✅ Initialized 5 robot skills:
   - Move
   - Reverse
   - SpinLeft
   - SpinRight
   - Wait
✅ DIMOS OpenAI-compatible agent initialized

End-to-End Tests

Test robot control:

  1. Start system: ./start.sh
  2. Give mission: "Spin left 90 degrees"
  3. Verify:
       • LLM generates: SpinLeft(degrees=90.0)
       • Skill executes: self._robot.spin(degrees=90.0)
       • Robot physically rotates 90° left
       • Result stored in ChromaDB memory

Test memory/RAG:

  1. Give mission: "Move forward 1 meter"
  2. Execute and complete
  3. Give mission: "Do the same thing again"
  4. Verify:
       • Agent queries memory for "Move forward"
       • Finds previous execution in context
       • Generates same skill call: Move(x=0.5, duration=2.0)


Risk Assessment

Risk                           Probability   Impact     Mitigation
DIMOS rebase conflicts         MEDIUM        MEDIUM     Backup branch, manual merge
Import fix breaks other code   LOW           LOW        Only adds import, no logic change
vLLM quality insufficient      LOW           HIGH       Can switch to cloud (OpenAI API key)
Local embeddings quality low   LOW           MEDIUM     Can switch to OpenAI embeddings
Skills not working with vLLM   LOW           CRITICAL   Test thoroughly, have cloud fallback

Overall Risk: LOW - Architecture validated, fix is trivial, fallbacks available


Success Criteria

Definition of Done:

  • ✅ vLLM serving on Thor (Qwen2.5-Coder-7B)
  • ✅ OpenAIAgent initialized with vLLM endpoint
  • ✅ LocalSemanticMemory initialized successfully
  • ✅ MyUnitreeSkills registered with agent
  • ✅ End-to-end mission execution works
  • ✅ Memory/RAG provides context in multi-turn missions
  • ✅ Robot responds to natural language commands
  • ✅ No cloud API calls (verified in logs)

Performance Targets:

  • LLM latency: < 2s for simple commands
  • Embeddings latency: < 100ms per query
  • Memory search: < 50ms
  • Skills execution: Depends on command (1-10s typical)
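
A quick way to spot-check the embeddings target on the actual hardware (a sketch; numbers depend on CPU/GPU and warm-up):

import time
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
model.encode("warm-up")  # first call pays model-load overhead

t0 = time.perf_counter()
model.encode("spin left ninety degrees")
print(f"embedding latency: {(time.perf_counter() - t0) * 1000:.1f} ms")  # target: < 100 ms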

Documentation

Created/Updated:

  • ✅ docs/dimos_agent_architecture.md - comprehensive agent comparison
  • ✅ docs/local_ai_implementation_status.md - this document
  • ✅ docs/dimos_branch_consolidation.md - branch merge strategy
  • ✅ docs/dimos_development_policy.md - submodule workflow
  • ✅ docs/local_llm_memory_roadmap.md - original roadmap
  • ✅ scripts/test_local_embeddings.py - test harness

References:

  • Architecture: docs/dimos_agent_architecture.md
  • Implementation: src/shadowhound_mission_agent/shadowhound_mission_agent/mission_executor.py
  • Testing: scripts/test_local_embeddings.py


Timeline

Estimated Timeline:

  • ✅ Week 1: Architecture investigation and validation (COMPLETE)
  • ⏳ Week 2: DIMOS consolidation and fix (IN PROGRESS)
      - Days 1-2: Branch consolidation
      - Day 3: Apply import fix
      - Days 4-5: Testing and verification
  • 📅 Week 3: Production deployment
      - Days 1-2: Laptop sync and rebuild
      - Days 3-4: End-to-end testing
      - Day 5: Production deployment

Current Status: End of Week 1, beginning of Week 2


Contact and Support

Primary Developer: Daniel Martinez
Repository: https://github.com/danmartinez78/shadowhound
Branch: feature/local-llm-support
DIMOS Fork: https://github.com/danmartinez78/dimos-unitree
DIMOS Branches: dev (clean) + fix/webrtc-instant-commands-and-progress (ShadowHound)

Key Documents:

  1. This status report
  2. docs/dimos_agent_architecture.md - agent type details
  3. docs/dimos_branch_consolidation.md - consolidation plan
  4. docs/dimos_development_policy.md - submodule policy


Last Updated: October 12, 2025
Next Review: After DIMOS branch consolidation