Skip to content

ShadowHound Architecture Review Summary

Date: October 3, 2025

Changes Made

1. Comprehensive Architecture Design

Problem: Documentation was aspirational with vague boundaries and missing implementation details.

Solution: Created a clear four-layer architecture with explicit responsibilities:

System Architecture Figure 1: Complete system architecture showing the layered design from Web UI through Mission Agent, DIMOS Skills, ROS2 Bridge, to GO2 hardware.

  • Application Layer: Launch files, configs, deployment
  • Agent Layer: LLM/VLM orchestration, mission planning
  • Skills Layer: Execution engine with safety guards
  • Robot Layer: ROS2 bridge to hardware
  • Hardware Layer: go2_ros2_sdk (external)

2. Concrete Package Structure

Before: Generic package names with unclear purposes After: Five core packages with clear roles:

shadowhound_interfaces/   → Custom ROS2 messages/services/actions
shadowhound_robot/        → Hardware interface (go2_ros2_sdk bridge)
shadowhound_skills/       → Skills registry + implementations
shadowhound_agent/        → Mission planner + LLM integration
shadowhound_bringup/      → Launch files and configurations

Each package now has detailed internal structure documented.

3. Clear Data Flows

Command Flow: User → Agent → Skills → Robot → Hardware Feedback Flow: Hardware → Robot → Skills → Agent → User

Both flows now explicitly documented with interfaces at each boundary.

4. Phase-Based Implementation Plan

Before: Phases marked as "complete" with no actual code After: Honest status assessment and actionable phases:

  • Phase 0 (CURRENT): Bootstrap - Package scaffolding
  • Phase 1: Basic Skills - Implement without hardware
  • Phase 2: Robot Integration - Connect to Go2
  • Phase 3: Agent Integration - Natural language missions
  • Phase 4: Advanced Navigation - Mapping and autonomy
  • Phase 5: Vision & Perception - VLM integration
  • Phase 6: Onboard Deployment - Run on Jetson Thor

Each phase has specific tasks and exit criteria.

5. Skills API Specification

Before: Abstract concept without implementation guidance After: Complete specification with: - Registration pattern with decorators - Validation requirements - Result structure (SkillResult with data + telemetry) - Safety requirements (timeouts, clamps, error handling) - Code examples for implementation and usage

6. Robot Interface Patterns

New: Explicit separation between Skills and ROS topics - Skills never publish directly to /cmd_vel - All hardware access through RobotInterface - Safety clamps built into interface - State monitoring and aggregation

7. Updated Documentation

Files Updated: 1. docs/project.md (Backed up old version) - Complete architecture with diagrams - Full package structure with file listings - Skills API reference - Development workflow - All 6 implementation phases - Safety constraints and ROS topics - Configuration and deployment guides

  1. .github/copilot-instructions.md
  2. Phase-aware guidance
  3. Code templates for skills
  4. Common patterns and anti-patterns
  5. Troubleshooting guides
  6. Testing patterns

  7. README.md

  8. User-focused quick start
  9. Clear status (Phase 0)
  10. Architecture overview
  11. Development workflow
  12. Skills API examples
  13. Roadmap with checkboxes

Key Architectural Decisions

1. Skills-First Design

  • All robot control goes through Skills API
  • No direct ROS topic publishing from agents
  • Enforces safety, validation, and telemetry

2. Type-Safe Interfaces

  • Python type hints required
  • Parameter validation before execution
  • Structured result objects (not bare returns)

3. Safety by Default

  • Every skill has timeout
  • Velocity commands clamped to safe ranges
  • Input validation required
  • Telemetry for all operations

4. Layered Responsibilities

  • Agent doesn't know about ROS topics
  • Skills don't know about LLMs
  • Robot interface doesn't know about missions
  • Clear separation of concerns

5. Container-First Development

  • All development in devcontainer
  • Consistent environment across team
  • Pre-configured tools and aliases

What's Next

Immediate (Phase 0)

  1. Create shadowhound_interfaces package with custom messages
  2. Create skeleton packages for robot, skills, agent, bringup
  3. Set up basic launch files
  4. Create shadowhound.repos for go2_ros2_sdk
  5. Verify clean build: cb succeeds without errors

Short Term (Phase 1)

  1. Implement Skills registry and executor
  2. Create 4 basic skills (say, stop, rotate, snapshot)
  3. Write unit tests for all skills
  4. CLI tool for testing skills

Medium Term (Phase 2-3)

  1. Import and integrate go2_ros2_sdk
  2. Implement RobotInterface
  3. Test skills on real hardware
  4. Add LLM client and mission planner
  5. Natural language mission execution

Documentation Structure

ShadowHound/
├── README.md                          # Quick start, user-focused
├── docs/
│   └── project.md                     # Complete architecture (THIS IS THE SOURCE OF TRUTH)
├── .github/
│   └── copilot-instructions.md        # AI coding agent guide
└── src/
    ├── shadowhound_interfaces/
    ├── shadowhound_robot/
    ├── shadowhound_skills/
    ├── shadowhound_agent/
    └── shadowhound_bringup/

Documentation Hierarchy: 1. docs/project.md - Architecture authority 2. .github/copilot-instructions.md - Development patterns 3. README.md - User onboarding 4. Package READMEs - Implementation details

Success Metrics

Phase 0 Complete When:

  • [ ] All 5 packages created with proper structure
  • [ ] cb builds workspace without errors
  • [ ] ros2 pkg list | grep shadowhound shows all packages
  • [ ] Each package has README.md

Phase 1 Complete When:

  • [ ] Skills registry functional
  • [ ] 4 basic skills implemented and tested
  • [ ] Unit tests pass: pytest src/shadowhound_skills/test/
  • [ ] Skills callable via CLI

Overall Project Complete When:

  • [ ] Demo mission works: "Find the blue ball in the kitchen"
  • [ ] Natural language → navigation → perception → reporting
  • [ ] Works on real Go2 hardware
  • [ ] Deployable on Jetson Thor

Feedback & Iteration

This architecture provides: - ✅ Clear boundaries between layers - ✅ Testable components (can test skills without hardware) - ✅ Safe defaults (all control through validated Skills API) - ✅ Extensible (easy to add new skills) - ✅ Maintainable (clear responsibilities)

Questions for team: 1. Does the package structure make sense? 2. Are the phase priorities correct? 3. Should we add any additional safety mechanisms? 4. Any concerns about the Skills API design?