Troubleshooting Index

Purpose

Centralize troubleshooting guides to reduce mean time to recovery across hardware, software, and networking failures. This index covers startup validation, robot testing procedures, and diagnostic workflows.

Prerequisites

  • Access to telemetry logs or observability dashboards
  • Knowledge of the impacted subsystem
  • Familiarity with ROS 2 diagnostics tools

Active Troubleshooting Guides

Startup & Validation

  • Startup Validation Flow — Two-layer LLM backend validation (pre-flight checks + runtime)
  • Start script pre-flight checks (fail fast)
  • Mission agent runtime validation
  • Ollama and OpenAI backend validation
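The fail-fast pre-flight layer can be sketched as a pair of shell checks: one for required environment variables, one for backend TCP reachability. This is a minimal sketch, not the actual start script; the check names and the idea of probing the backend port are assumptions to adapt to your deployment.

```shell
#!/usr/bin/env bash
# Minimal pre-flight sketch (hypothetical helpers; adapt to your start script).
# Each check returns non-zero immediately instead of letting startup hang.

# Check that a required environment variable is set and non-empty.
check_env() {
  local name="$1"
  if [ -z "${!name:-}" ]; then
    echo "pre-flight: $name is not set" >&2
    return 1
  fi
}

# Check that a TCP endpoint accepts connections (e.g. a local LLM backend).
check_port() {
  local host="$1" port="$2"
  if ! timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "pre-flight: $host:$port unreachable" >&2
    return 1
  fi
}
```

Running these before launching the mission agent turns a silent hang into an immediate, attributable failure.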

Robot Testing

  • Quick Start: Robot Testing — Complete testing procedure with local LLM
  • GPU setup and monitoring (jtop)
  • Ollama configuration (phi4:14b)
  • End-to-end robot command testing
  • Performance validation
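Before starting a test session, it helps to confirm the tools named above are actually installed on the image. A small sketch (tool names taken from this index; adjust the list for your build):

```shell
#!/usr/bin/env bash
# Sketch: verify the testing tools referenced in this guide are present
# before starting a robot test session.

have_cmd() { command -v "$1" >/dev/null 2>&1; }

missing=0
for tool in ros2 ollama jtop; do
  if ! have_cmd "$tool"; then
    echo "missing tool: $tool" >&2
    missing=1
  fi
done
# In a real script, exit "$missing" here; this sketch only reports.
```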

Common Issues & Solutions

LLM Backend Issues

Symptom: Mission agent fails to start or hangs
Solution: See Startup Validation for pre-flight checks

Symptom: Slow or no responses from LLM
Solution: Check backend configuration in Backend Validation
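A quick reachability probe distinguishes "backend down" from "backend slow" before digging into configuration. Ollama listens on TCP 11434 by default; the host and port here are assumptions for a local setup.

```shell
#!/usr/bin/env bash
# Quick probe: is the LLM backend accepting connections at all?
# Default Ollama port is 11434; host/port are assumptions for your setup.

llm_reachable() {
  local host="${1:-127.0.0.1}" port="${2:-11434}"
  timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

if llm_reachable; then
  echo "backend up: check model load and prompt latency next"
else
  echo "backend down: start the backend before debugging slow responses"
fi
```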

Robot Connectivity Issues

Symptom: Robot not responding to commands
Solution:

  1. Verify DDS connectivity: DDS Direct Test
  2. Check WebRTC connection: WebRTC Direct Test
  3. Validate network topology: Network Topologies
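The network layer of this triage can be sketched as a short script that checks the router before the robot, so a local link failure is not misdiagnosed as a robot failure. The IPs are the ones used elsewhere in this index (GO2 robot and router); adjust to your network.

```shell
#!/usr/bin/env bash
# Connectivity triage sketch: check the router first, then the robot,
# so a local link problem is not blamed on the robot.

reachable() { ping -c 1 -W 1 "$1" >/dev/null 2>&1; }

triage() {
  local robot="${1:-192.168.10.103}" router="${2:-192.168.10.1}"
  if ! reachable "$router"; then
    echo "router unreachable: check local link / Wi-Fi first"
  elif ! reachable "$robot"; then
    echo "router ok, robot unreachable: check robot power and topology"
  else
    echo "network ok: move on to the DDS and WebRTC direct tests"
  fi
}
```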

ROS 2 Topic Issues

Symptom: Topics not visible or no data
Diagnostic Commands:

# List all topics
ros2 topic list

# Check topic info
ros2 topic info /topic_name

# Echo topic data
ros2 topic echo /topic_name

# Check DDS discovery
ros2 daemon status
ros2 daemon stop  # If needed to reset
ros2 daemon start
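DDS discovery is asynchronous, so after a daemon restart topics can take several seconds to reappear; checking once and concluding failure is a common mistake. A generic retry helper (a sketch; the `/topic_name` in the usage comment is a placeholder) makes the wait explicit:

```shell
#!/usr/bin/env bash
# Generic retry helper: poll a command instead of checking once, since
# DDS discovery after a daemon restart is asynchronous.
# Example usage (placeholder topic name):
#   retry 5 sh -c 'ros2 topic list | grep -q /topic_name'

retry() {
  local attempts="$1"; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    sleep 1
  done
  echo "retry: '$*' still failing after $attempts attempts" >&2
  return 1
}
```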

Diagnostic Workflow

1. Identify Subsystem

2. Gather Information

# Check system logs
journalctl -xe

# ROS 2 node status
ros2 node list
ros2 node info /node_name

# Network connectivity
ping 192.168.10.103  # GO2 robot
ping 192.168.10.1    # Router

# GPU status (on Thor)
jtop

3. Apply Solution

  • Follow the relevant troubleshooting guide
  • Document the resolution steps
  • Update this index if a new failure pattern is found

4. Verify Resolution

  • Test the fixed functionality
  • Monitor for recurrence
  • Update telemetry/alerting if needed

Steps

  1. Identify affected subsystem using diagnostic workflow above
  2. Gather diagnostic information (logs, topic status, network connectivity)
  3. Follow relevant troubleshooting guide from Active Guides section
  4. Verify resolution and document lessons learned
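Steps 2 and 4 are easier when the raw output is captured to a single log file that can be attached to the written-up resolution. A minimal sketch (the commands here are generic; add the `ros2` and `journalctl` calls from the workflow above as needed):

```shell
#!/usr/bin/env bash
# Sketch: capture basic diagnostics to one timestamped log so the
# resolution can be documented afterwards.

gather_diag() {
  local out="${1:-diag-$(date +%Y%m%d-%H%M%S).log}"
  {
    echo "=== diagnostics $(date -u +%FT%TZ) ==="
    uname -a
    echo "--- hostname ---"
    hostname
  } > "$out"
  echo "$out"
}
```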

Validation

  • [ ] Each troubleshooting guide tested on current build
  • [ ] Diagnostic commands validated and produce expected output
  • [ ] Resolution procedures documented with verification steps
  • [ ] Cross-links to related docs verified
