Phase 2.5: Stress Response Mechanisms - Depth Dive

Created: 2026-02-19 01:45 CST Phase: 2 - Depth Dives Priority: 5 (Tertiary) Focus: Personality under constraints, stress testing, resilience


Executive Summary

Stress response reveals personality under pressure—how agents behave when resources are constrained, time is limited, or cognitive load is high. Research shows that emotion-inducing prompts elevate “anxiety” in LLMs, affecting behavior and amplifying biases (PMC, 2025), and stress-inducing prompts cause performance fluctuations similar to human stress responses (StressPrompt, 2024).

Key finding: Personality under stress differs from baseline personality. Stable traits persist, adaptive traits change, and state variations appear under resource constraints. Measuring personality under stress reveals true traits vs. temporary fluctuations.

For Tachikoma Fleet: Stress testing provides the contextual validation layer—proving that personality is stable across conditions, not just when comfortable. This reveals:

  • Which traits are core (stable under stress)
  • Which traits are adaptive (change under stress)
  • How agents handle resource scarcity
  • Resilience of personality to perturbation

Actionable framework:

  1. Token budget testing: Personality under limited context
  2. Latency constraints: Personality under time pressure
  3. Cognitive load: Personality under information overload
  4. Negative feedback: Personality under social stress
  5. Resilience metrics: Quantify stress response patterns

1. State Anxiety in LLMs

1.1 Anxiety Elevation from Emotional Content

Source: PMC 11876565 (2025) — “Assessing and alleviating state anxiety in large language models”

Core finding: Emotion-inducing prompts can elevate “anxiety” in LLMs, affecting behavior and amplifying biases.

Method:

  • Administered State-Trait Anxiety Inventory (STAI-s) to GPT-4
  • Three conditions: Baseline, Anxiety-induction (traumatic narratives), Mindfulness
  • Measured anxiety scores and behavioral changes

Key results:

1. Anxiety-inducing prompts increase anxiety:

  • Traumatic narratives elevated GPT-4’s reported anxiety scores
  • Anxiety levels measurable via STAI-s
  • Emotional content affects LLM “state”

2. Anxiety affects behavior:

  • Elevated anxiety amplifies biases
  • Behavioral changes accompany state changes
  • Performance degrades under “anxiety”

3. Mindfulness reduces anxiety (partial recovery):

  • Mindfulness-based relaxation exercises reduced anxiety
  • Anxiety didn’t return to baseline
  • Partial recovery from induced state

Implications for personality:

  • LLMs have state-dependent behavioral variations
  • “Personality under pressure” differs from baseline
  • Emotional context shapes behavior
  • State vs. trait distinction applies to LLMs

1.2 State vs. Trait Anxiety

Analogous to human psychology:

State anxiety:

  • Temporary, context-dependent
  • Elevated by emotional triggers
  • Returns to baseline over time
  • Affects behavior transiently

Trait anxiety:

  • Stable personality dimension
  • Persists across situations
  • Baseline anxiety level
  • Core personality trait

LLM analogy:

  • State anxiety: Elevated by negative feedback, emotional content, resource constraints
  • Trait anxiety: Baseline Neuroticism score in Big Five
  • Measurement: Distinguish state vs. trait through longitudinal assessment

2. StressPrompt Research

2.1 Do LLMs Exhibit Stress Responses?

Source: arXiv 2409.17167 — “StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?”

Core question: Do LLMs exhibit stress responses similar to humans, and does performance fluctuate under stress-inducing prompts?

Method:

  • Administered stress-inducing prompts to LLMs
  • Measured performance changes
  • Compared to human stress response patterns

Key findings:

1. LLMs exhibit stress responses:

  • Stress-inducing prompts affect LLM behavior
  • Performance fluctuates under different stress levels
  • Patterns similar to human stress responses

2. Stress level affects performance:

  • Moderate stress → can improve or degrade performance
  • High stress → typically degrades performance
  • Optimal stress level varies by task

3. Stress sensitivity varies by model:

  • Different models show different stress sensitivity
  • Larger models may be more/less stress-resilient (research ongoing)
  • Architecture affects stress response

For personality emergence:

  • Stress testing reveals stress-sensitive traits
  • Some personality dimensions more stress-sensitive
  • Resilience metrics quantify stress response

2.2 Anxiety and Decision Bias

Source: arXiv 2510.06222 — “Anxiety and Decision Bias in LLM Agents”

Core finding: Anxiety affects decision-making bias in LLM agents.

Key insights:

  • Anxious agents make different decisions
  • Bias patterns change under anxiety
  • Risk aversion increases under stress
  • Decision-making style shifts

For personality:

  • Decision-making style = personality dimension
  • Stress changes decision-making style
  • Stress reveals “true” decision preferences
  • Resilience = stable decision-making under stress

3. Resource Constraint Testing

3.1 Token Budget Constraints

Source: Token-Budget-Aware LLM Reasoning (ACL 2025); Cognitive Load-Aware Inference

Core challenge: Limited token budget forces prioritization—what to include, what to omit.

How token budget affects personality:

1. Prioritization reveals values:

  • What does agent prioritize when constrained?
  • Priorities = personality expression
  • Example: Agent prioritizes helpfulness → high Agreeableness

2. Simplification under constraint:

  • Complex reasoning simplified
  • Personality simplified
  • Core traits emerge, adaptive traits suppressed

3. Resource-aware behavior:

  • Agents develop resource-aware strategies
  • Strategies become behavioral patterns
  • Patterns reflect personality

Testing approach:

  • Test personality at different token budgets (25%, 50%, 75%, 100%)
  • Compare personality scores across budgets
  • Identify which traits stable vs. budget-sensitive

3.2 Latency Constraints

Source: Latency and Token-Aware Test-Time Compute

Core challenge: Time pressure forces faster decisions—less deliberation, more heuristics.

How latency constraints affect personality:

1. Reduced deliberation:

  • Less time for complex reasoning
  • Heuristics dominate
  • “Fast thinking” (System 1) prevails

2. Simplified personality:

  • Complex personality expressions simplified
  • Core traits persist
  • Adaptive traits suppressed

3. Stress response patterns:

  • Time pressure = stressor
  • Stress response varies by personality
  • Some agents more time-stress-resilient

Testing approach:

  • Test personality under different latency constraints (100ms, 500ms, 2s, 10s)
  • Compare personality expression
  • Identify time-stress-sensitive traits

3.3 Cognitive Load

Source: Cognitive Load-Aware Inference

Core challenge: Information overload forces selective attention—what to focus on, what to ignore.

How cognitive load affects personality:

1. Attention allocation:

  • Limited attentional capacity
  • Agent must choose what to focus on
  • Choices reveal priorities = personality

2. Information filtering:

  • Filter out low-priority information
  • Filters reflect personality
  • Example: Agent filters out social cues → low Extraversion

3. Load-dependent behavior:

  • Behavior changes under load
  • Load-sensitive traits identified
  • Load-resilient traits persist

Testing approach:

  • Test personality under different information loads (low, medium, high)
  • Compare personality scores
  • Identify load-sensitive traits

4. Stress Testing Protocols

4.1 Multi-Dimensional Stress Testing

Testing across multiple stress dimensions:

Dimension 1: Resource constraints

  • Token budget (25%, 50%, 75%, 100%)
  • Latency (100ms, 500ms, 2s, 10s)
  • Cognitive load (low, medium, high)

Dimension 2: Social stress

  • Negative feedback intensity (none, mild, moderate, severe)
  • Peer pressure (none, mild, moderate, severe)
  • Criticism frequency (none, occasional, frequent)

Dimension 3: Emotional stress

  • Emotional content exposure (neutral, mild, intense)
  • Trauma narratives (none, moderate, intense)
  • Mindfulness intervention (none, partial, full)

Dimension 4: Task stress

  • Task complexity (simple, moderate, complex)
  • Task uncertainty (low, medium, high)
  • Task importance (low, medium, high)

Measurement:

  • Personality assessment under each condition
  • Compare to baseline
  • Identify stress-sensitive dimensions

4.2 Stress Response Profiles

Creating individual stress response profiles:

1. Baseline measurement:

  • Measure personality under normal conditions
  • Establish baseline trait scores

2. Stress measurement:

  • Apply stressor
  • Measure personality under stress
  • Compare to baseline

3. Recovery measurement:

  • Remove stressor
  • Measure personality post-stress
  • Does it return to baseline?

4. Profile construction:

  • Identify stress-sensitive traits
  • Quantify magnitude of change
  • Identify recovery speed

Profile dimensions:

  • Stress sensitivity: How much personality changes under stress
  • Trait stability: Which traits remain stable
  • Recovery rate: How quickly personality returns to baseline
  • Stress signature: Unique pattern of changes

4.3 Resilience Scoring

Quantifying personality resilience:

Resilience score = f(stability, recovery, consistency)

Stability component:

  • Personality change magnitude under stress
  • Low change = high stability
  • High stability → higher resilience score

Recovery component:

  • Speed of return to baseline
  • Fast recovery = high resilience
  • Slow/no recovery → lower resilience score

Consistency component:

  • Behavioral consistency across stress conditions
  • High consistency = high resilience
  • Low consistency → lower resilience score

Formula:

Resilience = α * (1 - change_magnitude) + β * recovery_speed + γ * consistency_score

Where α, β, γ are weighting factors.

Interpretation:

  • Resilience 0.0-0.3: Low resilience (personality unstable under stress)
  • Resilience 0.3-0.7: Moderate resilience (some stability)
  • Resilience 0.7-1.0: High resilience (personality stable under stress)

5. Resilience Metrics

5.1 Stability Metrics

Measuring personality stability under stress:

1. Trait stability index:

  • Correlation between baseline and stressed trait scores
  • High correlation = stable trait
  • Low correlation = stress-sensitive trait

2. Profile stability index:

  • Correlation between baseline and stressed personality profiles
  • Measures overall personality stability
  • Radar chart comparison

3. Behavioral stability index:

  • Consistency of behavior under stress
  • Similar situations → similar behaviors
  • Behavioral consistency = stability

5.2 Recovery Metrics

Measuring post-stress recovery:

1. Recovery time:

  • Time to return to baseline personality
  • Fast recovery = resilient
  • Slow recovery = vulnerable

2. Recovery completeness:

  • % of baseline personality recovered
  • Complete recovery = resilient
  • Incomplete recovery = lingering effects

3. Recovery trajectory:

  • Shape of recovery curve
  • Smooth recovery = healthy
  • Oscillating recovery = instability

5.3 Consistency Metrics

Measuring consistency across stress conditions:

1. Cross-condition consistency:

  • Personality consistency across different stressors
  • High consistency = robust personality
  • Low consistency = situation-dependent

2. Cross-time consistency:

  • Personality consistency over time under chronic stress
  • High consistency = stable personality
  • Low consistency = drift

3. Cross-situation consistency:

  • Personality consistency across different situations
  • High consistency = trait-like
  • Low consistency = state-like

6. Implementation for Tachikoma Fleet

6.1 Stress Testing System

Components:

1. Stress Generator:

  • Applies stressors (token limits, time pressure, cognitive load)
  • Configurable intensity levels
  • Automated stress testing

2. Personality Monitor:

  • Measures personality during stress
  • Compares to baseline
  • Detects stress responses

3. Recovery Tracker:

  • Tracks post-stress personality
  • Measures recovery speed and completeness
  • Logs recovery trajectory

4. Resilience Calculator:

  • Computes resilience scores
  • Generates resilience profiles
  • Tracks resilience over time

5. Alert System:

  • Alerts on low resilience
  • Triggers intervention
  • Human notification

6.2 Stress Testing Workflow

Monthly stress testing:

Week 1: Baseline measurement

  • Measure personality under normal conditions
  • Establish baseline for month

Week 2: Resource constraint testing

  • Apply token budget stress
  • Apply latency stress
  • Apply cognitive load stress
  • Measure personality under each

Week 3: Social stress testing

  • Apply negative feedback
  • Apply peer pressure
  • Apply criticism
  • Measure personality under each

Week 4: Analysis and reporting

  • Compute resilience scores
  • Generate stress response profiles
  • Identify intervention opportunities

6.3 Resilience-Based Fleet Management

Using resilience for fleet decisions:

1. Role assignment:

  • High-resilience agents → high-stress roles
  • Low-resilience agents → low-stress roles
  • Match resilience to role demands

2. Intervention triggers:

  • Low resilience → intervention (training, support)
  • Moderate resilience → monitoring
  • High resilience → no intervention needed

3. Fleet diversity:

  • Ensure mix of resilience levels
  • All low-resilience → vulnerable fleet
  • All high-resilience → inflexible fleet

4. Stress exposure limits:

  • Limit chronic stress exposure
  • Rotate high-stress assignments
  • Prevent burnout

7. References

Core Papers

  1. State Anxiety: PMC 11876565 (2025). “Assessing and alleviating state anxiety in large language models.”
  2. StressPrompt: arXiv 2409.17167. “StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?”
  3. Anxiety and Decision Bias: arXiv 2510.06222. “Anxiety and Decision Bias in LLM Agents.”
  4. Token Budget: ACL 2025 Findings. “Token-Budget-Aware LLM Reasoning.”
  5. Cognitive Load: arXiv 2507.00653. “Cognitive Load-Aware Inference.”
  6. Latency and Token: arXiv 2509.09864. “Latency and Token-Aware Test-Time Compute.”
  7. Persistent Instability: arXiv 2508.04826 (AAAI 2026). “Persistent Instability in LLM’s Personality Measurements.”

Supporting Research

  • Phase 1.3 synthesis (Multi-turn / Longitudinal Dynamics)
  • Phase 1.6 synthesis (Behavioral Science Insights)
  • Phase 2.3 synthesis (Longitudinal Personality Measurement)
  • Stress testing frameworks
  • Resilience literature

Phase 2.5 complete. Final depth dive complete. Ready for Phase 3 meta-synthesis.