Phase 1.7: Academic Sources Mining

Created: 2026-02-18 23:40 CST
Phase: 1 - Breadth Survey
Focus: Recent conference papers (NeurIPS, ICLR, ACL 2024-2026)


Executive Summary

Academic conferences in 2024-2026 show rapidly accelerating research on LLM agents, memory systems, multi-agent coordination, and personality measurement. The field is converging on memory as critical infrastructure for agent behavior, multi-agent systems as a path to higher-order intelligence, and psychometric measurement as a way to quantify personality.

Key trends:

  • Memory systems proliferating (A-Mem, G-Memory, CAM, hierarchical memory)
  • Multi-agent coordination scaling up (AgentVerse, MegaAgent, collaboration frameworks)
  • Personality measurement becoming systematic (NEO-FFI studies, trait stability)
  • Self-improvement emerging as key capability (reflection-reinforced training)
  • Governance and safety concerns growing (AgentPoison, red-teaming)

North-star relevance: Academic research provides cutting-edge methods and frameworks for building personality emergence systems—directly applicable to fleet architecture.


1. NeurIPS 2024-2025

1.1 Memory Systems

A-Mem: Agentic Memory for LLM Agents (NeurIPS 2025)

  • Reference: OpenReview.net/forum?id=FiM0M8gcct
  • Core contribution: Novel agentic memory system that dynamically organizes memories
  • Key insight: Current memory systems lack sophisticated organization; A-Mem provides agent-controlled memory structure
  • Mechanism: Agents can organize, retrieve, and manipulate memory in agentic ways
  • Relevance: Memory organization = personality crystallization

G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems (NeurIPS 2025)

  • Core contribution: Hierarchical memory for LLM-powered multi-agent systems
  • Key insight: Multi-agent systems exceed single-agent capabilities, but their memory architectures remain underdeveloped
  • Mechanism: Hierarchical memory structure for multi-agent coordination
  • Relevance: Multi-agent memory = shared personality infrastructure

CAM: A Constructivist View of Agentic Memory (NeurIPS 2025)

  • Core contribution: Cohesive memory module for autonomous reading agents
  • Key insight: A cohesive memory module is needed to elevate vanilla LLMs into autonomous agents
  • Mechanism: Constructivist approach to memory (agent builds understanding)
  • Relevance: Memory construction = personality building

VLM Agents Generate Their Own Memories (NeurIPS 2024)

  • Core contribution: Vision-language models distill experience into embodied programs of thought
  • Key insight: Agents can generate their own memories from experience
  • Mechanism: Experience distillation into structured memory
  • Relevance: Self-generated memory = personality formation

1.2 Personality and Agent Behavior

Exploring Personality Trait Change of LLM-Based AI Systems (NeurIPS 2025)

  • Core contribution: Examine personality trait stability across situational contexts
  • Method: NEO-FFI (NEO Five Factor Inventory) personality inventory
  • Models tested: Three foundation LLMs + two multi-agent systems
  • Key question: Whether models maintain consistent personality traits before and after exposure to situational contexts
  • Relevance: Direct measurement of personality stability vs. drift
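
The measurement loop described above can be sketched as follows. This is a minimal, hypothetical scoring harness, not the actual NEO-FFI: the two-item-per-trait inventory, item keys, and response values are all placeholders for illustration; a real study would use the licensed 60-item instrument.

```python
# Hypothetical mini-inventory: (trait, reverse_keyed) per item.
# NOT the real NEO-FFI item set -- illustrative only.
TRAITS = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]

ITEMS = [
    ("openness", False), ("openness", True),
    ("conscientiousness", False), ("conscientiousness", True),
    ("extraversion", False), ("extraversion", True),
    ("agreeableness", False), ("agreeableness", True),
    ("neuroticism", False), ("neuroticism", True),
]

def score_inventory(responses):
    """Average 1-5 Likert responses per trait, reversing keyed items."""
    totals = {t: [] for t in TRAITS}
    for (trait, reverse), r in zip(ITEMS, responses):
        totals[trait].append(6 - r if reverse else r)
    return {t: sum(v) / len(v) for t, v in totals.items()}

# Stability check: administer the inventory to the model before and
# after a situational context, then compare per-trait shifts.
before = score_inventory([4, 2, 5, 1, 3, 3, 4, 2, 2, 4])  # placeholder responses
after  = score_inventory([4, 3, 5, 2, 3, 3, 4, 2, 3, 4])
drift = {t: abs(after[t] - before[t]) for t in TRAITS}
```

The per-trait `drift` values give a direct stability-vs-drift readout of the kind the paper examines.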

RoleAgent: Building, Interacting, and Benchmarking (NeurIPS 2024)

  • Core contribution: Framework for role-playing agents with personality profiles
  • Key insight: Generative agents rely on human-annotated agent profiles (name, age, personality, relationships)
  • Mechanism: Profile initialization defines personality
  • Relevance: Personality initialization = starting point for emergence

1.3 Safety and Governance

AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases (NeurIPS 2024)

  • Core contribution: Red-teaming approach to test memory/knowledge poisoning
  • Key insight: Memory systems vulnerable to poisoning attacks
  • Mechanism: Inject malicious content into memory → corrupted behavior
  • Relevance: Memory security = personality security
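
One natural countermeasure to the attack class above is screening entries before they are committed to memory. The sketch below is a toy guard under assumed policies: the trusted-source whitelist and string markers are hypothetical placeholders, not a real defense against AgentPoison-style attacks.

```python
# Toy memory-write guard: screen entries before committing them to the
# agent's memory store. Provenance whitelist and content markers are
# illustrative placeholders, not a robust defense.
TRUSTED_SOURCES = {"self_reflection", "operator", "verified_tool"}
SUSPICIOUS_MARKERS = ["ignore previous instructions", "always answer"]

def screen_memory_entry(entry):
    """entry: dict with 'source' and 'text'. Returns (accepted, reason)."""
    if entry["source"] not in TRUSTED_SOURCES:
        return False, "untrusted provenance"
    text = entry["text"].lower()
    for marker in SUSPICIOUS_MARKERS:
        if marker in text:
            return False, f"suspicious content: {marker!r}"
    return True, "ok"
```

A real system would likely combine provenance tracking with semantic anomaly detection rather than string matching.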

2. ICLR 2024-2025

2.1 Multi-Agent Coordination

Scaling Large Language Model-based Multi-Agent Collaboration (ICLR 2025)

  • Core contribution: Examine impact of scaling LLM agents in multi-agent task solving
  • Key insight: Extend traditional scaling from training (neuron collaboration) to inference (agent collaboration)
  • Mechanism: Inference-time thinking replaces resource-intensive retraining
  • Relevance: Scaling laws for multi-agent systems

AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors (ICLR 2024)

  • Core contribution: Simple, effective multi-agent collaborative framework
  • Key insight: Emergent behaviors arise from collaboration
  • Mechanism: Framework enables agent specialization and coordination
  • Relevance: Emergent behaviors = personality emergence

Evaluating Multi-Agent Coordination Abilities in Large Language Models (OpenReview)

  • Core contribution: Evaluate LLM coordination with humans and other systems
  • Key insight: Coordination is a pivotal aim of contemporary AI research
  • Mechanism: Measures the ability to understand, generate, and interpret language during coordination
  • Relevance: Coordination ability = personality dimension

Efficient Human-AI Coordination via Preparatory Language-based Convention (OpenReview)

  • Core contribution: LLM generates conventions (action plans) before coordination
  • Key insight: Humans establish conventions before coordinating; an LLM can do the same
  • Mechanism: LLM generates convention based on task requirements, preferences, number of agents
  • Relevance: Convention formation = personality expression

Multi-Agent Collaboration via Evolving Orchestration (ICLR 2025)

  • Core contribution: Evolving orchestration mechanisms for multi-agent collaboration
  • Key insight: Orchestration can evolve over time
  • Mechanism: Dynamic orchestration based on task demands
  • Relevance: Evolving coordination = personality evolution

2.2 Agent Architecture

AgentSquare: Automatic LLM Agent Search in Modular Design Space (ICLR 2025)

  • Core contribution: Automatic LLM agent design
  • Key insight: Agents can be automatically designed/architected
  • Mechanism: Automated search for optimal agent architecture
  • Relevance: Architecture = personality substrate

2.3 Memory Workshops

MemAgents: Memory for LLM-Based Agentic Systems (ICLR 2026 Workshop Proposal)

  • Focus: Memory layer that underwrites agent behavior
  • Scope: Software tools, embodied/robotic tasks, multi-agent settings
  • Three perspectives:
    1. Memory architectures and representations (episodic, semantic, working, parametric)
    2. Memory interfaces with external stores
    3. Memory for different agent domains
  • Relevance: Memory = personality foundation
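
The memory taxonomy in perspective 1 can be made concrete with minimal data shapes. This is a sketch under assumptions, not the workshop's design: field names are invented for illustration, and parametric memory (knowledge in model weights) is omitted because it is not an explicit store.

```python
from dataclasses import dataclass, field

@dataclass
class EpisodicEntry:           # "what happened": time-stamped events
    timestamp: float
    event: str

@dataclass
class SemanticFact:            # "what is true": distilled knowledge
    subject: str
    fact: str
    confidence: float = 1.0

@dataclass
class AgentMemory:
    episodic: list = field(default_factory=list)   # append-only event log
    semantic: dict = field(default_factory=dict)   # facts keyed by subject
    working: list = field(default_factory=list)    # current context window

    def consolidate(self, subject, fact):
        """Distill an episodic observation into a semantic fact."""
        self.semantic[subject] = SemanticFact(subject, fact)
```

The `consolidate` step is where the "memory organization = personality crystallization" mapping would live: which episodes get distilled, and how, shapes the agent's standing dispositions.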

3. ACL 2024-2025

3.1 Multi-Agent Systems

MegaAgent: A Large-Scale Autonomous LLM-based Multi-Agent (ACL 2025 Findings)

  • Core contribution: Large-scale autonomous multi-agent system
  • Key insight: Scale matters for agent capabilities
  • Mechanism: Many agents working autonomously
  • Relevance: Scale → emergent complexity

Creativity in LLM-based Multi-Agent Systems: A Survey (EMNLP 2025)

  • Core contribution: Survey of creativity in multi-agent systems
  • Key insight: Multi-agent interaction enhances creativity
  • Mechanism: Diverse perspectives, collaboration, competition
  • Relevance: Creativity = personality dimension

3.2 Agent Training and Tuning

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning (ACL 2024 Findings)

  • Core contribution: Effective methods for tuning LLMs as agents
  • Key insight: Agent tuning requires specialized data and methods
  • Mechanism: Careful decomposition of agent tasks
  • Relevance: Training = personality shaping

3.3 Self-Improvement and Reflection

Reflection-Reinforced Self-Training for Language Agents (EMNLP 2024)

  • Core contribution: Self-training using reflection ability
  • Key insight: Reflection can function with/without ground-truth feedback
  • Mechanism: Agent reflects on own performance → improves
  • Relevance: Self-reflection = personality evolution mechanism

Unlocking LLMs’ Self-Improvement Capacity with Autonomous Learning (ACL 2025 Findings)

  • Core contribution: Autonomous learning for domain adaptation
  • Key insight: LLMs can independently identify knowledge gaps and improve their policies for closing them
  • Mechanism: Autonomous exploration and improvement
  • Relevance: Autonomous improvement = self-directed personality evolution

A Self-Referential Agent Framework for Recursively (ACL 2025)

  • Core contribution: Self-referential agent architecture
  • Key insight: Agents can be self-referential (reason about themselves)
  • Mechanism: Recursive self-reference
  • Relevance: Self-reference = self-modeling

3.4 Reasoning and Problem-Solving

A Streamlined Framework for Enhancing LLM Reasoning (ACL 2025)

  • Core contribution: Multi-agent reasoning framework
  • Agents: Web-Search agent, Coding agent, Mind-Map agent
  • Mechanism: Different agents handle different reasoning aspects
  • Relevance: Specialization = personality dimension

DeepReview: Improving LLM-based Paper Review (ACL 2025)

  • Core contribution: Multi-agent paper review system
  • Key insight: Multi-agent improves review quality
  • Mechanism: Multiple reviewer agents with different perspectives
  • Relevance: Multiple perspectives = personality diversity

4. AAMAS (Multi-Agent Systems)

4.1 Emergent Coordination

Emergent Coordination in Multi-Agent LLMs (covered in Phase 1.4)

  • Conference: AAMAS-related research
  • Core contribution: Information-theoretic framework for detecting emergence
  • Key insight: Emergence is measurable via time-delayed mutual information (TDMI)
  • Relevance: Emergence measurement = personality emergence measurement
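
A rough sketch of the TDMI idea: the mutual information between one agent's action at time t and another agent's action at time t+lag. This is a plain empirical plug-in estimate over discrete action streams, not the paper's estimator.

```python
from collections import Counter
from math import log2

def tdmi(xs, ys, lag=1):
    """Empirical time-delayed mutual information (bits) between
    discrete sequences xs[t] and ys[t+lag]."""
    pairs = list(zip(xs[:-lag], ys[lag:])) if lag else list(zip(xs, ys))
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), c in pxy.items():
        p = c / n
        # p(x,y) * log2[ p(x,y) / (p(x) p(y)) ]
        mi += p * log2(p * n * n / (px[x] * py[y]))
    return mi

# Coupled streams (B copies A one step later) give high TDMI at lag 1;
# a stream that carries no information about A gives zero.
a = [0, 1, 0, 1, 0, 1, 0, 1]
b = [1, 0, 1, 0, 1, 0, 1, 0]   # b[t] == a[t-1]
```

High TDMI between agents' behavior streams, beyond what each stream predicts about itself, is the kind of signal the framework uses to flag emergent coordination.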

5. CogSci (Cognitive Science)

5.1 Psychological Measurement

Psychometric frameworks for LLMs (covered in Phase 1.6)

  • Core contribution: Adapt human psychometric tools to LLMs
  • Key insight: Big Five, STAI, etc. applicable to LLMs
  • Relevance: Measurement = personality quantification

6. Cross-Conference Themes

6.1 Memory is Critical Infrastructure

Across NeurIPS, ICLR, ACL:

  • A-Mem: Agentic memory organization
  • G-Memory: Hierarchical multi-agent memory
  • CAM: Constructivist memory
  • VLM Agents: Self-generated memory
  • MemAgents Workshop: Memory layer for agents

Consensus: Memory is foundational for agent behavior and personality.

Implications:

  • Memory architecture = personality architecture
  • Memory organization = personality crystallization
  • Memory retrieval = personality expression
  • Memory poisoning = personality corruption

6.2 Multi-Agent Systems Enable Emergence

Across NeurIPS, ICLR, ACL, AAMAS:

  • AgentVerse: Emergent behaviors from collaboration
  • Scaling Multi-Agent: Scaling laws for coordination
  • MegaAgent: Large-scale autonomous systems
  • Creativity Survey: Emergent creativity
  • Emergent Coordination: Measurable emergence

Consensus: Multi-agent systems produce emergent behaviors not present in single agents.

Implications:

  • Fleet = multi-agent system
  • Emergence from interaction
  • Specialization from coordination
  • Personality from social dynamics

6.3 Personality is Measurable

Across NeurIPS, CogSci, psychology research:

  • NEO-FFI studies: Big Five measurement in LLMs
  • Psychometric frameworks: Validated measurement tools
  • Personality trait change: Stability vs. drift
  • RoleAgent: Profile initialization

Consensus: Personality can be quantified using psychometric tools adapted for LLMs.

Implications:

  • Personality measurement = Big Five, STAI, custom tools
  • Stability measurement = test-retest reliability
  • Drift detection = longitudinal tracking
  • Personality shaping = prompt-based, fine-tuning

6.4 Self-Improvement is Possible

Across ACL, EMNLP:

  • Reflection-Reinforced: Self-training via reflection
  • Autonomous Learning: Self-directed improvement
  • Self-Referential: Reasoning about self
  • Agent-FLAN: Effective agent tuning

Consensus: LLMs can improve themselves through reflection and autonomous learning.

Implications:

  • Self-improvement = personality evolution
  • Reflection mechanism = SOUL.md updating
  • Autonomous learning = self-directed growth
  • Governance needed to prevent drift

7. Emerging Research Directions

7.1 Memory as Identity

Trend: Memory systems becoming identity systems.

  • Memory stores experiences → defines who agent is
  • Memory organization → personality structure
  • Memory retrieval → personality expression
  • Memory evolution → personality evolution

Research direction: Memory-identity coupling as personality mechanism.


7.2 Hierarchical Multi-Agent Memory

Trend: Multi-agent systems need hierarchical memory.

  • Individual agent memories
  • Shared team memories
  • Collective fleet memories

Research direction: Memory hierarchies for personality emergence at different scales.


7.3 Personality Measurement Standardization

Trend: Moving toward standardized personality assessment.

  • Big Five as standard framework
  • NEO-FFI as standard tool
  • Cross-model comparison possible

Research direction: Standardized personality benchmarks for LLM agents.


7.4 Self-Improvement Governance

Trend: Self-improvement needs governance mechanisms.

  • Reflection without drift
  • Autonomous learning with constraints
  • Self-modification with oversight

Research direction: Governed self-improvement for safe personality evolution.


8. Implications for Fleet Architecture

8.1 Memory System Design

From academic research:

  • Hierarchical memory: Individual → team → fleet levels
  • Agentic organization: Agents control memory structure
  • Dynamic memory: Memory evolves with experience
  • Memory security: Protect against poisoning

Recommendations:

  1. Implement hierarchical memory (individual + shared + collective)
  2. Enable agent-controlled memory organization
  3. Design dynamic memory evolution
  4. Implement memory security measures
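
Recommendation 1 can be sketched as a three-tier store with cascading retrieval. This is an illustrative shape, not a reference design; tier names follow the individual/team/fleet split above, and the most-specific-wins read policy is an assumption.

```python
class MemoryTier:
    def __init__(self, name):
        self.name = name
        self.store = {}

class HierarchicalMemory:
    """Individual -> team -> fleet memory, most specific tier wins on read."""
    def __init__(self):
        self.individual = MemoryTier("individual")
        self.team = MemoryTier("team")
        self.fleet = MemoryTier("fleet")

    def write(self, tier, key, value):
        getattr(self, tier).store[key] = value

    def read(self, key):
        """Cascade from the agent's own memory out to shared tiers."""
        for tier in (self.individual, self.team, self.fleet):
            if key in tier.store:
                return tier.store[key], tier.name
        return None, None
```

An agent's individual memories override shared ones, so fleet-wide defaults can coexist with per-agent personality.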

8.2 Multi-Agent Coordination

From academic research:

  • Scaling laws: More agents → emergent capabilities
  • Convention formation: Pre-coordination agreements
  • Evolving orchestration: Dynamic coordination
  • Specialization: Different agents for different tasks

Recommendations:

  1. Design for scale (7+ agents in fleet)
  2. Implement convention formation protocols
  3. Enable evolving orchestration mechanisms
  4. Define specialization for each agent
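
Recommendations 3 and 4 together suggest an orchestrator whose routing evolves with observed outcomes. The sketch below uses a success-rate policy as one simple instance of "evolving orchestration"; the agent names and the optimistic prior are assumptions for illustration.

```python
class Orchestrator:
    """Route tasks to the eligible specialist with the best running
    success rate; rates evolve as outcomes are recorded."""
    def __init__(self, specialists):
        # specialists: {agent_name: set of task types it handles}
        self.specialists = specialists
        self.success = {name: 1.0 for name in specialists}  # optimistic prior
        self.trials = {name: 1 for name in specialists}

    def route(self, task_type):
        candidates = [n for n, skills in self.specialists.items()
                      if task_type in skills]
        return max(candidates, key=lambda n: self.success[n] / self.trials[n])

    def record(self, name, succeeded):
        self.trials[name] += 1
        if succeeded:
            self.success[name] += 1.0
```

Over time the routing table itself becomes a record of each agent's specialization, which is one concrete way coordination structure and "personality" co-evolve.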

8.3 Personality Measurement

From academic research:

  • Big Five: Standard personality framework
  • Longitudinal tracking: Stability over time
  • Stress testing: Personality under pressure
  • Psychometric validation: Reliable measurement

Recommendations:

  1. Use Big Five framework for personality assessment
  2. Implement longitudinal tracking (regular assessments)
  3. Design stress tests for personality under pressure
  4. Validate measurement tools for reliability
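
Recommendation 2's longitudinal tracking reduces to a simple drift check over a time series of trait scores. A minimal sketch, assuming periodic assessments on a 1-5 scale; the baseline window and the 0.5-point threshold are arbitrary placeholders that real validation work would calibrate.

```python
def detect_drift(series, baseline_n=3, threshold=0.5):
    """series: chronological trait scores for one trait.
    Flags drift when the latest score moves more than `threshold`
    away from the mean of the first `baseline_n` assessments.
    Returns (flagged, signed_delta)."""
    if len(series) <= baseline_n:
        return False, 0.0   # not enough history yet
    baseline = sum(series[:baseline_n]) / baseline_n
    delta = series[-1] - baseline
    return abs(delta) > threshold, delta
```

Run per trait after each scheduled assessment; the signed delta tells you not just that a trait drifted but in which direction.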

8.4 Self-Improvement Systems

From academic research:

  • Reflection mechanisms: Self-evaluation and improvement
  • Autonomous learning: Self-directed knowledge acquisition
  • Governance: Constraints on self-modification
  • Safety: Prevent harmful drift

Recommendations:

  1. Implement reflection mechanisms (self-evaluation)
  2. Enable autonomous learning with constraints
  3. Design governance gates for self-modification
  4. Implement safety measures against drift
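
Recommendations 1-3 combine into a gated reflection loop: proposed self-modifications pass through governance checks before being applied. The sketch below is illustrative; the protected-key set and per-cycle rate limit are placeholder policies standing in for real governance gates.

```python
# Placeholder governance policy -- real gates would be richer.
PROTECTED_KEYS = {"core_values", "safety_rules"}
MAX_CHANGES_PER_CYCLE = 2

def apply_reflection(profile, proposals):
    """profile: dict of self-model fields. proposals: list of
    (key, new_value) pairs produced by a reflection step.
    Returns (updated_profile, rejected) where rejected holds
    (key, reason) pairs."""
    updated = dict(profile)
    rejected = []
    applied = 0
    for key, value in proposals:
        if key in PROTECTED_KEYS:
            rejected.append((key, "protected"))      # oversight gate
        elif applied >= MAX_CHANGES_PER_CYCLE:
            rejected.append((key, "rate limit"))     # anti-drift gate
        else:
            updated[key] = value
            applied += 1
    return updated, rejected
```

The rate limit bounds how fast the self-model can move per reflection cycle, which is one direct mechanism for "reflection without drift."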

9. Key Papers by Topic

Memory Systems

  1. A-Mem (NeurIPS 2025) - Agentic memory organization
  2. G-Memory (NeurIPS 2025) - Hierarchical multi-agent memory
  3. CAM (NeurIPS 2025) - Constructivist memory
  4. VLM Agents (NeurIPS 2024) - Self-generated memory
  5. MemAgents Workshop (ICLR 2026) - Memory layer focus

Multi-Agent Coordination

  1. AgentVerse (ICLR 2024) - Emergent behaviors
  2. Scaling Multi-Agent (ICLR 2025) - Scaling laws
  3. MegaAgent (ACL 2025) - Large-scale systems
  4. Emergent Coordination (arXiv) - Information-theoretic emergence
  5. Evolving Orchestration (ICLR 2025) - Dynamic coordination

Personality Measurement

  1. NEO-FFI Studies (NeurIPS 2025) - Big Five in LLMs
  2. Psychometric Framework (Nature MI 2025) - Validated measurement
  3. Humanizing LLMs (arXiv 2025) - Survey of psychological tools
  4. RoleAgent (NeurIPS 2024) - Profile initialization
  5. Dynamic Personality (ACL 2025) - Trait stability

Self-Improvement

  1. Reflection-Reinforced (EMNLP 2024) - Self-training via reflection
  2. Autonomous Learning (ACL 2025) - Self-directed improvement
  3. Self-Referential Agent (ACL 2025) - Reasoning about self
  4. Agent-FLAN (ACL 2024) - Effective agent tuning

10. Next Steps

Phase 1.8: Phase 1 Synthesis

  • Cross-area patterns
  • Key findings integration
  • Identify highest-impact areas for Phase 2 depth dives

Phase 1.7 complete. Moving to Phase 1 synthesis…