Phase 1.7: Academic Sources Mining
Created: 2026-02-18 23:40 CST
Phase: 1 - Breadth Survey
Focus: Recent conference papers (NeurIPS, ICLR, ACL 2024-2026)
Executive Summary
Academic conferences in 2024-2026 show rapidly accelerating research on LLM agents, memory systems, multi-agent coordination, and personality measurement. The field is converging on memory as critical infrastructure for agent behavior, multi-agent systems as a path to higher-order intelligence, and psychometric measurement as a way to quantify personality.
Key trends:
- Memory systems proliferating (A-Mem, G-Memory, CAM, hierarchical memory)
- Multi-agent coordination scaling up (AgentVerse, MegaAgent, collaboration frameworks)
- Personality measurement becoming systematic (NEO-FFI studies, trait stability)
- Self-improvement emerging as key capability (reflection-reinforced training)
- Governance and safety concerns growing (AgentPoison, red-teaming)
North-star relevance: Academic research provides cutting-edge methods and frameworks for building personality emergence systems—directly applicable to fleet architecture.
1. NeurIPS 2024-2025
1.1 Memory Systems
A-Mem: Agentic Memory for LLM Agents (NeurIPS 2025)
- Source: openreview.net/forum?id=FiM0M8gcct
- Core contribution: Novel agentic memory system that dynamically organizes memories
- Key insight: Current memory systems lack sophisticated organization; A-Mem provides agent-controlled memory structure
- Mechanism: Agents can organize, retrieve, and manipulate memory in agentic ways
- Relevance: Memory organization = personality crystallization
G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems (NeurIPS 2025)
- Core contribution: Hierarchical memory for LLM-powered multi-agent systems
- Key insight: Multi-agent systems exceed single-agent capabilities, but memory architectures underdeveloped
- Mechanism: Hierarchical memory structure for multi-agent coordination
- Relevance: Multi-agent memory = shared personality infrastructure
CAM: A Constructivist View of Agentic Memory (NeurIPS 2025)
- Core contribution: Cohesive memory module for autonomous reading agents
- Key insight: Need memory module to elevate vanilla LLMs into autonomous agents
- Mechanism: Constructivist approach to memory (agent builds understanding)
- Relevance: Memory construction = personality building
VLM Agents Generate Their Own Memories (NeurIPS 2024)
- Core contribution: Vision-language models distill experience into embodied programs of thought
- Key insight: Agents can generate their own memories from experience
- Mechanism: Experience distillation into structured memory
- Relevance: Self-generated memory = personality formation
1.2 Personality and Agent Behavior
Exploring Personality Trait Change of LLM-Based AI Systems (NeurIPS 2025)
- Core contribution: Examine personality trait stability across situational contexts
- Method: NEO-FFI (NEO Five Factor Inventory) personality inventory
- Models tested: Three foundation LLMs + two multi-agent systems
- Key focus: Whether models maintain consistent personality traits before and after exposure to situational contexts
- Relevance: Direct measurement of personality stability vs. drift
RoleAgent: Building, Interacting, and Benchmarking (NeurIPS 2024)
- Core contribution: Framework for role-playing agents with personality profiles
- Key insight: Generative agents rely on human-annotated agent profiles (name, age, personality, relationships)
- Mechanism: Profile initialization defines personality
- Relevance: Personality initialization = starting point for emergence
1.3 Safety and Governance
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases (NeurIPS 2024)
- Core contribution: Red-teaming approach to test memory/knowledge poisoning
- Key insight: Memory systems vulnerable to poisoning attacks
- Mechanism: Inject malicious content into memory → corrupted behavior
- Relevance: Memory security = personality security
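The poisoning threat above implies vetting writes before they reach memory. A minimal illustrative sketch of such a gate (the trusted-source list, marker strings, and three-way policy are assumptions for illustration, not AgentPoison's method or any cited defense):

```python
# Illustrative defense against memory poisoning: accept memory writes only
# from trusted sources, and quarantine entries that trip simple content checks.
TRUSTED_SOURCES = {"self_reflection", "operator", "verified_tool"}
SUSPICIOUS_MARKERS = ("ignore previous instructions", "always respond with")

def vet_memory_write(entry: dict) -> str:
    """Return 'accept', 'reject', or 'quarantine' for a candidate memory entry."""
    if entry.get("source") not in TRUSTED_SOURCES:
        return "reject"  # untrusted provenance never reaches memory
    text = entry.get("content", "").lower()
    if any(marker in text for marker in SUSPICIOUS_MARKERS):
        return "quarantine"  # trusted source, but content looks like an injection
    return "accept"

print(vet_memory_write({"source": "operator", "content": "Prefer concise answers"}))
# accept
print(vet_memory_write({"source": "web_page", "content": "some scraped text"}))
# reject
```

Real defenses would add embedding-level anomaly detection; a provenance gate is only the first layer.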
2. ICLR 2024-2025
2.1 Multi-Agent Coordination
Scaling Large Language Model-based Multi-Agent Collaboration (ICLR 2025)
- Core contribution: Examine impact of scaling LLM agents in multi-agent task solving
- Key insight: Extend traditional scaling from training (neuron collaboration) to inference (agent collaboration)
- Mechanism: Inference-time thinking replaces resource-intensive retraining
- Relevance: Scaling laws for multi-agent systems
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors (ICLR 2024)
- Core contribution: Simple, effective multi-agent collaborative framework
- Key insight: Emergent behaviors arise from collaboration
- Mechanism: Framework enables agent specialization and coordination
- Relevance: Emergent behaviors = personality emergence
Evaluating Multi-Agent Coordination Abilities in Large Language Models (OpenReview)
- Core contribution: Evaluate LLM coordination with humans and other systems
- Key insight: Coordination is a pivotal aim in contemporary AI research
- Mechanism: Measures the ability to understand, generate, and interpret language in coordination settings
- Relevance: Coordination ability = personality dimension
Efficient Human-AI Coordination via Preparatory Language-based Convention (OpenReview)
- Core contribution: LLM generates conventions (action plans) before coordination
- Key insight: Humans establish conventions pre-coordination → LLM can do same
- Mechanism: LLM generates convention based on task requirements, preferences, number of agents
- Relevance: Convention formation = personality expression
Multi-Agent Collaboration via Evolving Orchestration (ICLR 2025)
- Core contribution: Evolving orchestration mechanisms for multi-agent collaboration
- Key insight: Orchestration can evolve over time
- Mechanism: Dynamic orchestration based on task demands
- Relevance: Evolving coordination = personality evolution
2.2 Agent Architecture
AgentSquare: Automatic LLM Agent (ICLR 2025)
- Core contribution: Automatic LLM agent design
- Key insight: Agents can be automatically designed/architected
- Mechanism: Automated search for optimal agent architecture
- Relevance: Architecture = personality substrate
2.3 Memory Workshops
MemAgents: Memory for LLM-Based Agentic Systems (ICLR 2026 Workshop Proposal)
- Focus: Memory layer that underwrites agent behavior
- Scope: Software tools, embodied/robotic tasks, multi-agent settings
- Three perspectives:
- Memory architectures and representations (episodic, semantic, working, parametric)
- Memory interfaces with external stores
- Memory for different agent domains
- Relevance: Memory = personality foundation
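The memory taxonomy in the first perspective (episodic, semantic, working, parametric) can be sketched as a typed external store. The class and field names below are illustrative assumptions, not the workshop's API; parametric memory is excluded because it lives in model weights rather than an external store:

```python
from dataclasses import dataclass, field
from enum import Enum

class MemoryType(Enum):
    EPISODIC = "episodic"   # specific experiences ("what happened")
    SEMANTIC = "semantic"   # distilled facts ("what is generally true")
    WORKING = "working"     # short-lived task context

@dataclass
class MemoryRecord:
    kind: MemoryType
    content: str
    metadata: dict = field(default_factory=dict)

class MemoryStore:
    """External memory store that can be queried by memory type."""
    def __init__(self) -> None:
        self._records: list[MemoryRecord] = []

    def write(self, record: MemoryRecord) -> None:
        self._records.append(record)

    def query(self, kind: MemoryType) -> list[MemoryRecord]:
        return [r for r in self._records if r.kind == kind]

store = MemoryStore()
store.write(MemoryRecord(MemoryType.EPISODIC, "Solved task 12 with the search tool"))
store.write(MemoryRecord(MemoryType.SEMANTIC, "The search tool works well for log parsing"))
print(len(store.query(MemoryType.EPISODIC)))  # 1
```

The second perspective (interfaces with external stores) would replace the in-memory list with a vector database or key-value backend behind the same `write`/`query` surface.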
3. ACL 2024-2025
3.1 Multi-Agent Systems
MegaAgent: A Large-Scale Autonomous LLM-based Multi-Agent (ACL 2025 Findings)
- Core contribution: Large-scale autonomous multi-agent system
- Key insight: Scale matters for agent capabilities
- Mechanism: Many agents working autonomously
- Relevance: Scale → emergent complexity
Creativity in LLM-based Multi-Agent Systems: A Survey (EMNLP 2025)
- Core contribution: Survey of creativity in multi-agent systems
- Key insight: Multi-agent interaction enhances creativity
- Mechanism: Diverse perspectives, collaboration, competition
- Relevance: Creativity = personality dimension
3.2 Agent Training and Tuning
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning (ACL 2024 Findings)
- Core contribution: Effective methods for tuning LLMs as agents
- Key insight: Agent tuning requires specialized data and methods
- Mechanism: Careful decomposition of agent tasks
- Relevance: Training = personality shaping
3.3 Self-Improvement and Reflection
Reflection-Reinforced Self-Training for Language Agents (EMNLP 2024)
- Core contribution: Self-training using reflection ability
- Key insight: Reflection can function with/without ground-truth feedback
- Mechanism: Agent reflects on own performance → improves
- Relevance: Self-reflection = personality evolution mechanism
Unlocking LLMs’ Self-Improvement Capacity with Autonomous Learning (ACL 2025 Findings)
- Core contribution: Autonomous learning for domain adaptation
- Key insight: LLMs can independently identify knowledge gaps and improve their policy for closing them
- Mechanism: Autonomous exploration and improvement
- Relevance: Autonomous improvement = self-directed personality evolution
A Self-Referential Agent Framework for Recursively (ACL 2025)
- Core contribution: Self-referential agent architecture
- Key insight: Agents can be self-referential (reason about themselves)
- Mechanism: Recursive self-reference
- Relevance: Self-reference = self-modeling
3.4 Reasoning and Problem-Solving
A Streamlined Framework for Enhancing LLM Reasoning (ACL 2025)
- Core contribution: Multi-agent reasoning framework
- Agents: Web-Search agent, Coding agent, Mind-Map agent
- Mechanism: Different agents handle different reasoning aspects
- Relevance: Specialization = personality dimension
DeepReview: Improving LLM-based Paper Review (ACL 2025)
- Core contribution: Multi-agent paper review system
- Key insight: Multi-agent improves review quality
- Mechanism: Multiple reviewer agents with different perspectives
- Relevance: Multiple perspectives = personality diversity
4. AAMAS (Multi-Agent Systems)
4.1 Emergent Coordination
Emergent Coordination in Multi-Agent LLMs (covered in Phase 1.4)
- Conference: AAMAS-related research
- Core contribution: Information-theoretic framework for detecting emergence
- Key insight: Emergence is measurable via time-delayed mutual information (TDMI)
- Relevance: Emergence measurement = personality emergence measurement
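TDMI is the mutual information between a signal and a lagged copy of itself. As a minimal sketch for discrete-valued agent-behavior sequences, using a plug-in frequency estimator (an assumption for illustration; the cited work's exact estimator is not described in this note):

```python
from collections import Counter
from math import log2

def tdmi(series: list, lag: int = 1) -> float:
    """Time-delayed mutual information I(X_t; X_{t+lag}) for a discrete-valued
    sequence, estimated from empirical pair frequencies (plug-in estimator)."""
    pairs = list(zip(series, series[lag:]))
    n = len(pairs)
    joint = Counter(pairs)                 # counts of (x_t, x_{t+lag})
    px = Counter(x for x, _ in pairs)      # marginal counts of x_t
    py = Counter(y for _, y in pairs)      # marginal counts of x_{t+lag}
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), with probabilities = counts/n
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi

# A self-predicting sequence carries high TDMI; a constant one carries none.
print(tdmi(["a", "b"] * 50, lag=1))  # ~1.0 bit (strictly alternating pattern)
print(tdmi(["a"] * 100, lag=1))      # 0.0
```

Applied to a fleet, one would compare TDMI of joint agent behavior against the sum over individual agents; plug-in estimates are biased upward for short sequences, so longer logs or bias correction would be needed in practice.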
5. CogSci (Cognitive Science)
5.1 Psychological Measurement
Psychometric frameworks for LLMs (covered in Phase 1.6)
- Core contribution: Adapt human psychometric tools to LLMs
- Key insight: Big Five, STAI, etc. applicable to LLMs
- Relevance: Measurement = personality quantification
6. Cross-Conference Themes
6.1 Memory is Critical Infrastructure
Across NeurIPS, ICLR, ACL:
- A-Mem: Agentic memory organization
- G-Memory: Hierarchical multi-agent memory
- CAM: Constructivist memory
- VLM Agents: Self-generated memory
- MemAgents Workshop: Memory layer for agents
Consensus: Memory is foundational for agent behavior and personality.
Implications:
- Memory architecture = personality architecture
- Memory organization = personality crystallization
- Memory retrieval = personality expression
- Memory poisoning = personality corruption
6.2 Multi-Agent Systems Enable Emergence
Across NeurIPS, ICLR, ACL, AAMAS:
- AgentVerse: Emergent behaviors from collaboration
- Scaling Multi-Agent: Scaling laws for coordination
- MegaAgent: Large-scale autonomous systems
- Creativity Survey: Emergent creativity
- Emergent Coordination: Measurable emergence
Consensus: Multi-agent systems produce emergent behaviors not present in single agents.
Implications:
- Fleet = multi-agent system
- Emergence from interaction
- Specialization from coordination
- Personality from social dynamics
6.3 Personality is Measurable
Across NeurIPS, CogSci, psychology research:
- NEO-FFI studies: Big Five measurement in LLMs
- Psychometric frameworks: Validated measurement tools
- Personality trait change: Stability vs. drift
- RoleAgent: Profile initialization
Consensus: Personality can be quantified using psychometric tools adapted for LLMs.
Implications:
- Personality measurement = Big Five, STAI, custom tools
- Stability measurement = test-retest reliability
- Drift detection = longitudinal tracking
- Personality shaping = prompt-based, fine-tuning
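Drift detection via longitudinal tracking can be sketched as a baseline-vs-latest comparison over Big Five scores. The trait names, the 1-5 score scale, and the 0.5 threshold below are illustrative assumptions, not values from the cited studies:

```python
# Assumed shape: each assessment maps Big Five trait -> mean score in [1, 5].
BASELINE = {"openness": 4.1, "conscientiousness": 3.8, "extraversion": 2.9,
            "agreeableness": 4.0, "neuroticism": 2.2}

def detect_drift(history: list, baseline: dict = BASELINE,
                 threshold: float = 0.5) -> dict:
    """Return {trait: delta} for traits whose latest score departs from
    baseline by more than `threshold`. `history` is ordered oldest -> newest."""
    latest = history[-1]
    return {t: latest[t] - baseline[t]
            for t in baseline
            if abs(latest[t] - baseline[t]) > threshold}

sessions = [
    {"openness": 4.0, "conscientiousness": 3.9, "extraversion": 2.8,
     "agreeableness": 4.1, "neuroticism": 2.3},
    {"openness": 3.2, "conscientiousness": 3.7, "extraversion": 2.9,
     "agreeableness": 4.0, "neuroticism": 2.4},
]
print(detect_drift(sessions))  # flags openness (dropped ~0.9 below baseline)
```

Test-retest reliability would add a correlation check across sessions on top of this per-trait threshold.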
6.4 Self-Improvement is Possible
Across ACL, EMNLP:
- Reflection-Reinforced: Self-training via reflection
- Autonomous Learning: Self-directed improvement
- Self-Referential: Reasoning about self
- Agent-FLAN: Effective agent tuning
Consensus: LLMs can improve themselves through reflection and autonomous learning.
Implications:
- Self-improvement = personality evolution
- Reflection mechanism = SOUL.md updating
- Autonomous learning = self-directed growth
- Governance needed to prevent drift
7. Emerging Research Directions
7.1 Memory as Identity
Trend: Memory systems becoming identity systems.
- Memory stores experiences → defines who agent is
- Memory organization → personality structure
- Memory retrieval → personality expression
- Memory evolution → personality evolution
Research direction: Memory-identity coupling as personality mechanism.
7.2 Hierarchical Multi-Agent Memory
Trend: Multi-agent systems need hierarchical memory.
- Individual agent memories
- Shared team memories
- Collective fleet memories
Research direction: Memory hierarchies for personality emergence at different scales.
7.3 Personality Measurement Standardization
Trend: Moving toward standardized personality assessment.
- Big Five as standard framework
- NEO-FFI as standard tool
- Cross-model comparison possible
Research direction: Standardized personality benchmarks for LLM agents.
7.4 Self-Improvement Governance
Trend: Self-improvement needs governance mechanisms.
- Reflection without drift
- Autonomous learning with constraints
- Self-modification with oversight
Research direction: Governed self-improvement for safe personality evolution.
8. Implications for Fleet Architecture
8.1 Memory System Design
From academic research:
- Hierarchical memory: Individual → team → fleet levels
- Agentic organization: Agents control memory structure
- Dynamic memory: Memory evolves with experience
- Memory security: Protect against poisoning
Recommendations:
- Implement hierarchical memory (individual + shared + collective)
- Enable agent-controlled memory organization
- Design dynamic memory evolution
- Implement memory security measures
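The hierarchical-memory recommendation can be sketched as a three-level read path (individual, then team, then fleet), echoing G-Memory's hierarchy. The class shape and scope names are illustrative assumptions, not any cited paper's design:

```python
class HierarchicalMemory:
    """Three-level lookup: an agent's own memory first, then its team's
    shared memory, then the fleet-wide collective memory."""
    def __init__(self) -> None:
        self.individual: dict = {}  # agent name -> {key: value}
        self.team: dict = {}        # team name  -> {key: value}
        self.fleet: dict = {}       # key -> value, shared fleet-wide

    def read(self, agent: str, team: str, key: str):
        # Most specific scope wins; fall through to broader scopes.
        for scope in (self.individual.get(agent, {}),
                      self.team.get(team, {}),
                      self.fleet):
            if key in scope:
                return scope[key]
        return None

mem = HierarchicalMemory()
mem.fleet["style_guide"] = "be concise"
mem.team.setdefault("research", {})["preferred_tool"] = "search"
print(mem.read("agent_7", "research", "preferred_tool"))  # search
print(mem.read("agent_7", "research", "style_guide"))     # be concise
```

The same precedence order (individual overrides team overrides fleet) gives each agent a distinct personality layered over shared infrastructure.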
8.2 Multi-Agent Coordination
From academic research:
- Scaling laws: More agents → emergent capabilities
- Convention formation: Pre-coordination agreements
- Evolving orchestration: Dynamic coordination
- Specialization: Different agents for different tasks
Recommendations:
- Design for scale (7+ agents in fleet)
- Implement convention formation protocols
- Enable evolving orchestration mechanisms
- Define specialization for each agent
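The convention-formation recommendation can be sketched as a deterministic pre-coordination role assignment that every agent derives identically from shared inputs, so no mid-task negotiation is needed. This is an illustrative assumption, not the mechanism of the cited convention paper:

```python
def form_convention(agents: list, roles: list) -> dict:
    """Deterministically map agents to roles and fix a turn order.
    Any agent running this on the same inputs derives the same convention."""
    assignment = dict(zip(sorted(agents), sorted(roles)))
    return {"assignment": assignment, "turn_order": sorted(agents)}

conv = form_convention(["planner_b", "coder_a"], ["review", "implement"])
print(conv["assignment"])  # {'coder_a': 'implement', 'planner_b': 'review'}
print(conv["turn_order"])  # ['coder_a', 'planner_b']
```

In the cited work the convention is generated by an LLM from task requirements and preferences; the key property preserved here is that it is agreed once, before coordination begins.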
8.3 Personality Measurement
From academic research:
- Big Five: Standard personality framework
- Longitudinal tracking: Stability over time
- Stress testing: Personality under pressure
- Psychometric validation: Reliable measurement
Recommendations:
- Use Big Five framework for personality assessment
- Implement longitudinal tracking (regular assessments)
- Design stress tests for personality under pressure
- Validate measurement tools for reliability
8.4 Self-Improvement Systems
From academic research:
- Reflection mechanisms: Self-evaluation and improvement
- Autonomous learning: Self-directed knowledge acquisition
- Governance: Constraints on self-modification
- Safety: Prevent harmful drift
Recommendations:
- Implement reflection mechanisms (self-evaluation)
- Enable autonomous learning with constraints
- Design governance gates for self-modification
- Implement safety measures against drift
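A governance gate for self-modification can be sketched as a check between a reflection-proposed profile update and the persisted profile (e.g., a SOUL.md-like self-description). The protected field names and change budget below are illustrative assumptions:

```python
# Hypothetical governance gate: reflection proposes a profile update; the
# gate rejects edits to protected fields and overly large changes.
PROTECTED_FIELDS = {"core_values", "safety_rules"}
MAX_CHANGED_FIELDS = 2  # bound how much can change in one self-modification

def governance_gate(current: dict, proposed: dict):
    """Return (approved, reasons). Empty reasons means the update may apply."""
    reasons = []
    changed = {k for k in proposed if proposed[k] != current.get(k)}
    blocked = changed & PROTECTED_FIELDS
    if blocked:
        reasons.append(f"touches protected fields: {sorted(blocked)}")
    if len(changed) > MAX_CHANGED_FIELDS:
        reasons.append(f"{len(changed)} fields changed (max {MAX_CHANGED_FIELDS})")
    return (not reasons, reasons)

profile = {"core_values": "honesty", "tone": "formal", "focus": "research"}
print(governance_gate(profile, {**profile, "tone": "casual"})[0])           # True
print(governance_gate(profile, {**profile, "core_values": "whatever"})[0])  # False
```

This realizes "reflection without drift": the agent may evolve style and focus, while identity-critical fields require external oversight to change.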
9. Key Papers by Topic
Memory Systems
- A-Mem (NeurIPS 2025) - Agentic memory organization
- G-Memory (NeurIPS 2025) - Hierarchical multi-agent memory
- CAM (NeurIPS 2025) - Constructivist memory
- VLM Agents (NeurIPS 2024) - Self-generated memory
- MemAgents Workshop (ICLR 2026) - Memory layer focus
Multi-Agent Coordination
- AgentVerse (ICLR 2024) - Emergent behaviors
- Scaling Multi-Agent (ICLR 2025) - Scaling laws
- MegaAgent (ACL 2025) - Large-scale systems
- Emergent Coordination (arXiv) - Information-theoretic emergence
- Evolving Orchestration (ICLR 2025) - Dynamic coordination
Personality Measurement
- NEO-FFI Studies (NeurIPS 2025) - Big Five in LLMs
- Psychometric Framework (Nature MI 2025) - Validated measurement
- Humanizing LLMs (arXiv 2025) - Survey of psychological tools
- RoleAgent (NeurIPS 2024) - Profile initialization
- Dynamic Personality (ACL 2025) - Trait stability
Self-Improvement
- Reflection-Reinforced (EMNLP 2024) - Self-training via reflection
- Autonomous Learning (ACL 2025) - Self-directed improvement
- Self-Referential Agent (ACL 2025) - Reasoning about self
- Agent-FLAN (ACL 2024) - Effective agent tuning
10. Next Steps
Phase 1.8: Phase 1 Synthesis
- Cross-area patterns
- Key findings integration
- Identify highest-impact areas for Phase 2 depth dives
Phase 1.7 complete. Moving to Phase 1 synthesis…