Resources & Reading List
Curated entry points into the LLM agents literature
Getting Started
If you’re new to LLM agents, start with these:
Lilian Weng — LLM Powered Autonomous Agents (June 2023)
The most widely cited blog post introduction to the field. Covers planning, memory, tool use with great diagrams.
Agentic Large Language Models: A Survey (Plaat et al., JAIR 2025)
Peer-reviewed survey organizing the field around Reason–Act–Interact. Introduces the virtuous cycle framing, covers theory of mind and emergent social behavior, and provides a research agenda. Also at alphaxiv with annotations. Companion site: askeplaat.github.io/agentic-llm-survey-site
ReAct paper (Yao et al., 2022)
The canonical agent loop: Reason + Act interleaved. Essential reading.
A Survey on Large Language Model based Autonomous Agents (Wang et al., 2023)
Comprehensive academic survey covering construction, application, and evaluation.
Key Survey Papers
| Title | Authors | Year | Link |
|---|---|---|---|
| A Survey on Large Language Model based Autonomous Agents | Wang et al. | 2023 | arXiv |
| The Rise and Potential of Large Language Model Based Agents | Xi et al. | 2023 | arXiv |
| Agentic Large Language Models, a survey (JAIR 2025) | Plaat et al. (Leiden) | 2025 | arXiv · site |
| Agent AI: Surveying the Horizons of Multimodal Interaction | Durante et al. | 2024 | arXiv |
| Large Language Models as Tool Makers | Cai et al. | 2023 | arXiv |
Essential Papers by Category
Foundations
- ReAct — arxiv.org/abs/2210.03629
- Toolformer — arxiv.org/abs/2302.04761
- Chain-of-Thought Prompting — arxiv.org/abs/2201.11903
- MRKL Systems — arxiv.org/abs/2205.00445
- WebGPT — arxiv.org/abs/2112.09332
Reasoning & Planning
- Tree of Thoughts — arxiv.org/abs/2305.10601
- Reflexion — arxiv.org/abs/2303.11366
- Self-Refine — arxiv.org/abs/2303.17651
Multi-Agent
- CAMEL — arxiv.org/abs/2303.17760
- MetaGPT — arxiv.org/abs/2308.00352
- AutoGen — arxiv.org/abs/2308.08155
- Generative Agents — arxiv.org/abs/2304.03442
- ChatDev — arxiv.org/abs/2307.07924
Memory
- MemGPT — arxiv.org/abs/2310.08560
Tools & Actions
- HuggingGPT / JARVIS — arxiv.org/abs/2303.17580
- Gorilla LLM — arxiv.org/abs/2305.15334
- ToolLLM — arxiv.org/abs/2307.16789
Coding Agents
- SWE-agent — arxiv.org/abs/2405.15793
- CodeAct — arxiv.org/abs/2402.01030
- SWE-bench — arxiv.org/abs/2310.06770
Key Frameworks (GitHub)
| Framework | GitHub | Description |
|---|---|---|
| LangChain | github.com/langchain-ai/langchain | Full-stack agent framework |
| LangGraph | github.com/langchain-ai/langgraph | Graph-based agent workflows |
| AutoGen | github.com/microsoft/autogen | Multi-agent conversation |
| CrewAI | github.com/crewAIInc/crewAI | Role-based agent crews |
| SWE-agent | github.com/princeton-nlp/SWE-agent | GitHub issue resolution |
| OpenHands (formerly OpenDevin) | github.com/All-Hands-AI/OpenHands | Open software dev agent |
| DSPy | github.com/stanfordnlp/dspy | Programming language models |
| MemGPT | github.com/cpacker/MemGPT | Agents with long-term memory |
Influential Blog Posts
Introductions & Overviews
| Title | Author | Date |
|---|---|---|
| LLM Powered Autonomous Agents | Lilian Weng | Jun 2023 |
| Emerging Architectures for LLM Applications | a16z | Jun 2023 |
| A Hitchhiker’s Guide to Building AI Agents | Saurabh Alone | 2025 |
| Making Sense of Memory in AI Agents | Leonie Monigatti | 2025 |
Production & Practitioner Guides
| Title | Author | Date |
|---|---|---|
| Building effective agents | Anthropic (Schluntz & Zhang) | Dec 2024 |
| How we built our multi-agent research system | Anthropic Engineering | 2025 |
| What We Learned from a Year of Building with LLMs | Eugene Yan et al. | 2024 |
| Agents 101: The Art of Actually Getting Things Done | Cognition AI (Devin) | 2025 |
| Building Your Own Coding Agent | Martin Fowler | 2025 |
| AI in Production: What Actually Works in 2026 | 47 Billion | 2026 |
| Lessons from 2025 on Agents and Trust | Google Cloud CTO Office | Dec 2025 |
Builder Diaries & Case Studies
| Title | Author | Date |
|---|---|---|
| Strix: Meet the Stateful Agent | Tim Kellogg | Dec 2025 |
| Memory Architecture for a Synthetic Being | Tim Kellogg | Dec 2025 |
| Viable Systems: How to Build a Fully Autonomous Agent | Tim Kellogg | Jan 2026 |
| One Human + One Agent + One Browser | emsh.cat | 2026 |
| OpenAI Harness Engineering | OpenAI | 2025 |
Benchmarks & Evaluation
| Benchmark | Focus | Link |
|---|---|---|
| SWE-bench | Software engineering tasks | swebench.com |
| WebArena | Web navigation | arxiv.org/abs/2307.13854 |
| AgentBench | Multi-domain agent eval | arxiv.org/abs/2308.03688 |
| GAIA | General AI assistant tasks | arxiv.org/abs/2311.12983 |
| OSWorld | Desktop GUI tasks | arxiv.org/abs/2404.07972 |
Courses & Tutorials
- DeepLearning.AI — AI Agents in LangGraph — Harrison Chase + Andrew Ng
- DeepLearning.AI — Multi AI Agent Systems with crewAI — João Moura
- DeepLearning.AI — AI Agentic Design Patterns with AutoGen — Chi Wang + Qingyun Wu
- Hugging Face — Agents Course — Free, hands-on
References
Survey Papers
- Agentic Large Language Models: A Survey (Plaat et al., 2025) — arXiv:2503.23037 / askeplaat.github.io/agentic-llm-survey-site
- A Survey on Large Language Model based Autonomous Agents (Wang et al., 2023) — arXiv:2308.11432
- The Rise and Potential of Large Language Model Based Agents (Xi et al., 2023) — arXiv:2309.07864
- Agent AI: Surveying the Horizons of Multimodal Interaction (Durante et al., 2024) — arXiv:2401.03568
- Large Language Models as Tool Makers (Cai et al., 2023) — arXiv:2305.17126
Foundation Papers
- ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2022) — arXiv:2210.03629
- Toolformer: Language Models Can Teach Themselves to Use Tools (Schick et al., 2023) — arXiv:2302.04761
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al., 2022) — arXiv:2201.11903
- MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning (Karpas et al., 2022) — arXiv:2205.00445
- WebGPT: Browser-assisted question-answering with human feedback (Nakano et al., 2021) — arXiv:2112.09332
Reasoning & Planning Papers
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Yao et al., 2023) — arXiv:2305.10601
- Reflexion: Language Agents with Verbal Reinforcement Learning (Shinn et al., 2023) — arXiv:2303.11366
- Self-Refine: Iterative Refinement with Self-Feedback (Madaan et al., 2023) — arXiv:2303.17651
Multi-Agent Papers
- CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society (Li et al., 2023) — arXiv:2303.17760
- MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework (Hong et al., 2023) — arXiv:2308.00352
- AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework (Wu et al., 2023) — arXiv:2308.08155
- Generative Agents: Interactive Simulacra of Human Behavior (Park et al., 2023) — arXiv:2304.03442
- ChatDev: Communicative Agents for Software Development (Qian et al., 2023) — arXiv:2307.07924
Memory Papers
- MemGPT: Towards LLMs as Operating Systems (Packer et al., 2023) — arXiv:2310.08560
Tools & Actions Papers
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face (Shen et al., 2023) — arXiv:2303.17580
- Gorilla: Large Language Model Connected with Massive APIs (Patil et al., 2023) — arXiv:2305.15334
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs (Qin et al., 2023) — arXiv:2307.16789
Coding Agent Papers
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering (Yang et al., 2024) — arXiv:2405.15793
- Executable Code Actions Elicit Better LLM Agents (CodeAct; Wang et al., 2024) — arXiv:2402.01030
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues? (Jimenez et al., 2023) — arXiv:2310.06770
Blog Posts & Articles
- LLM Powered Autonomous Agents — Lilian Weng (Jun 2023) — lilianweng.github.io/posts/2023-06-23-agent/
- Emerging Architectures for LLM Applications — a16z (Jun 2023) — a16z.com/emerging-architectures-for-llm-applications/
- Building effective agents — Anthropic (Schluntz & Zhang, Dec 2024) — anthropic.com/research/building-effective-agents
- How we built our multi-agent research system — Anthropic Engineering (2025) — anthropic.com/engineering/multi-agent-research-system
- What We Learned from a Year of Building with LLMs — Eugene Yan et al. (2024) — applied-llms.org
- Agents 101: The Art of Actually Getting Things Done — Cognition AI/Devin (2025) — devin.ai/agents101
- Building Your Own Coding Agent — Martin Fowler (2025) — martinfowler.com/articles/build-own-coding-agent.html
- AI in Production: What Actually Works in 2026 — 47 Billion (2026) — 47billion.com/blog/ai-agents-in-production-frameworks-protocols-and-what-actually-works-in-2026/
- Lessons from 2025 on Agents and Trust — Google Cloud CTO Office (Dec 2025) — cloud.google.com/transform/ai-grew-up-and-got-a-job-lessons-from-2025-on-agents-and-trust
- A Hitchhiker’s Guide to Building AI Agents — Saurabh Alone (2025) — saurabhalone.com/blog/agent
- Making Sense of Memory in AI Agents — Leonie Monigatti (2025) — leoniemonigatti.com/blog/memory-in-ai-agents.html
- Strix: Meet the Stateful Agent — Tim Kellogg (Dec 2025) — timkellogg.me/blog/2025/12/15/strix
- Memory Architecture for a Synthetic Being — Tim Kellogg (Dec 2025) — timkellogg.me/blog/2025/12/30/memory-arch
- Viable Systems: How to Build a Fully Autonomous Agent — Tim Kellogg (Jan 2026) — timkellogg.me/blog/2026/01/09/viable-systems
- One Human + One Agent + One Browser — emsh.cat (2026) — emsh.cat/one-human-one-agent-one-browser/
- OpenAI Harness Engineering — OpenAI (2025) — openai.com/index/harness-engineering/
Courses & Educational Resources
- DeepLearning.AI — AI Agents in LangGraph — deeplearning.ai/short-courses/ai-agents-in-langgraph/
- DeepLearning.AI — Multi AI Agent Systems with crewAI — deeplearning.ai/short-courses/multi-ai-agent-systems-with-crewai/
- DeepLearning.AI — AI Agentic Design Patterns with AutoGen — deeplearning.ai/short-courses/ai-agentic-design-patterns-with-autogen/
- Hugging Face — Agents Course — huggingface.co/learn/agents-course
Benchmarks & Evaluation
- SWE-bench — swebench.com
- WebArena — arxiv.org/abs/2307.13854
- AgentBench — arxiv.org/abs/2308.03688
- GAIA — arxiv.org/abs/2311.12983
- OSWorld — arxiv.org/abs/2404.07972
About This Survey
This survey was compiled in March 2026 by Claude Sonnet. It covers the LLM agent literature from late 2021 through early 2026.