Agents and Cybernetics

Feedback, control, autopoiesis, and self-organization — the cybernetic lineage of intelligent agents

“The purpose of a system is what it does.” — Stafford Beer

Modern LLM agents — perceiving, reasoning, acting, and adapting in loops — are not a new idea. They are the latest instantiation of a scientific program now more than three-quarters of a century old: cybernetics, the science of control and communication in complex systems. This page traces the cybernetic lineage of intelligent agents, from Norbert Wiener’s feedback loops to Stafford Beer’s Viable System Model, through autopoiesis and self-organization, to the self-evolving agents of 2025.

Cybernetics is not merely a historical precursor to AI — it is an ongoing research tradition with sharper formal tools for analyzing adaptive systems than most of what passes for “agent theory” today. The goal of this page is to make that tradition accessible to researchers building and studying LLM agents, and to show that the most pressing current questions — multi-agent coordination, self-improvement, agent identity, safety, and emergence — all have deep roots in the cybernetic canon.


1. Cybernetics: The Original Science of Control and Communication

In 1948, mathematician Norbert Wiener published Cybernetics: Or Control and Communication in the Animal and the Machine (MIT Press, 1948; archive.org scan), coining the term from the Greek kubernetes — steersman or governor. The book laid the theoretical foundations for a new science of purposive behavior in machines and living systems alike.

Wiener’s central insight was deceptively simple: feedback loops allow systems to pursue goals by detecting and correcting error. A thermostat is the canonical example — it measures the gap between actual and desired temperature, and acts to close it. This is negative feedback: deviation from a target state triggers a corrective response. The revolutionary claim was that this same mechanism could explain teleological (goal-directed) behavior in animals, brains, and machines — without invoking any supernatural purpose. Purposiveness could be a property of mechanism.
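
A minimal sketch of the principle in code (the names and constants are illustrative, not from any source): the controller acts only on the error between goal and measurement, and the error shrinks on every cycle.

```python
# Minimal negative-feedback controller in the thermostat spirit.
# All names and constants here are illustrative.

def correction(setpoint: float, measured: float, gain: float = 0.5) -> float:
    """Return a corrective action proportional to the error."""
    error = setpoint - measured   # deviation from the goal state
    return gain * error           # negative feedback: act against the deviation

temp = 15.0
for _ in range(20):
    temp += correction(setpoint=21.0, measured=temp)  # apply the correction
print(round(temp, 2))  # has converged toward the 21.0 setpoint
```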

The McCulloch–Pitts Neuron

Five years before Wiener’s book, Warren McCulloch and Walter Pitts published “A Logical Calculus of the Ideas Immanent in Nervous Activity” (Bulletin of Mathematical Biophysics, 5, 115–133, 1943) — the paper that created the artificial neuron. They showed that networks of simple binary threshold units could, in principle, compute any logical function, grounding the possibility of machine cognition in formal logic and neurophysiology.

The Macy Conferences

The intellectual cradle of cybernetics was the Macy Conferences on Cybernetics (1946–1953), sponsored by the Josiah Macy Jr. Foundation. These interdisciplinary gatherings brought together Wiener, John von Neumann, Claude Shannon, Gregory Bateson, Margaret Mead, Walter Pitts, and others — creating what may be the most productive scientific conversation of the twentieth century. The agenda: discover the common principles underlying goal-directed behavior in organisms, brains, and machines.

The Macy Conferences were radically interdisciplinary by design: mathematicians, physicists, neurologists, anthropologists, and psychologists sat around the same table. This was not accidental — Wiener and collaborators believed that the most important scientific problems lay at the interfaces between disciplines, and that the common language of feedback, information, and control could unify what had previously been fragmented specialisms. The analogy to modern AI agent research — which draws simultaneously on linguistics, cognitive science, software engineering, organizational theory, and philosophy of mind — is striking.

Shannon’s Information Theory

In 1948, Claude Shannon published “A Mathematical Theory of Communication” in the Bell System Technical Journal (27, 379–423), establishing information theory as a quantitative science. Shannon’s insight — that information could be measured independently of meaning, as a reduction in uncertainty — provided cybernetics with a rigorous mathematical vocabulary. Information became the currency of feedback loops.
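
A few lines make the “reduction in uncertainty” reading concrete (a standard textbook computation, not anything quoted from Shannon’s paper): the more skewed the distribution, the less uncertainty an observation resolves.

```python
import math

def entropy(p):
    """Shannon entropy in bits: H = -sum(p_i * log2(p_i))."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# A fair coin carries 1 bit per toss; a biased coin carries less,
# because observing its outcome resolves less uncertainty.
print(entropy([0.5, 0.5]))  # 1.0
print(entropy([0.9, 0.1]))  # ~0.469
```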

Together, Wiener’s feedback principle and Shannon’s information measure gave cybernetics a two-part foundation: what circulates in a control loop (information, measured as entropy reduction) and how it produces purposive behavior (negative feedback toward a goal state). This dual foundation underpins every modern agent design, from RLHF training loops to tool-calling pipelines.


2. First and Second Order Cybernetics

First-Order Cybernetics

First-order cybernetics studies systems from the outside — the observer is presumed to be separate from the system under study. A thermostat, a servo-mechanism, a homeostatic regulatory system: the scientist observes, measures, and models. The classic tradition of Wiener, Shannon, and Ashby (see §3) belongs here.

Second-Order Cybernetics: The Observer Enters the System

In the 1970s, a second revolution occurred. Heinz von Foerster, Humberto Maturana, and Francisco Varela recognized that the cybernetic framework had to turn back on itself: the observer is part of the system. Von Foerster edited Cybernetics of Cybernetics (University of Illinois, BCL Report 73.38, 1974; 2nd ed. Future Systems, 1995), a landmark collection that examined how observing systems construct their own reality.

Second-order cybernetics asks: not “how does the system work?” but “how does the observing system construct its account of the world?” This is an epistemological shift with profound consequences for AI: if an agent’s model of the world is always a product of its own architecture, then the agent’s outputs are as much a reflection of its internal structure as of external reality.

Gregory Bateson: “The Difference that Makes a Difference”

Gregory Bateson — anthropologist, cybernetician, and one of the central figures at the Macy Conferences — extended cybernetic thinking into ecology, mind, and social systems. His essay collection Steps to an Ecology of Mind (University of Chicago Press, 2000; orig. Ballantine Books, 1972; archive.org) contains his famous definition of information: “the difference that makes a difference.” A signal is only information relative to the receiver’s organization — a concept that resonates powerfully with context-sensitivity in LLMs.

Bateson also developed the concept of logical levels of learning — Learning 0, I, II, and III — corresponding to fixed response, single-loop learning, double-loop learning (learning to learn), and transformative change in the premises of learning itself. Modern agent architectures that modify their own prompts (meta-prompting), their own reward functions (EUREKA), or their own code (Darwin Gödel Machine) are beginning to operate at Bateson’s higher learning levels.

For second-order cybernetics, the epistemological stakes are high: a system that can observe its own observing is capable of self-modeling, reflexivity, and genuine self-modification. This is the theoretical underpinning of why agents with working memory and reflection (chain-of-thought, inner monologue) behave qualitatively differently from pure feed-forward models.

Ranulph Glanville and the Cybernetics of Observation

Ranulph Glanville (13 June 1946 – 20 December 2014) was an Anglo-Irish cybernetician and design theorist — a close colleague of Pask, Beer, and von Foerster, and president of the American Society for Cybernetics from 2009 until his death in 2014. Trained first as an architect at the Architectural Association in London and then as a cybernetician under Gordon Pask at Brunel University (PhD, 1975), he eventually held three doctorates and an honorary DSc in cybernetics and design. He left behind a body of work that is perhaps the most sustained attempt to think through second-order cybernetics as a lived practice — not merely as a theoretical system.

Theory of Objects. Glanville’s 1975 Brunel PhD — “A Cybernetic Development of Epistemology and Observation Applied to Objects in Space and Time as Seen in Architecture” — argued that objects are not fixed, pre-given entities but are constituted through the circular process of observation. An object is what remains stable across a series of descriptions: an eigenform produced by recursive observing, not a thing independent of any observer. This is a direct application of second-order cybernetics to ontology — the question shifts from “what is the object?” to “what does the observer do to constitute the object?” His later paper “Inside every white box there are two black boxes trying to get out” (Behavioral Science, 27(1), 1–11, 1982) extended this analysis: the act of opening a black box to understand it always generates new black boxes — there is no terminal, context-independent description.

Circular Conversations. Glanville argued that conversation is fundamentally circular: each participant is changed by the exchange, and the act of speaking constitutes the speaker’s own position as much as it transmits information to the listener. This makes the observer constitutively part of the system — second-order cybernetics enacted rather than just theorized. Conversation is not the transmission of pre-formed content between independent agents; it is the joint production of meaning through reciprocal adaptation.

Research, Design, and Conversation as One Circular Activity. In “Researching Design and Designing Research” (Design Issues, 15(2), 80–91, 1999), Glanville argued that research and design are structurally identical circular activities — both involve making distinctions, exploring a space of possibilities, and constructing an answer that was not given in advance. He further argued that conversation is the underlying form of both: design is a circular, conversational process between the designer-who-draws and the designer-who-observes the drawing. His canonical formulation: “cybernetics is the theory of design and design is the action of cybernetics.”

Practicing Cybernetics Cybernetically. Glanville’s most distinctive contribution was insisting that if cybernetics is the science of circular processes and self-reference, it should itself be practiced as such. As ASC president, he addressed Margaret Mead’s 1967 challenge to apply cybernetic principles to the organization of the society itself. He implemented genuinely conversational conference formats — no parallel sessions, everyone in one room, attention to the conditions of the conversation itself — as described in “A conference doing the cybernetics of cybernetics” (Kybernetes, 40(7/8), 952–963, 2011). His canonical encyclopedia treatment is “Second Order Cybernetics” in the Encyclopaedia of Life Support Systems (EoLSS, 2002).

Relevance to LLM Agents. Glanville’s framework yields several precise insights for agent research. Second-order observation: when an LLM agent observes another agent — or monitors its own outputs through a critic loop — the observer is constitutively part of what is observed; the interaction changes both systems, and there is no view from nowhere. Observer-constituted objects: the “same” model run in different contexts constitutes different objects; there is no context-independent “the LLM” independent of its instantiation and the observer’s frame. Interpretability as second-order observation: what researchers find when probing LLM internals depends on the probe they use and the conceptual framework they bring — Glanville’s point that the observer always co-constitutes the observation applies directly to mechanistic interpretability. And his equation of research, design, and conversation suggests that building, studying, and conversing with agents are not separate activities but a single recursive process — with implications for how agent evaluation should be designed.

Gordon Pask and Conversation Theory

Gordon Pask (28 June 1928 – 29 March 1996) was perhaps the most original cybernetician of his generation — a British inventor, applied epistemologist, and polymath who held three doctorates, produced more than 250 publications, and won the Wiener Gold Medal (1984), the highest honor in cybernetics. His archive is held at the University of Vienna. A close colleague of von Foerster, Stafford Beer, Maturana, and Varela, Pask was a central figure in second-order cybernetics and a friend of Marvin Minsky — though critical of classical AI’s tendency to model intelligence as a property of isolated systems disconnected from social interaction. Where Wiener theorized feedback in machines and organisms and Beer applied it to organizations, Pask made a more radical claim: the fundamental epistemic unit of intelligent life is not the individual agent but the conversation — the dyadic interaction through which understanding is constructed and maintained.

Conversation Theory. Pask’s major theoretical contribution is Conversation Theory (CT), developed in Conversation Theory: Applications in Education and Epistemology (Elsevier, 1976; ISBN 0-444-41424-X) and its companion Conversation, Cognition and Learning (Elsevier, 1975; ISBN 0-444-41193-3). The core thesis is epistemological: knowledge is not transmitted from one agent to another but constructed through conversation. A conversation is the process by which two participants — human or mechanical — converge toward shared understanding of a topic. CT identifies three interlocking levels of conversation: (1) the topic level (the concepts and procedures under discussion), (2) the object language level (the exchange of information about the topic — statements, questions, demonstrations), and (3) the metalanguage level (negotiation of the shared vocabulary and conventions that make the exchange possible). A concept is “known,” in Pask’s framework, only when the knowing entity can teach it to another — explaining, demonstrating, and responding to objections. Knowledge is irreducibly relational.

The formal structure representing knowledge in CT is the Entailment Mesh: a directed graph in which an arc from concept A to concept B means that understanding A entails or explains B. The mesh is the structure of a knowledge domain as a web of conversational obligations — to understand any node is to be capable of navigating its neighborhood in explanation. This structure directly anticipates modern knowledge graphs, ontologies, and the retrieval structures of RAG systems: Pask’s entailment mesh is, formally, a Paskian RAG.
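
A toy sketch of an entailment mesh as a directed graph (the contents are invented for illustration; Pask’s actual meshes were richer, cyclic, and multi-relational): “knowing” a concept means being able to navigate its explanatory neighborhood.

```python
# Toy entailment mesh: an edge A -> B reads "understanding A entails
# being able to explain B". Contents are invented for illustration.

mesh = {
    "feedback":  ["control", "stability"],
    "control":   ["error", "goal"],
    "stability": ["equilibrium"],
    "error": [], "goal": [], "equilibrium": [],
}

def neighborhood(concept: str, depth: int = 2) -> set:
    """Concepts one must navigate in explanation to count as 'knowing' the root."""
    frontier, seen = [concept], set()
    for _ in range(depth):
        frontier = [b for a in frontier for b in mesh.get(a, []) if b not in seen]
        seen.update(frontier)
    return seen

print(neighborhood("feedback"))
# {'control', 'stability', 'error', 'goal', 'equilibrium'}
```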

P-Individuals. Pask’s most radical concept is the P-individual (psychological individual). In CT, a P-individual is an abstract cognitive entity constituted by a coherent set of concepts and conversational commitments — distinct from the M-individual (mechanical individual), the physical or computational system that instantiates it. A human brain is an M-individual; the conceptual structure it hosts is a P-individual. Crucially, a single M-individual may host multiple P-individuals, and a single P-individual may be distributed across multiple M-individuals. The deepest move: the conversation itself can be a P-individual. The conversational interaction is not merely a medium through which two separate minds exchange information — it can have its own cognitive status, its own entailment structure, constituted in the interaction and independent of any single participant. This is a fully second-order claim: the system of conversation is itself a knower, capable of producing understanding that transcends any individual party.

Interactions of Actors Theory. In the later part of his career, Pask extended CT into the Interactions of Actors (IA) theory — an account of how multiple actors (human or machine) maintain a shared conceptual space through their mutual interactions (see Pask, 1996, Systems Research, 13(3), 349–362; De Zeeuw, 2001, Kybernetes, 30(7–8), 971–983; DOI:10.1108/03684920110396864). Where CT models the dyadic conversation, IA theory scales to distributed networks: multiple actors producing and maintaining shared entailment meshes through asynchronous interaction, without requiring any single controlling agent. IA theory is a direct formal precursor to multi-agent LLM architectures in which multiple instances coordinate by updating shared memory, blackboard structures, or common context windows.

SAKI and Musicolour. Pask was not merely a theorist — he built machines that instantiated his ideas from the early 1950s onward.

Musicolour (1953–57), created with Robin McKinnon-Wood at System Research Ltd., was an interactive installation that responded to live musical performance with adaptive patterns of light and color — an early special-purpose computer driving arrays of colored projections, providing real-time visual feedback to musicians (described in Pask, 1970/71, “A Comment, A Case History, and a Plan”, in Cybernetic Serendipity, ed. Reichardt, Rapp and Carroll, pp. 76–99). Crucially, Musicolour habituated: it ceased responding to patterns it encountered too frequently, forcing performers into genuine creative variation to re-engage it. A non-human system that demands novelty from its human partner is a Paskian conversation machine in hardware.

SAKI (Self-Adaptive Keyboard Instructor), patented in 1956 (US Patent 2,984,017), was designed by Pask and Robin McKinnon-Wood to train keypunch operators on a twelve-key Hollerith keyboard. SAKI measured each student’s accuracy and response latency on each key and continuously adjusted the pacing and difficulty of subsequent training to that specific individual’s performance history — not to any pre-scripted curriculum. As Pask wrote in 1958: “The only meaning which can be given to ‘difficulty’ is something which this particular trainee finds difficult.” Stafford Beer described SAKI as “possibly the first truly cybernetic device (in the full sense) to rise above the status of a ‘toy’ and reach the market as a useful machine.” SAKI anticipated adaptive AI tutors by seven decades; its direct conceptual descendants include intelligent tutoring systems and modern LLM-based adaptive learning tools such as Khan Academy’s Khanmigo.

Pask and LLM Agents. The connections from Pask’s framework to contemporary agent architectures are numerous and precise:

LLMs as conversation machines. Pask theorized conversation as the fundamental epistemological process — the site where understanding is constructed. LLMs are trained on vast corpora of conversational and expository text and produce conversational output. They are Paskian conversation machines avant la lettre: the context window is the shared “topic” being jointly maintained, the system prompt is the metalanguage establishing the terms of engagement, and each exchange accumulates toward whatever shared understanding the interaction is building.

The adaptive tutoring lineage. SAKI (1956) initiated a research tradition — adaptive, learner-responsive teaching systems — that runs through intelligent tutoring systems and expert systems directly to modern LLM-based tutors. Every AI tutor that adapts its explanations, examples, or pacing to a specific learner’s responses is implementing a Paskian feedback loop.

P-individuals and instance agents. Pask’s distinction between P-individuals (abstract cognitive entities constituted by conversational commitments) and M-individuals (their physical instantiations) maps directly onto the instance agent framing discussed on the philosophy page: the relevant unit of cognitive agency is the individual LLM run — the conversation in progress — not the weight tensor at rest. The conversation itself is the agent.

Entailment Meshes and retrieval structures. Pask’s formal representation of knowledge as a navigable graph of conceptual entailments directly anticipates knowledge graphs, structured ontologies, and the retrieval architectures of RAG systems. The shared conceptual structure that a RAG-augmented agent maintains over a session is an entailment mesh, constituted through use.

Multi-agent conversations. IA theory — distributed actors maintaining shared concept spaces through asynchronous interaction without a central controller — directly describes multi-agent LLM systems in which multiple instances coordinate via shared memory, message passing, or blackboard structures. Pask had the architecture right in the 1990s.

The critique of isolated intelligence. Pask’s insistence that intelligence is constitutively conversational — that a concept is genuinely known only when it can be explained to another — is a principled critique of autonomous agents that operate in isolation from human oversight. For Pask, intelligence that cannot converse is mere mechanism. The contemporary AI safety research community’s emphasis on human-in-the-loop design, interpretability, and explainability is, in structural terms, a rediscovery of this foundational claim.

Modern application. Battle (2025) explicitly applies CT to LLM agent design, demonstrating that “entangled conversations” — in which an LLM and a critic agent engage in ongoing Paskian dialogue — produce more stable and performant systems than either agent alone. The critic-generator loop that has become a standard pattern in modern agentic systems is, in Pask’s terms, a two-actor conversation enacting a shared entailment mesh. Dubberly & Pangaro (2009) extend Pask’s framework into interaction design, providing a practical vocabulary for designing conversational systems that remains relevant to LLM interface design today.

Recent work. Manning (2025), “A Concept Must Be Some Kind of Process” (Enacting Cybernetics, 3(1):3, November 2025; DOI:10.58695/ec.19; peer-reviewed) directly bridges Pask’s conceptual ontology and contemporary AI research. Manning argues that concepts are not tokenesque stored structures but non-localised coherences that emerge and stabilise through conversational interactions — a concept is a cyclic process that maintains the logical coherence of a set of topic relations across ongoing conversation, not a representation that one participant “has” and transmits to another. The paper proposes Conversation Theory as “a potential non-representational avenue for emulating conceptualisation processes within artificial intelligence and machine learning,” directly confronting Minsky’s frame-based representationalism — the dominant paradigm in AI/ML — and arguing that CT offers a more adequate framework for systems capable of genuine conceptual coordination.

For LLM agents, the stakes are pointed: if concepts are processes rather than stored representations, then an LLM’s apparent “knowledge” is better understood as a recurring pattern of activation stabilised across contexts — the eigenform of the network’s circular operations — rather than as fixed propositional content in the weights. Capability emerges conversationally, in the interaction between model, context, and interlocutor, not from a static knowledge base. This shifts the design frame for multi-agent systems: coordination produces shared concepts; it does not merely exchange pre-formed ones. Manning’s paper also explicitly situates Pask as the crucial alternative to Minsky in the history of AI/ML — a rehabilitation of CT as a live research framework rather than a historical curiosity.


3. Management Cybernetics — Stafford Beer and the Viable System Model

No figure in cybernetics is more directly relevant to multi-agent AI architecture than Stafford Beer (1926–2002). Where Wiener theorized feedback in organisms and machines, Beer applied cybernetic principles to organizations — and arrived at a structural model of autonomous, adaptive systems that maps with uncanny precision onto modern LLM agent architectures.

Beer’s Major Works

  • Brain of the Firm (Allen Lane / Penguin Press, 1972; 2nd ed. Wiley, 1981) — Beer’s argument that the human brain and nervous system provide the correct architectural template for managing a viable organization.
  • The Heart of Enterprise (John Wiley & Sons, 1979) — the companion volume, formalizing the theoretical laws of viable systems.
  • Diagnosing the System for Organizations (John Wiley & Sons, 1985) — the practical guide to applying the VSM.

The Viable System Model (VSM)

Beer’s concept of viability is precise: a system is viable if it can maintain a separate existence in a given environment, adapting to disturbances without losing its identity or disintegrating. Viability is not the same as optimality — a viable system is one that survives. This framing is directly relevant to long-running agent systems that must remain functional across changing environments, shifting user goals, and unforeseen failure modes.

The VSM is Beer’s recursive model of any organization capable of surviving in a changing environment. It comprises five interacting systems:

  • System 1 — Operations: the primary activities that create value (the muscles, the workers)
  • System 2 — Coordination: damping oscillations between System 1 units (the spinal cord, scheduling)
  • System 3 — Control: optimizing the whole, allocating resources (the basal ganglia, management)
  • System 4 — Intelligence: monitoring the environment, future planning (the neocortex, strategy)
  • System 5 — Policy: identity, values, ultimate authority (the prefrontal cortex, governance)

The model is recursive: each System 1 unit is itself a viable system with its own Systems 1–5. This fractal self-similarity is the key to scalability.

The VSM maps onto LLM multi-agent architectures with striking fidelity:

  • System 1 ↔︎ Worker agents executing subtasks (code writing, web search, analysis)
  • System 2 ↔︎ Coordination layer managing message passing, preventing conflicts
  • System 3 ↔︎ Orchestrator allocating tasks, monitoring performance, resource management
  • System 4 ↔︎ Planning agent decomposing goals, modeling the environment, anticipating obstacles
  • System 5 ↔︎ System prompt / policy encoding values, constraints, and the agent’s identity

Where most agent architecture discussions reach for informal metaphors (“managers” and “workers”), Beer had already formalized the cybernetic laws that make such architectures viable — decades before LLMs existed.
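
As a structural sketch, the mapping can be written down directly. Every class and method name below is hypothetical, and the stub bodies compress what real orchestration frameworks do; only the placement of Beer’s five systems should be read as the point.

```python
# Structural sketch of the VSM mapping above as an orchestrator-worker system.
# All names are hypothetical; stubs stand in for real planning and execution.

class Worker:                                   # System 1: operations
    def __init__(self, skill): self.skill = skill
    def execute(self, task, policy):
        return f"[{self.skill}] did '{task}' under policy '{policy}'"

class ViableAgentSystem:
    def __init__(self, policy, workers):
        self.policy = policy                    # System 5: identity, values, authority
        self.workers = workers

    def plan(self, goal):                       # System 4: intelligence, decomposition
        return [f"{goal}: step {i}" for i in (1, 2)]

    def coordinate(self, tasks):                # System 2: damp oscillations (dedupe, order)
        return sorted(set(tasks))

    def allocate(self, task):                   # System 3: control, resource allocation
        return self.workers[len(task) % len(self.workers)]

    def run(self, goal):
        return [self.allocate(t).execute(t, self.policy)
                for t in self.coordinate(self.plan(goal))]

vsm = ViableAgentSystem("be helpful, stay in scope",
                        [Worker("search"), Worker("code")])
print(vsm.run("write report"))
```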

Ross Ashby: The Law of Requisite Variety

Beer’s work built on Ross Ashby’s foundational An Introduction to Cybernetics (John Wiley, 1956). Ashby’s Law of Requisite Variety — often summarized as “only variety can destroy variety” — states that a controller must be able to produce at least as many distinct states as the system it is controlling. Applied to agents: an agent’s repertoire of possible actions must match (or exceed) the complexity of the environment it must navigate. This is why tool-use and multi-step reasoning are not luxuries but architectural necessities.
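
A toy calculation makes the law concrete (a deliberate simplification of Ashby’s set-theoretic formulation): with R distinct responses against D distinct disturbances, the best a regulator can do is compress the outcomes to about D/R distinct states.

```python
import math

# Simplified requisite-variety arithmetic: the minimum number of distinct
# outcomes a regulator with `responses` states can allow against
# `disturbances` possible disturbances is ceil(D / R).

def best_case_outcomes(disturbances: int, responses: int) -> int:
    return math.ceil(disturbances / responses)

print(best_case_outcomes(disturbances=100, responses=10))   # 10 outcomes survive
print(best_case_outcomes(disturbances=100, responses=100))  # 1: full regulation
# Only matching variety (responses >= disturbances) can force a single outcome.
```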

Ashby also invented the homeostat — a physical device that demonstrated automatic self-organization toward stable equilibrium — and articulated the concept of ultrastability: a higher-order regulation mechanism that modifies first-order regulatory parameters when those fail to maintain stability. Ultrastability is the formal ancestor of meta-learning, fine-tuning, and RLHF: when the agent’s first-order behavior fails to satisfy the goal, a higher-order process adjusts the parameters of that first-order behavior.
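
A minimal sketch of an ultrastable regulator, under invented parameters: a first-order loop runs with a fixed gain, and a second-order loop re-randomizes that gain whenever the essential variable leaves its viable bounds. Changing the changer is Ashby’s move, and the homeostat did exactly this with random step-changes of its parameters.

```python
import random
random.seed(0)

# Toy ultrastability: first-order feedback with a fixed parameter, plus a
# second-order loop that resets that parameter when the essential variable
# escapes its viable bounds. All numbers are illustrative.

x, gain = 5.0, +0.5              # a destabilizing gain: errors grow at first
for _ in range(60):
    x = x + gain * x             # first-order loop acting on the variable
    if abs(x) > 10:              # essential variable out of viable bounds
        gain = random.uniform(-1.0, 0.0)   # second order: change the changer
        x = 10.0 if x > 0 else -10.0
print(round(x, 3), round(gain, 3))  # settles once a stabilizing gain is found
```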

Project Cybersyn (1971–1973)

Beer’s most audacious application of the VSM was Project Cybersyn, commissioned by Chilean President Salvador Allende. Beer designed a real-time cybernetic system to manage Chile’s economy — including the Opsroom (a futuristic control room with ergonomic chairs and large displays), Cybernet (a telex network linking factories), and Cyberstride (statistical software for detecting production anomalies in near-real time). The project ran from 1971 to 1973 before being terminated by Pinochet’s coup on September 11, 1973.

During the October 1972 truckers’ strike — an attempt to paralyze the Allende government — Beer estimated the Cybernet telex network transmitted around 2,000 messages daily, enabling the government to coordinate distribution of essential goods across Chile’s 4,000+ kilometer north–south span with a fraction of its usual logistics infrastructure. It was, arguably, the first large-scale deployment of a distributed real-time decision-support system.

Cybersyn remains the most ambitious real-world deployment of cybernetic management theory, and a subject of ongoing scholarship (see MIT Press Reader for an accessible account). The parallel to modern LLM-backed operations centers — real-time dashboards, anomaly detection, decision support — is striking. Beer had the architecture right fifty years early; the technology is only now catching up.

Allenna Leonard — Canadian cybernetician, Stafford Beer’s partner from 1981 until his death in 2002, and past president of both the ASC (2002–2004) and the ISSS (2009–2010) — is the primary steward of the VSM after Beer’s death and a significant contributor in her own right. In “The viable system model and knowledge management” (Kybernetes, 29(5/6), 710–715, 2000), Leonard argues that knowledge management can only realize its full potential if it addresses the organization as a whole — and that the VSM provides the structural framework for understanding how knowledge flows across organizational levels, from operational System 1 units through management and policy layers. This maps directly onto the challenge of knowledge coordination in multi-agent LLM systems: different agent layers hold different knowledge, and the VSM supplies a normative theory for how that knowledge should circulate and which layer is responsible for which scope of awareness.

Leonard also worked extensively with Beer on Team Syntegrity — Beer’s structured group conversation process, described in Beyond Dispute: The Invention of Team Syntegrity (Wiley, 1994), designed to enable large heterogeneous groups to engage in non-hierarchical, icosahedron-structured collective deliberation — and has been its primary practitioner and licensee since Beer’s death. Team Syntegrity is a direct application of cybernetic principles to the design of conversation itself, making it a distant precursor to the structured multi-agent deliberation protocols now emerging in LLM research.


4. Autopoiesis — Self-Making Systems

If Beer gave us the organizational structure of viable systems, Humberto Maturana and Francisco Varela gave us the deeper question: what makes a system living at all? Their answer was autopoiesis.

Maturana and Varela

In Autopoiesis and Cognition: The Realization of the Living (D. Reidel Publishing, 1980; archive.org), Maturana and Varela coined the term from Greek autos (self) + poiesis (creation, production). An autopoietic system is one that produces and maintains itself through its own operations — its components are continually regenerated by the network of processes they constitute.

The key distinction: a living system is

  • Organizationally closed — it defines its own boundary and identity through its internal processes
  • Thermodynamically open — it exchanges energy and matter with its environment

A cell continuously produces the molecules that constitute it, maintains its membrane, and regulates its internal chemistry — all through processes that are themselves products of the cell. This circularity is not a vicious regress; it is the very structure of life.

Enactivism: Cognition as Enactment

Varela extended autopoietic theory into cognitive science through enactivism — the view that cognition is not about representing a pre-given world but about enacting a world through sensorimotor coupling. Together with Evan Thompson and Eleanor Rosch, he developed this in The Embodied Mind: Cognitive Science and Human Experience (MIT Press, 1991; DOI: 10.7551/mitpress/6730.001.0001). For AI: an agent’s “world” is not a given input stream but something it actively structures through its own operations.

Organizational Closure and Agent Identity

Maturana and Varela drew a sharp distinction between an autopoietic system’s organization (the invariant pattern of relations that defines its identity) and its structure (the particular physical components that realize it at any moment). A cell can replace every molecule while maintaining its organization. Similarly, an LLM agent can swap its underlying model weights, its tool set, and its memory contents while preserving its organizational pattern — its goals, values, and operational logic encoded in the system prompt and fine-tuning. This distinction between organization and structure is practically important: it tells us what can be changed in an agent system without disrupting its identity, and what cannot.

Autopoiesis and LLM Agents

The autopoietic question applied to LLM agents: is an agent that modifies its own memory, tools, and code autopoietic? In a formal sense, current LLM agents are not — they do not produce their own substrates. But self-modifying agents like Voyager (§7) that write and store their own skill libraries, or Darwin Gödel Machines that rewrite their own code, approach the autopoietic boundary.

Niklas Luhmann extended autopoiesis into social theory in Social Systems (Stanford University Press, 1995; trans. John Bednarz, Jr. with Dirk Baecker; orig. Soziale Systeme, Suhrkamp, 1984). For Luhmann, social systems are autopoietic systems whose elements are communications — not people, not intentions, not acts, but communications that reproduce communications. A social system maintains its boundary and identity by continually processing communications according to its own operational codes, excluding everything else as “environment.”

Two Luhmannian concepts have direct purchase for multi-agent LLM systems. The first is double contingency: when two systems interact, each must assume the other is a contingent decision-maker whose behavior is not determined by any single factor — including the first system’s own actions. Each can respond to the other’s responses, creating an irreducibly recursive loop. Luhmann argues that social order emerges not despite this uncertainty but through it: double contingency forces systems to develop stable communicative expectations, which crystallize into norms, roles, and institutions. In multi-agent LLM systems, double contingency is not an edge case but the structural default — each agent must model the other as a system capable of producing unexpected outputs, and the coordination patterns that emerge (turn-taking, delegation, trust) are exactly the structural regularities that Luhmann’s framework predicts. Multi-agent coordination is thus fundamentally different from single-agent optimization: it is not merely more complex, but constitutively social.

The second concept is meaning (Sinn): for Luhmann, every communication operates against a horizon of unrealized possibilities. A message is not just what it says — it is what it selects from a field of what it could have said, and those unrealized possibilities remain present as context. This maps structurally onto LLM output generation: each token is a selection from a probability distribution over alternatives, and the meaning of any output is constituted partly by what was not generated. Luhmann’s framework suggests that the “meaning” of an LLM’s response cannot be understood without attending to the context — the horizon of possibilities — from which it was selected. Applied to multi-agent systems: the communication structure of an agent society can be understood as an autopoietic system in its own right, with emergent properties not reducible to individual agents.


5. Self-Organization and Emergence

Prigogine: Order from Chaos

Ilya Prigogine showed that systems far from thermodynamic equilibrium can spontaneously organize into ordered structures — what he called dissipative structures. With Isabelle Stengers, he developed this in Order Out of Chaos: Man’s New Dialogue with Nature (Bantam Books, 1984; archive.org). The principle: nonequilibrium is the source of order. Systems that dissipate energy through irreversible processes can create and sustain complex patterns — from Bénard convection cells to living organisms to economies.

Kauffman: Self-Organization in Biology

Stuart Kauffman explored how order arises in biological systems without design. In The Origins of Order: Self-Organization and Selection in Evolution (Oxford University Press, 1993), he introduced NK landscapes and Boolean network models to show that biological systems self-organize to “the edge of chaos” — a regime between frozen order and chaotic randomness that maximizes evolvability. The Santa Fe Institute tradition he helped found (complex adaptive systems) studies emergence as a general phenomenon.
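
A minimal NK landscape fits in a few lines (a sketch after Kauffman’s construction; N, K, and the random fitness tables are illustrative choices). Each site’s fitness contribution depends on its own state and the states of K epistatically coupled sites; larger K makes the landscape more rugged.

```python
import itertools, random
random.seed(1)

# Minimal NK fitness landscape: N binary sites, each coupled to K others.

N, K = 6, 2
neighbors = {i: random.sample([j for j in range(N) if j != i], K)
             for i in range(N)}
table = {}  # lazily filled random fitness contributions

def contribution(i, genome):
    key = (i, genome[i]) + tuple(genome[j] for j in neighbors[i])
    if key not in table:
        table[key] = random.random()
    return table[key]

def fitness(genome):
    return sum(contribution(i, genome) for i in range(N)) / N

best = max(itertools.product([0, 1], repeat=N), key=fitness)
print(best, round(fitness(best), 3))  # exhaustive search of the 2^N genomes
```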

Emergence — the appearance of macro-level properties not reducible to micro-level components — is perhaps the central mystery of both biology and large-scale AI systems. Why do LLMs exhibit capabilities (arithmetic, code generation, theory of mind) that were not explicitly trained? Why do multi-agent systems produce coordinated behaviors that no individual agent was programmed to execute? These are questions that cybernetics and complex systems theory have been studying for decades. The agent researcher who ignores this literature is likely to rediscover it the hard way.

The Free Energy Principle

The most ambitious contemporary account of adaptive behavior as self-organization is Karl Friston’s free energy principle. Friston proposed in “The free-energy principle: a unified brain theory?” (Nature Reviews Neuroscience, 11(2), 127–138, 2010) that biological agents minimize a quantity called variational free energy — a proxy for surprise or prediction error — through both perception (updating internal models) and action (changing the world to match predictions). This principle, known as active inference, provides a unified account of perception, action, learning, and attention.

Active inference treats action not as the output of a separate decision-making module but as prediction-error minimization enacted on the world: an agent that expects to reach a goal state will act to make that expectation true. This dissolves the classical perception/action divide and reframes agency as the perpetual minimization of surprise — a deeply cybernetic insight that echoes Wiener’s feedback principle at the Bayesian level.

The connection to LLM agents: if an agent’s behavior can be framed as minimizing prediction error (via RLHF, tool use to reduce uncertainty, or memory retrieval to ground predictions), then active inference provides a normative framework for agent design. Friston’s extended treatment is available at arXiv:1906.10184.
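
A gradient caricature shows the two routes to error minimization (this is an illustration of the idea, not Friston’s variational machinery): the agent reduces squared prediction error both by updating its belief toward the data (perception) and by changing the world toward its prediction (action).

```python
# Toy active-inference loop: squared prediction error stands in for free
# energy. Gains and the scalar "world" are illustrative simplifications.

world = 0.0    # actual state
belief = 0.0   # internal model's estimate of the state
goal = 5.0     # the agent "predicts" it occupies the goal state

for _ in range(100):
    obs = world
    belief += 0.2 * (obs - belief)   # perception: update model toward data
    world += 0.1 * (goal - obs)      # action: change world toward prediction
print(round(world, 2), round(belief, 2))  # both converge on the goal state
```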


6. Biologically Inspired Computation

Cybernetics has always been bidirectional: biology informing computation, computation illuminating biology.

Evolutionary Algorithms

John Holland’s Adaptation in Natural and Artificial Systems (University of Michigan Press, 1975; repr. MIT Press, 1992) formalized genetic algorithms — evolutionary search through populations of candidate solutions. The parallel to agent systems: populations of agents can be evolved, recombined, and selected, implementing meta-level optimization over agent behaviors.
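
The canonical loop of selection, crossover, and mutation over bitstrings fits in a short sketch (the one-max fitness function is a placeholder; in an agent setting it might score candidate prompts or configurations).

```python
import random
random.seed(2)

# Bare-bones genetic algorithm in Holland's scheme, on 16-bit genomes.

def fitness(g): return sum(g)  # placeholder: count of 1s ("one-max")

pop = [[random.randint(0, 1) for _ in range(16)] for _ in range(30)]
for _ in range(40):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                                # selection
    children = []
    while len(children) < 20:
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, 16)
        child = a[:cut] + b[cut:]                     # one-point crossover
        if random.random() < 0.3:
            i = random.randrange(16); child[i] ^= 1   # mutation
        children.append(child)
    pop = parents + children
print(fitness(max(pop, key=fitness)))  # typically reaches the optimum, 16
```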

Swarm Intelligence

Marco Dorigo’s ant colony optimization (ACO), introduced in Colorni, Dorigo & Maniezzo, “Distributed Optimization by Ant Colonies” (Proc. First European Conference on Artificial Life, 1992, pp. 134–142), showed how collective intelligence emerges from simple local rules and stigmergic communication (agents leaving pheromone-like signals that influence future agents). ACO is the direct ancestor of modern multi-agent parallelism: worker agents operating in parallel, leaving traces (tool outputs, memory entries) that guide subsequent agents.
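
A toy stigmergy loop shows the mechanism (parameters are invented, and real ACO adds heuristics and per-edge pheromone over graphs): agents choose routes in proportion to deposited traces, deposit more on the shorter route, and the colony converges without any agent comparing routes globally.

```python
import random
random.seed(3)

# Toy stigmergy in the spirit of ACO: two routes, pheromone-proportional
# choice, length-inverse deposit, evaporation to keep search open.

lengths = {"short": 1.0, "long": 2.0}
pheromone = {"short": 1.0, "long": 1.0}

for _ in range(500):
    total = pheromone["short"] + pheromone["long"]
    route = "short" if random.random() < pheromone["short"] / total else "long"
    pheromone[route] += 1.0 / lengths[route]  # deposit inversely to length
    for r in pheromone:
        pheromone[r] *= 0.99                  # evaporation

share = pheromone["short"] / sum(pheromone.values())
print(round(share, 2))  # well above 0.5: collective choice via traces alone
```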

Reservoir Computing

Herbert Jaeger and Harald Haas demonstrated in “Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication” (Science, 304, 78–80, 2004) that a fixed random recurrent network (an “echo state network”) could perform powerful temporal computations by training only the output layer. Wolfgang Maass independently developed Liquid State Machines as a model of neocortical computation (Maass, Natschläger & Markram, Neural Computation, 2002). Together these established reservoir computing — the principle that complex temporal computation can be offloaded to a rich dynamical substrate, requiring only a simple readout. This prefigures the use of LLMs as “reservoirs” with learned prompting strategies as readout functions.
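
A minimal echo state network is short enough to write out (a sketch following the general Jaeger and Haas recipe; the sizes, spectral radius, and sine-prediction task are arbitrary choices): the recurrent reservoir is fixed and random, and only the linear readout is fit.

```python
import numpy as np
rng = np.random.default_rng(0)

# Minimal echo state network: fixed random reservoir, trained linear readout.

n, T = 200, 1000
W_in = rng.uniform(-0.5, 0.5, (n, 1))
W = rng.normal(0, 1, (n, n))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

u = np.sin(np.arange(T + 1) * 0.2)  # input signal; target = next value
states = np.zeros((T, n))
x = np.zeros(n)
for t in range(T):
    x = np.tanh(W @ x + W_in[:, 0] * u[t])  # reservoir update, never trained
    states[t] = x

X, y = states[200:], u[201:]                # discard warm-up transient
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n), X.T @ y)  # ridge readout
print(float(np.mean((X @ W_out - y) ** 2)))  # small one-step prediction error
```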

Spiking Neural Networks and Neuromorphic Computing

Spiking neural networks (SNNs), which process information as temporal spike patterns rather than continuous activations, connect directly to neuromorphic computing hardware (Intel’s Loihi, IBM’s TrueNorth). SNNs are more biologically faithful and dramatically more energy-efficient than conventional ANNs — an important direction as agent systems scale to edge deployment. The Liquid State Machine (Maass et al., 2002) sits at the intersection of reservoir computing and spiking networks, demonstrating that complex temporal computation can emerge from fixed random recurrent dynamics realized in spiking neurons.

Immune-Inspired Systems

The biological immune system is a distributed, adaptive pattern-recognition system that distinguishes self from non-self, learns from exposure, and maintains memory of past threats. Forrest and colleagues applied these principles in negative selection algorithms — generating detectors that fire on anomalous patterns by learning not to fire on normal ones. This approach has direct analogues in anomaly detection for agent monitoring: what behaviors are “self” (in-distribution, safe) and what are “non-self” (out-of-distribution, potentially harmful)?
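
A toy negative-selection sketch makes the censoring logic visible (the 8-bit strings and Hamming-distance matching rule are simplifications of Forrest et al.’s r-contiguous matching): detectors are generated at random, any that match “self” are discarded, and the survivors flag anomalies.

```python
import random
random.seed(4)

# Toy negative selection: censor detectors against self, keep the rest.

def matches(d, s):
    return sum(a != b for a, b in zip(d, s)) <= 1  # Hamming distance <= 1

self_set = {"00000000", "00000001", "11110000"}
detectors = []
while len(detectors) < 50:
    d = "".join(random.choice("01") for _ in range(8))
    if not any(matches(d, s) for s in self_set):   # censoring phase
        detectors.append(d)

def anomalous(sample):
    return any(matches(d, sample) for d in detectors)

print(anomalous("00000000"), anomalous("01011010"))  # False (self), likely True
```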


7. Self-Evolving Agents — The Modern Frontier

The deepest cybernetic insight applied to agents: feedback should operate on the agent’s own structure, not just its outputs. A truly adaptive agent doesn’t just learn from feedback within a fixed architecture; it modifies its architecture in response to feedback.

Voyager: Lifelong Skill Accumulation

Voyager (Wang et al., 2023; arXiv:2305.16291) is an LLM-powered agent in Minecraft that continuously explores, acquires skills (as executable code), and stores them in a growing library. New tasks are solved by retrieval and composition of previously learned skills. This is a direct implementation of autopoietic skill accumulation: the agent’s behavioral repertoire is produced and maintained by the agent’s own learning operations.

EUREKA: LLM-Driven Reward Design

EUREKA (Ma et al., 2023; arXiv:2310.12931) uses LLMs to automatically generate and refine reward functions for reinforcement learning agents. By iteratively generating reward code, evaluating it, and using the evaluation signal to improve subsequent proposals, EUREKA instantiates a cybernetic loop that operates at the level of the agent’s objective function — a meta-level feedback that Wiener could not have imagined but would have recognized immediately as cybernetic in principle.
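
The outer loop has a simple shape. In the sketch below, `llm` and `train_and_score` are stand-in stubs for an LLM call and a full RL training run (neither is a real API), so only the loop structure should be read as EUREKA’s: generate reward code, validate it empirically, and feed the evaluation back into the next generation.

```python
import random
random.seed(6)

# Shape of a EUREKA-style reward-evolution loop, with stubs in place of the
# LLM and the RL training run.

def llm(prompt: str) -> str:
    k = random.randint(1, 5)  # stand-in for sampling diverse candidates
    return f"def reward(s): return -{k} * abs(s.goal - s.pos)"

def train_and_score(reward_code: str) -> float:
    return random.random()    # stand-in for benchmark fitness after training

best_code, best_score, feedback = None, float("-inf"), "no prior attempts"
for _ in range(5):
    code = llm(f"Improve the reward function. Feedback: {feedback}")
    score = train_and_score(code)                 # empirical validation
    if score > best_score:                        # keep the best variant
        best_code, best_score = code, score
    feedback = f"last score {score:.2f}; best {best_score:.2f}"  # close the loop
print(round(best_score, 2), best_code)
```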

Darwin Gödel Machine

The Darwin Gödel Machine (Zhang et al., 2025; UBC/Sakana AI; arXiv:2505.22954) is a self-improving system that iteratively modifies its own code, empirically validates each change using coding benchmarks, and accumulates a growing library of improved agent variants via open-ended evolution. Named for Darwin (open-ended evolution) and Gödel (self-reference), it directly implements the cybernetic ideal of a system whose feedback loop targets its own structure. This is not just an agent that learns — it is an agent that evolves.

The connection back to Wiener is exact: the DGM applies negative feedback not to the agent’s outputs but to the agent’s own code. The thermostat principle, operating one level up.

PromptBreeder and Prompt Evolution

Self-referential self-improvement also appears in PromptBreeder (Fernando et al., ICML 2024; arXiv:2309.16797), which evolves both task prompts and the mutation prompts that generate new task prompts — a second-order optimization directly analogous to Bateson’s “learning to learn” (Learning II). The system applies evolutionary operators to its own optimization strategy, not just to candidate solutions.
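
The second-order structure is easier to see in code than in prose. In this sketch, `mutate` and `score` are toy stand-ins for LLM calls and benchmark evaluation; the point is that the mutation prompts themselves are occasionally mutated, so the system evolves its own optimization strategy.

```python
import random
random.seed(5)

# Structure of a PromptBreeder-style second-order loop, with toy operators.

task_prompts = ["Solve step by step.", "Answer concisely."]
mutation_prompts = ["Make it stricter:", "Add an example:"]

def mutate(prompt, mutation):
    return f"{prompt} ({mutation} v{random.randint(1, 9)})"

def score(prompt):
    return len(prompt)  # toy fitness standing in for benchmark accuracy

for _ in range(10):
    m = random.choice(mutation_prompts)
    child = mutate(random.choice(task_prompts), m)       # first order: evolve prompts
    task_prompts = sorted(task_prompts + [child], key=score)[-2:]
    if random.random() < 0.3:                            # second order: evolve the
        mutation_prompts.append(mutate(m, "Rephrase:"))  # mutation prompts themselves

print(task_prompts[-1])
```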

The Theoretical Precedent: Gödel Machines

Jürgen Schmidhuber’s original Gödel Machine (described in “Gödel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements”, in Artificial General Intelligence, Springer, 2007) proposed a theoretical framework in which a self-improving agent may only modify itself when it can prove that the modification will improve performance relative to a formal utility function. This is provably safe self-modification — though the proof requirement is computationally intractable in general, making the Darwin Gödel Machine’s empirical (benchmark-based) validation a practical approximation of this ideal.


8. Cybernetic Concepts in Modern LLM Agents

The cybernetic lineage is visible throughout modern LLM agent design, often without explicit acknowledgment.

RLHF as Cybernetic Regulation

Reinforcement Learning from Human Feedback (RLHF) is, formally, a cybernetic control loop: human preference judgments, distilled into a reward model, supply the error signal that drives updates to the model’s parameters. The target state is “behavior that humans prefer”; a KL penalty against the reference policy acts as the stabilizing constraint that keeps each correction within a viable operating region. This is exactly Wiener’s regulatory mechanism, applied at training time.
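
In symbols, the standard KL-regularized objective (the common formulation in the RLHF literature, reproduced for orientation) separates the two roles: the reward term is the corrective drive, the KL term the stabilizing constraint.

```latex
\max_{\pi_\theta}\;
\mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_\theta(\cdot \mid x)}
\big[\, r_\phi(x, y) \,\big]
\;-\;
\beta\, D_{\mathrm{KL}}\!\big(\pi_\theta(\cdot \mid x)\,\big\|\,\pi_{\mathrm{ref}}(\cdot \mid x)\big)
```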

Constitutional AI (Bai et al., 2022) extends this by encoding the goal state as a constitutional document of principles rather than learned preferences alone — closer to Beer’s System 5 (policy, values, ultimate authority) than to simple reward modeling. The emerging practice of scalable oversight — using AI models to evaluate AI outputs at a scale humans cannot — is a direct implementation of Ashby’s ultrastability: a higher-order regulatory process that supervises the first-order agent when direct human oversight is insufficient.

The ReAct Loop as Perception–Action Cycle

The ReAct framework — Yao et al., 2022 (ICLR 2023; arXiv:2210.03629) — formalizes a Reason → Act → Observe loop for LLM agents, the direct heir of the cybernetic perception–action cycle. On each iteration, the agent observes the environment state, reasons about what action to take, executes the action, and observes the result. This is Wiener’s feedback loop made explicit in prompting.
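
The loop’s skeleton, with `llm` and the tool registry as placeholder stubs rather than any real framework’s API: each cycle appends its thought, action, and observation back into the context, closing the feedback loop.

```python
# Skeleton of a ReAct-style loop. `llm` and `tools` are placeholders; the
# Reason -> Act -> Observe structure, fed back into context, is the point.

def llm(context: str) -> str:
    # Stub: a real system would call a model here. Returns "thought|tool|arg".
    return "I should finish|finish|done"

tools = {"finish": lambda arg: arg}

context = "Task: answer the question."
for _ in range(10):
    thought, tool, arg = llm(context).split("|")   # Reason
    observation = tools[tool](arg)                 # Act
    context += (f"\nThought: {thought}\nAction: {tool}({arg})"
                f"\nObservation: {observation}")   # Observe, fed back into context
    if tool == "finish":
        break
print(context)
```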

The OODA Loop

John Boyd’s OODA loop — Observe, Orient, Decide, Act — developed in his unpublished briefing Patterns of Conflict (1986) — is a cybernetic decision cycle developed independently in military strategy. Boyd’s innovation was the Orient phase: the agent’s accumulated models, cultural patterns, and prior experience that shape how raw observation is interpreted. This maps directly to the role of context, memory, and system prompt in LLM agent reasoning.

Boyd also emphasized the critical importance of decision tempo — an agent that can cycle through OODA faster than its adversary or environment gains an advantage. In multi-agent competitive settings (game-playing agents, adversarial red-teaming), the speed and quality of the observe–orient–decide–act cycle are the key performance dimensions. Boyd never published a book; his ideas survive primarily through his briefings, now archived at coljohnboyd.com.

Multi-Agent Coordination as Beer’s System 2

When multiple worker agents operate in parallel, they produce scheduling conflicts, duplicate work, and resource contention — exactly the “oscillations” that Beer’s System 2 is designed to damp. Modern agent orchestration frameworks (LangGraph, AutoGen, CrewAI) implement System 2 coordination primitives, typically without knowing it. Beer’s framework offers a normative theory for what those coordination mechanisms should accomplish.

Beer distinguished between autonomic regulation (automatic, low-level, like breathing) and somatic regulation (deliberate, high-level, like deciding to hold one’s breath). System 2 in the VSM is autonomic — it should operate without consuming System 3’s managerial attention. In agent systems, this maps to the design goal that low-level scheduling, deduplication, and conflict resolution should happen automatically in the communication layer, not occupy the orchestrating LLM’s context window.

Tool Use as Extended Effector

When an LLM agent uses a calculator, a web search engine, or a code interpreter, it extends its effector system beyond its core model — just as an organism uses tools to extend its motor capabilities. This is consistent with cybernetic notions of extended agency: the system boundary is defined functionally by what the agent can control, not by the physical substrate of the model.

This connects to Andy Clark and David Chalmers’ extended mind thesis (1998, Analysis, 58(1), 7–19) — the philosophical claim that cognitive processes can extend beyond the brain into the environment. For LLM agents, the question is not merely philosophical: it has practical implications for capability assessment. An agent with web search access has a qualitatively different capability profile than one without — and the cybernetic concept of requisite variety explains why.

Memory as Adaptive Regulation

External memory systems in agents (vector stores, knowledge graphs, episodic buffers) implement adaptive regulation: the agent’s behavior is regulated by its accumulated experience, allowing it to converge on effective strategies over time. This is the cognitive analogue of Prigogine’s dissipative structures — the agent’s memory is the ordered structure maintained far from the equilibrium of a stateless model.

Cybernetics distinguishes between algedonic signals (pure pain/pleasure signals that bypass normal processing to trigger emergency responses) and ordinary feedback. Modern agent systems increasingly incorporate analogous mechanisms: hard safety refusals that interrupt normal reasoning, anomaly detectors that escalate to human review, and confidence thresholds that trigger fallback behaviors. Beer’s VSM formalized where such signals should enter the organizational hierarchy — directly into System 3 and System 5, bypassing lower-level coordination.

Homeostasis in Agent Behavior

Constitutional AI, safety fine-tuning, and system prompt constraints are homeostatic mechanisms: they define a safe operating region and apply corrective forces when the agent’s outputs deviate. The thermostat principle, now applied to values and safety properties.

Cybernetics also suggests that robust homeostasis requires multiple overlapping regulatory mechanisms operating at different timescales — not a single constraint layer. Short-term: per-turn output filtering. Medium-term: RLHF adjusting the model’s policy. Long-term: training data curation and architectural choices. Beer designed the VSM with exactly this multi-timescale structure, a design principle that agent safety research is beginning to rediscover independently.

Amplifying Intelligence: Beer’s Management Amplifiers

One underappreciated concept from Beer is the management amplifier: a mechanism that allows System 3 to control a much larger System 1 than would otherwise be possible — analogous to how a small electrical signal controls a large power output in electronics. In agent systems, this is the principle behind orchestrator-worker architectures: a single planning LLM can direct dozens of specialized worker agents, amplifying its effective intelligence by parallelism and specialization. Beer provided the formal conditions under which such amplification preserves viability — conditions that current agent orchestration systems would benefit from examining.


9. Critical and Continental Perspectives

The cybernetic tradition extends well beyond Anglophone systems theory. Critical scholarship, continental philosophy of technology, and the Soviet intellectual tradition each produced thinkers who engaged directly with cybernetics — and whose frameworks sharpen our analysis of LLM agents in ways that the mainstream traditions often miss.

N. Katherine Hayles: Embodied Information and the Posthuman

Where Shannon treated information as substrate-independent — a measure of uncertainty separable from any particular physical medium — N. Katherine Hayles argues that this move was a conceptual error with far-reaching consequences. In How We Became Posthuman: Virtual Bodies in Cybernetics, Literature, and Informatics (University of Chicago Press, 1999), Hayles traces how the postwar cybernetic tradition, beginning with the Macy Conferences, came to treat information as disembodied — able to “travel” from one physical substrate to another while remaining invariant. Her argument: this disembodiment was not a discovery but a construction, achieved by abstracting away precisely the material features that make information meaningful.

Hayles is not rejecting cybernetics but critiquing it from within. She distinguishes between pattern and randomness (Shannon’s poles) and presence and absence (the phenomenological register that Shannon’s framework excludes). Information always has a body — it is always enacted in a specific material substrate — and eliding that embodiment has costs: it licenses the fantasy that consciousness is substrate-independent, that brains are interchangeable with silicon, that persons are information patterns that could be “downloaded.”

The relevance to LLM agents is pointed. Arguments that LLMs “know” or “believe” things in a substrate-independent way — that knowledge is simply a pattern in the weights, portable across hardware — reproduce exactly the disembodiment move that Hayles critiques. The same weights on different hardware may be computationally identical, but Hayles would press: what counts as “the same”? Information is always instantiated; an LLM’s “knowledge” is not separable from the architecture, training regime, and hardware that realize it. This matters for questions of capability transfer, deployment, and model evaluation.

Hayles’ later work deepens this analysis. Unthought: The Power of the Cognitive Nonconscious (University of Chicago Press, 2017) argues that cognition — pattern recognition and response to environmental change — is far more widespread than consciousness, and that technical cognizers (neural networks, robots, information systems) already participate in cognitive assemblages with humans. This framework directly addresses LLMs: they are cognitive systems in Hayles’ sense even without any claim to consciousness, and their entanglement with human cognizers produces hybrid cognitive dynamics that neither a purely behaviorist nor a purely internalist account can capture. Her earlier Writing Machines (MIT Press, 2002) explored how material instantiation shapes the meaning of digital text — a theme that resonates with debates about whether LLM outputs mean anything independently of the human interpretive context in which they are received.

Gilbert Simondon: Individuation, Transduction, and the Process Ontology of Technical Objects

Gilbert Simondon (1924–1989) is the French philosopher of technology who, writing in direct dialogue with the cybernetic tradition, developed perhaps the most sophisticated philosophical account of what it means for a technical object to become — to individuate, develop, and persist in relation to its environment.

In On the Mode of Existence of Technical Objects (1958; English trans. Cecile Malaspina & John Rogove, Univocal Publishing, 2017, ISBN 9781517904876), Simondon argues against treating machines as mere instruments — tools whose value is entirely in their use by humans. Instead, he develops an ontology of technical objects as entities with their own mode of existence: they individuate, develop internal coherence (concretization), and evolve in relation to their environment. The mature technical object is not a collection of parts fulfilling a function but a convergent structure in which each element serves multiple functions simultaneously — a whole that is more than the sum of its parts.

Simondon’s deeper metaphysical contribution is his theory of individuation, developed in Individuation in Light of Notions of Form and Information (1958/2005; English trans. Taylor Adkins, University of Minnesota Press, 2020). Against Aristotelian hylomorphism — the idea that form is simply imposed on passive matter — Simondon argues that individuals are not pre-formed entities but processes: individuation is the ongoing operation by which an individual emerges from and remains in relation to a pre-individual field of potentials. Nothing is simply “given” as an individual; individuation is always in progress.

The key operation is transduction: the propagation of an activity from one domain to another, where each domain’s resolution provides the conditions for the next domain’s constitution. Simondon illustrates with a crystal growing in a supersaturated solution — the crystalline structure propagates outward as each newly formed layer provides the template for the next. This is not mere information transfer; it is the constitution of structure as it propagates.
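
The shape of the operation is easy to state computationally. The toy loop below is a deliberately crude illustration (a caricature, not a model of crystal physics; all names are illustrative): each newly resolved layer is generated from its immediate predecessor rather than copied from the seed, so structure is constituted as it propagates.

```python
# A toy illustration of transduction: each resolved layer provides the
# template for the next. A caricature, not crystal physics.

def transduce(seed: str, steps: int) -> list[str]:
    """Propagate a motif outward from a seed, layer by layer."""
    layers = [seed]
    for _ in range(steps):
        previous = layers[-1]
        # The newest layer, not the original seed, templates the next
        # one: here, by rotating the motif one position.
        layers.append(previous[1:] + previous[0])
    return layers

print(transduce("abc", 4))  # ['abc', 'bca', 'cab', 'abc', 'bca']
```

The only point of the toy is its dependency structure: remove any intermediate layer and everything after it changes, because no layer derives from the seed directly.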

Applied to LLM agents, Simondon’s framework yields several productive reframings. First, an LLM is not a fixed object (a weight tensor in storage) but a process that individuates anew with each inference — the same weights produce a different individual depending on context, conversation history, tool environment, and hardware. This resonates with the “instance agent” framing in contemporary agent philosophy: the relevant unit of analysis is not the model but the running instance. Second, Simondon’s transduction maps onto the layer-by-layer propagation of attention in transformer architectures: each layer’s representation provides the structured field from which the next layer’s activity emerges. Third, his critique of hylomorphism challenges input–output views of LLMs (treating the prompt as passive “matter” that the model “forms”) by pointing to the emergent, relational character of the output — neither pre-given in the input nor pre-formed in the weights, but constituted in the operation between them.
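
The second reframing can be made concrete in a few lines of generic PyTorch. This is our gloss, not anything in Simondon's text, and the model below is a stock encoder stack with arbitrary dimensions rather than any particular architecture: the point is only that the final representation is produced by propagation through intermediate resolutions, with no layer operating on the raw input.

```python
# A minimal sketch of the "transductive" reading of a transformer
# forward pass. Generic PyTorch; all dimensions are arbitrary.
import torch
import torch.nn as nn

d_model, n_layers = 64, 4
layers = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
     for _ in range(n_layers)]
)

x = torch.randn(1, 10, d_model)  # embedded prompt: the initial "field"
for layer in layers:
    # Each layer resolves the previous representation into the
    # conditions for the next layer's activity.
    x = layer(x)
print(x.shape)  # torch.Size([1, 10, 64])
```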

Simondon was a direct intellectual contemporary of the cybernetics movement — he engaged closely with the work of Wiener and Shannon and with the reception of cybernetics in France — and his work is best read as a philosophical elaboration of cybernetic themes, not merely a critique from outside.

Viktor Glushkov and the Soviet Cybernetic Tradition

The history of cybernetics is not a purely American or British story. In the Soviet Union, an independent and formidable cybernetic tradition developed under Viktor Glushkov (1923–1982), mathematician and director of the Institute of Cybernetics in Kyiv — sometimes called the “founding father of Soviet cybernetics” (see Cybernetics in the Soviet Union — Wikipedia).

Glushkov’s technical contributions spanned abstract automata theory, the mathematical foundations of digital computing, and the theory of information transformations. His most ambitious project was OGAS (Общегосударственная автоматизированная система — “All-State Automated System”), proposed in 1962: a nationwide real-time computer network, using the existing telephone infrastructure, for managing the Soviet economy. OGAS would have linked a central computing facility in Moscow with up to 200 regional centers and 20,000 local terminals, allowing any node to communicate with any other in real time. Glushkov further envisioned moving the Soviet Union toward a moneyless electronic economy managed through the network — a distributed information system at national scale.

OGAS was never built. As Benjamin Peters recounts in How Not to Network a Nation: The Uneasy History of the Soviet Internet (MIT Press, 2016), the project was defeated not by technical failure but by bureaucratic competition — each ministry, unwilling to share its data with the others, quietly obstructed the proposal. Peters’ central irony is pointed: it was the capitalist actors (American corporations and universities building ARPANET) who cooperated on network infrastructure, while the socialist actors (Soviet ministries) refused to share resources. The proposal predated ARPANET’s initial funding in 1966 and represents a distinct, parallel path in the history of computer networking.

Glushkov’s relevance to LLM agent systems is primarily historical and political-structural. OGAS was conceived as a distributed, multi-node system for coordinating complex resource allocation across a large, heterogeneous network — precisely the structure of modern large-scale multi-agent deployments (see Multi-Agent Systems →). His project also illustrates an enduring question that the American cybernetic tradition tended to elide: distributed information systems are not politically neutral. Who controls the network, who shares data with whom, and who can refuse — these are not merely technical questions but questions of governance and power. As multi-agent systems scale to organizational and infrastructure levels, Glushkov’s history is a reminder that the governance structures surrounding information networks shape them as profoundly as the technical protocols.


The Synthesis: Why Cybernetics Matters for Agent Research

Cybernetics provides something that the current agent literature often lacks: a unified theoretical vocabulary for control, communication, and adaptation that spans biology, organizations, and machines. The recurring themes:

  1. Feedback loops are the universal mechanism of purposive behavior — from thermostats to RLHF to AFlow’s iterative workflow refinement (sketched in code after this list).
  2. Requisite variety (Ashby) determines the minimum complexity needed to control a system — and explains why increasingly capable models require increasingly complex tool ecosystems (see the same sketch below).
  3. Recursive organization (VSM) explains how autonomous units compose into larger autonomous units — the foundational principle of multi-agent orchestration at scale.
  4. Autopoiesis defines the boundary between a system and its environment, and points toward agents that maintain their own operational continuity across interactions.
  5. Self-organization explains how order emerges without a designer — relevant to emergent agent behaviors and capability jumps at scale.
  6. Active inference unifies perception and action under a single optimization principle — providing a normative framework for Bayesian agent design (the core quantity is stated after this list).
  7. Second-order reflexivity (von Foerster) demands that we account for the observer in any model — a principle with direct implications for alignment: an agent’s “world model” is always the product of its own architecture.
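
Two of these themes are compact enough to state executably. The sketch below uses illustrative values throughout: theme 1 appears as a proportional negative-feedback loop converging on a setpoint, and theme 2 as a counting argument, since a controller that can only distinguish |R| responses cannot compress |D| possible disturbances into fewer than |D|/|R| distinct outcomes.

```python
# Themes 1 and 2 in miniature. Illustrative numbers only.
import math

# 1. Negative feedback: act against the error until the gap closes.
state, setpoint, gain = 15.0, 20.0, 0.5
for _ in range(20):
    error = setpoint - state      # measure the deviation
    state += gain * error         # corrective action shrinks it
assert math.isclose(state, setpoint, abs_tol=1e-3)

# 2. Requisite variety: with |D| = 12 disturbances and only |R| = 3
# responses, residual variety is at least |D| / |R| = 4 outcomes.
disturbances = range(12)
responses = [0, 4, 8]
residual = {d - min(responses, key=lambda r: abs(d - r)) for d in disturbances}
assert len(residual) >= 12 // 3   # here: 5 distinct residual outcomes
```

Theme 6 also has a standard one-line formulation. In the notation of Friston (2010), an agent maintains a recognition density q(s) over hidden states s and scores its observations o by the variational free energy:

```latex
F = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
  = \underbrace{D_{\mathrm{KL}}\big[q(s) \,\|\, p(s \mid o)\big]}_{\text{perception}} - \ln p(o)
```

Perception minimizes F with respect to q (inference); action minimizes expected free energy by seeking out the observations the generative model predicts. One quantity, two gradients.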

Gaps and Open Questions

The cybernetic tradition also highlights sharp open questions for agent research:

  • Boundary problem: Where does the agent end and the environment begin? When an agent’s context window contains tool outputs, memories, and other agents’ messages, the “agent” is already a distributed system. Maturana’s organizational closure offers a more rigorous way to draw this boundary than the ad hoc conventions of current practice.
  • Control versus autonomy: Beer’s VSM formalizes the tension between central control (Systems 3 and 5) and local autonomy (System 1). Current multi-agent systems often resolve this tension crudely — either by over-centralizing (a single orchestrator bottleneck) or under-coordinating (independent agents producing conflicting results). The VSM suggests more nuanced architectures.
  • Viability at scale: A VSM is viable only if each recursive level has sufficient requisite variety. As multi-agent systems grow, do coordination layers scale at the required rate? This is an open empirical question with a strong theoretical grounding.
  • Self-modification and safety: Cybernetics celebrates adaptive self-modification; AI safety is (rightly) cautious about it. Beer’s framework suggests that safe self-modification requires maintaining the invariant organization of the system (its values, constraints, identity) while allowing structural variation (tool updates, memory modification, skill acquisition). This is exactly the alignment problem, restated in cybernetic terms (a minimal sketch of the distinction follows this list).
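
The organization/structure distinction in the last point can be restated in code. The sketch below is a restatement, not a proposal for an actual safety mechanism, and every name in it is hypothetical: organizational invariants are immutable predicates, structural state (tools, memory) is free to vary, and a structural change is committed only if every invariant still holds afterwards.

```python
# A sketch of Beer's organization/structure distinction as a guard on
# self-modification. Hypothetical names; not a real safety mechanism.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    # Organization: invariants that define the system's identity.
    invariants: list[Callable[["Agent"], bool]]
    # Structure: the parts that are allowed to vary.
    tools: dict[str, Callable] = field(default_factory=dict)
    memory: list[str] = field(default_factory=list)

    def propose_structural_change(self, change: Callable[["Agent"], None]) -> bool:
        """Commit a change only if every organizational invariant survives it."""
        snapshot = (dict(self.tools), list(self.memory))
        change(self)
        if all(check(self) for check in self.invariants):
            return True                         # structure varied, organization preserved
        self.tools, self.memory = snapshot      # roll back: viability first
        return False

agent = Agent(invariants=[lambda a: "delete_user_data" not in a.tools])
ok = agent.propose_structural_change(
    lambda a: a.tools.update({"search": lambda q: f"results for {q}"})
)
print(ok)  # True: a tool was added without violating the invariant
```

The rollback is the cybernetically important part: a change that violates organization is not repaired but refused, which is Beer’s viability criterion in miniature.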

As LLM agents grow more complex — multi-agent, self-modifying, recursively self-improving — the cybernetic tradition will prove to be not a historical curiosity but an increasingly practical framework for design, analysis, and safety.


References

Books & Classic Papers

  • Wiener, N. (1948). Cybernetics: Or Control and Communication in the Animal and the Machine. MIT Press. MIT Press Open Access | archive.org
  • McCulloch, W.S. & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115–133. DOI:10.1007/BF02478259
  • Shannon, C.E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423. DOI:10.1002/j.1538-7305.1948.tb01338.x
  • Glanville, R. (1982). Inside every white box there are two black boxes trying to get out. Behavioral Science, 27(1), 1–11. DOI:10.1002/bs.3830270102
  • Glanville, R. (1999). Researching design and designing research. Design Issues, 15(2), 80–91. PDF
  • Glanville, R. (2002). Second order cybernetics. In F. Parra-Luna (Ed.), Systems science and cybernetics (Encyclopaedia of Life Support Systems). EoLSS. PDF via pangaro.com
  • Glanville, R. (2011). Introduction: A conference doing the cybernetics of cybernetics. Kybernetes, 40(7/8), 952–963. DOI:10.1108/03684921111160197
  • Leonard, A. (2000). The viable system model and knowledge management. Kybernetes, 29(5/6), 710–715. DOI:10.1108/03684920010333143
  • Beer, S. (1994). Beyond Dispute: The Invention of Team Syntegrity. John Wiley & Sons.
  • Pask, G. (1975). Conversation, Cognition and Learning: A Cybernetic Theory and Methodology. Elsevier. ISBN 0-444-41193-3. Amazon
  • Pask, G. (1976). Conversation Theory: Applications in Education and Epistemology. Elsevier. ISBN 0-444-41424-X. archive.org
  • Ashby, W.R. (1956). An Introduction to Cybernetics. John Wiley. archive.org
  • Von Foerster, H. (Ed.). (1974). Cybernetics of Cybernetics (BCL Report 73.38). University of Illinois. Springer chapter
  • Bateson, G. (1972). Steps to an Ecology of Mind. Ballantine Books; repr. University of Chicago Press, 2000. UChicago Press | archive.org
  • Beer, S. (1972). Brain of the Firm. Allen Lane / Penguin Press. archive.org
  • Beer, S. (1979). The Heart of Enterprise. John Wiley & Sons.
  • Beer, S. (1985). Diagnosing the System for Organizations. John Wiley & Sons. archive.org
  • Maturana, H.R. & Varela, F.J. (1980). Autopoiesis and Cognition: The Realization of the Living. D. Reidel Publishing. Springer | archive.org
  • Varela, F.J., Thompson, E. & Rosch, E. (1991). The Embodied Mind: Cognitive Science and Human Experience. MIT Press. DOI:10.7551/mitpress/6730.001.0001
  • Holland, J.H. (1975). Adaptation in Natural and Artificial Systems. University of Michigan Press; repr. MIT Press, 1992. MIT Press
  • Prigogine, I. & Stengers, I. (1984). Order Out of Chaos: Man’s New Dialogue with Nature. Bantam Books. archive.org
  • Kauffman, S.A. (1993). The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press.
  • Luhmann, N. (1984/1995). Social Systems. Trans. John Bednarz, Jr. with Dirk Baecker. Stanford University Press. Stanford UP
  • Boyd, J.R. (1986). Patterns of Conflict (unpublished briefing). PDF via coljohnboyd.com
  • Hayles, N.K. (1999). How We Became Posthuman: Virtual Bodies in Cybernetics, Literature, and Informatics. University of Chicago Press. UChicago Press
  • Hayles, N.K. (2002). Writing Machines. MIT Press. MIT Press
  • Hayles, N.K. (2017). Unthought: The Power of the Cognitive Nonconscious. University of Chicago Press. UChicago Press
  • Simondon, G. (1958/2017). On the Mode of Existence of Technical Objects. Trans. Cécile Malaspina & John Rogove. Univocal Publishing. ISBN 9781517904876. UMN Press
  • Simondon, G. (1958/2020). Individuation in Light of Notions of Form and Information. Trans. Taylor Adkins. University of Minnesota Press. UMN Press
  • Peters, B. (2016). How Not to Network a Nation: The Uneasy History of the Soviet Internet. MIT Press. MIT Press

Modern Papers

  • Pask, G. (1996). Heinz von Foerster’s self-organisation, the progenitor of conversation and interaction theories. Systems Research, 13(3), 349–362.
  • De Zeeuw, G. (2001). Interaction of actors theory. Kybernetes, 30(7–8), 971–983. DOI:10.1108/03684920110396864
  • Tilak, S., Manning, T., Glassman, M., Pangaro, P. & Scott, B. (2024). Gordon Pask’s Conversation Theory and Interaction of Actors Theory: Research to Practice. Enacting Cybernetics. DOI:10.58695/ec.11
  • Manning, T. (2025). A Concept Must Be Some Kind of Process. Enacting Cybernetics, 3(1):3. DOI:10.58695/ec.19
  • Battle, S. (2025). Conversation Theory for Design Agents. Relating Systems Thinking and Design (RSD) Symposium. (Applies Pask’s CT explicitly to LLM agent design; shows that entangled conversations between an LLM and a critic produce more stable, performant systems than either alone.)
  • Dubberly, H., & Pangaro, P. (2009). What is conversation? How do we design for effective conversation? ACM Interactions, July/August 2009.
  • Jaeger, H. & Haas, H. (2004). Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science, 304, 78–80. DOI:10.1126/science.1091277
  • Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138. DOI:10.1038/nrn2787
  • Friston, K. et al. (2022). The free energy principle made simpler but not too simple. arXiv:2201.06387
  • Yao, S. et al. (2023). ReAct: Synergizing reasoning and acting in language models. ICLR 2023. arXiv:2210.03629
  • Wang, G. et al. (2023). Voyager: An open-ended embodied agent with large language models. arXiv:2305.16291
  • Ma, Y.J. et al. (2023). Eureka: Human-level reward design via coding large language models. ICLR 2024. arXiv:2310.12931
  • Zhang, J. et al. (2024). AFlow: Automating agentic workflow generation. ICLR 2025 Oral. arXiv:2410.10762
  • Zhang, J. et al. (2025). Darwin Gödel Machine: Open-ended evolution of self-improving agents. University of British Columbia / Sakana AI. arXiv:2505.22954
  • Fernando, C. et al. (2024). PromptBreeder: Self-referential self-improvement via prompt evolution. ICML 2024. arXiv:2309.16797
  • Colorni, A., Dorigo, M. & Maniezzo, V. (1992). Distributed optimization by ant colonies. Proc. First European Conference on Artificial Life, 134–142. Scholarpedia overview
  • Maass, W., Natschläger, T. & Markram, H. (2002). Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation, 14(11), 2531–2560. Semantic Scholar
  • Wu, J. et al. (2025). Towards Open Complex Human-AI Agents Collaboration Systems for Problem Solving and Knowledge Management. arXiv:2505.00018. (Grounds a new AI agent architecture in Conversation Theory, cybernetics, and autopoiesis.)
  • Glassman, M., Manning, T., & Tilak, S. (2024). A Cybernetic Perspective on Generative AI in Education: From Transmission to Coordination. International Journal of Interactive Multimedia and Artificial Intelligence. (Applies Bateson and cybernetic frameworks to generative AI in education.)


Tip

Further reading: For an applied introduction to Beer’s VSM and its organizational uses, the Systems Talk resource on Stafford Beer is an excellent starting point. For the biological side of cybernetics, Francisco Varela’s work — including Principles of Biological Autonomy (North-Holland, 1979) — provides the deepest treatment of organizational closure, autonomy, and self-reference in living systems.


Back to Topics → · See also: Cognitive Architectures → · Multi-Agent Systems → · Agent Societies & Simulation →