Most agent frameworks optimize the capability axis: tools, memory, self-modification, multi-step planning. That axis answers *what powers exist*.
This piece adds a different lens — the emergence axis — borrowed loosely from biology. The question is qualitative: what kind of entity is the system turning into, not only which features are toggled on.
It pairs cleanly with the capability-axis map on this site. Same territory, different ruler.
Atom — inference
The smallest unit: one prompt, one completion. No durable state, no memory, no branching policy. You ask; it answers.
```
"What is the capital of Japan?" → "Tokyo"
```
Like an atom in physics (as metaphor, not chemistry), inference is the irreducible building block. Everything above is made of these calls.
Traits: stateless, single turn, no decisions beyond next-token prediction, human is the full operator. Stateless chat APIs are the canonical form — e.g. OpenAI Chat Completions or Anthropic Messages with no tools and no server-side memory.
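A minimal sketch of the atom tier, treating the model as a pure function from prompt to completion. `fake_model` and `infer` are hypothetical names, and `fake_model` is a stand-in for a real chat API call so the block runs offline; nothing here mirrors any vendor's actual client.

```python
from typing import Callable

# Hypothetical stand-in for a hosted model; a real system would call a chat API here.
def fake_model(prompt: str) -> str:
    answers = {"What is the capital of Japan?": "Tokyo"}
    return answers.get(prompt, "I don't know")

def infer(prompt: str, model: Callable[[str], str] = fake_model) -> str:
    """One prompt in, one completion out. No history is read or written."""
    return model(prompt)

first = infer("What is the capital of Japan?")
# The second call has no memory of the first, so "its" cannot be resolved.
second = infer("And its population?")
```

The second call shows the statelessness: any continuity has to be supplied by the human operator, who would need to restate the subject in the prompt.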
Molecule — chain / pipeline
String atoms in a fixed order and you get a molecule: each step’s output becomes the next step’s input. Example triage for user feedback:
- Classify — bug vs feature vs complaint.
- Extract — severity, surface area, repro steps into structured fields.
- Generate — draft a ticket body from that structure.
The crucial constraint: the graph is chosen before execution. Step 2 always follows step 1. No mid-flight branching, no skipping, no inventing a new stage. The LLM executes; it does not govern the graph.
Analogy: a conveyor belt. Stations A → B → C are bolted to the floor.
Traits: predetermined steps, sequential data flow, no agency over topology — powerful versus a single call, still rigid. Classic “LLM chains” in early LangChain-style tutorials sit here; orchestration DSLs that only route fixed graphs behave the same even if they look fancy.
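The triage example can be sketched as a fixed list of steps. This is an illustration under assumptions: `classify`, `extract`, and `generate` are hypothetical keyword-based stand-ins for three LLM calls, and `Ticket` is an invented structure.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Ticket:
    kind: str
    severity: str
    body: str

# Each step is a hypothetical stand-in for one LLM call.
def classify(feedback: str) -> dict:
    text = feedback.lower()
    kind = "bug" if "crash" in text or "error" in text else "feature"
    return {"feedback": feedback, "kind": kind}

def extract(state: dict) -> dict:
    severity = "high" if "crash" in state["feedback"].lower() else "low"
    return {**state, "severity": severity}

def generate(state: dict) -> Ticket:
    body = f"[{state['kind']}/{state['severity']}] {state['feedback']}"
    return Ticket(state["kind"], state["severity"], body)

# The graph is fixed before execution: this order never changes at runtime.
PIPELINE: list[Callable] = [classify, extract, generate]

def run_pipeline(feedback: str):
    value = feedback
    for step in PIPELINE:
        value = step(value)  # each step's output is the next step's input
    return value

ticket = run_pipeline("App crashes when I open settings")
```

Nothing inside a step can reorder, skip, or extend `PIPELINE`; that is the rigidity the conveyor-belt analogy points at.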
Cell — agent
A cell is not “more molecules.” It is alive enough to steer. An agent runs a loop — observe → think → act → observe — and decides which tool fires next, when to stop, and whether to add steps nobody hardcoded.
Versus the conveyor: the worker walks to whichever station matters.
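A rough sketch of that loop follows. The tools are stubs and `policy` is a hypothetical rule-based stand-in for the model's decision; a real agent would ask an LLM to pick the next action. The point is only that the next step depends on what has been observed so far, and that the loop itself decides when to stop.

```python
from typing import Callable

# Hypothetical tools; real ones would touch files, APIs, or a browser.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda query: f"stub result for {query!r}",
    "calculator": lambda expr: str(sum(float(x) for x in expr.split("+"))),
}

def policy(goal: str, observations: list[str]) -> tuple[str, str]:
    """Stand-in for the model's choice. Returns (action, argument)."""
    if not observations:
        tool = "calculator" if "+" in goal else "search"
        return (tool, goal)
    return ("finish", observations[-1])

def run_agent(goal: str, max_steps: int = 5) -> tuple[str, list[str]]:
    observations: list[str] = []
    trace: list[str] = []
    for _ in range(max_steps):                      # hard cap as a guardrail
        action, arg = policy(goal, observations)    # think
        trace.append(action)
        if action == "finish":                      # the loop decides when to stop
            return arg, trace
        observations.append(TOOLS[action](arg))     # act, then observe
    return "step limit reached", trace

answer, trace = run_agent("2+3")
```

Unlike the pipeline, no list of steps exists before the run; `trace` is only known afterwards.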
Useful multipliers (they multiply; a zero anywhere tends to zero the outcome):
- Tools — without them, the loop is only words; with them, the loop changes files, APIs, browsers.
- Context — retrieval, files, telemetry; richer state → better choices.
- Instructions — system policy, safety, style, scope.
- Planning — explicit decomposition before action.
- Memory — what happened last turn / last week still reachable.
- Self-correction — critique–revise inside the loop.
- Model quality — the substrate for all of the above.
The shorthand I keep: model × tools × context × instructions. If any leg is weak for your use case, the product shows it.
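To make the "multiply, not add" claim concrete, here is a toy calculation. The 0-to-1 scores are invented for illustration; the article does not define a scoring scale, and `agent_effectiveness` is a hypothetical name.

```python
from math import prod

def agent_effectiveness(factors: dict[str, float]) -> float:
    """Multiply per-factor scores in [0, 1]. Illustrative scoring only."""
    return prod(factors.values())

strong = agent_effectiveness(
    {"model": 0.9, "tools": 0.9, "context": 0.9, "instructions": 0.9}
)
no_tools = agent_effectiveness(
    {"model": 0.9, "tools": 0.0, "context": 0.9, "instructions": 0.9}
)
```

Four good-but-imperfect legs already compound to roughly 0.66, and a single zero wipes out the rest, which an additive model would hide.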
Traits: dynamic control flow, tool calls chosen not scripted, can recover from surprises, still one entity / one mission. The ReAct pattern is the standard reference for tool loops with reasoning traces; most shipped “agents” are cells with guardrails.
Organ — multi-agent system
Specialized cells + coordination → an organ. Crowds are not organs. A loose bag of unrelated agents is noise. An organ is a team: roles, interfaces, shared objective.
Example: automated PR review.
- PR analyzer — parse diff, classify change type.
- Style reviewer — conventions, readability, naming.
- Security auditor — secrets, dependencies, risky patterns (often parallel to style).
- Test strategist — coverage gaps, edge cases.
- Coordinator — merge streams, resolve conflicts, request rework.
Difference from a pipeline: each specialist has autonomy inside its scope. Security can deep-scan without being “told” line-by-line. The coordinator can send work back upstream. Feedback loops exist; it is not only left-to-right.
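One way to picture the coordinator's feedback loop, as a sketch rather than a reference design: two hypothetical specialists run in parallel, and the coordinator sends the diff back for rework until findings clear or a round limit is hit. `author_rework` is an invented stand-in for "send work back upstream".

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical specialists; each would be its own agent loop in practice.
def style_reviewer(diff: str) -> list[str]:
    too_long = any(len(line) > 79 for line in diff.splitlines())
    return ["line too long"] if too_long else []

def security_auditor(diff: str) -> list[str]:
    return ["possible hardcoded secret"] if "API_KEY=" in diff else []

def author_rework(diff: str, findings: list[str]) -> str:
    """Stand-in for upstream rework: here it only drops the offending line."""
    return "\n".join(line for line in diff.splitlines() if "API_KEY=" not in line)

def coordinate(diff: str, max_rounds: int = 3) -> tuple[str, int]:
    specialists = [style_reviewer, security_auditor]
    for round_no in range(1, max_rounds + 1):
        with ThreadPoolExecutor() as pool:          # specialists run in parallel
            results = list(pool.map(lambda review: review(diff), specialists))
        findings = [item for result in results for item in result]
        if not findings:
            return "approved", round_no
        diff = author_rework(diff, findings)        # feedback loop, not only left-to-right
    return "needs human review", max_rounds

verdict, rounds = coordinate("x = 1\nAPI_KEY='abc'\n")
```

The second round exists only because a specialist objected in the first, which is the difference from a fixed left-to-right chain.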
Common coordination shapes:
- Hierarchical — lead delegates, merges results.
- Peer-to-peer — agents message each other without a single hub.
- Pipeline-of-agents — fixed sequence, but each hop is a thinking cell, not a dumb transform.
- Blackboard — shared writable state everyone reacts to; a classic pattern in distributed AI (see blackboard systems).
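Of these shapes, the blackboard is the least familiar, so here is a small sketch of it. The two knowledge sources and the board keys (`diff`, `files`, `risk`) are hypothetical; the idea shown is only that agents react to shared state rather than to each other.

```python
# Hypothetical knowledge sources: each inspects the board and may add to it.
def parse_source(board: dict) -> bool:
    if "diff" in board and "files" not in board:
        board["files"] = [
            line[4:] for line in board["diff"].splitlines() if line.startswith("+++ ")
        ]
        return True
    return False

def risk_source(board: dict) -> bool:
    if "files" in board and "risk" not in board:
        risky = any(path.endswith("auth.py") for path in board["files"])
        board["risk"] = "high" if risky else "low"
        return True
    return False

def run_blackboard(board: dict, sources) -> dict:
    # Keep cycling until no source has anything new to contribute.
    progress = True
    while progress:
        progress = any([source(board) for source in sources])
    return board

# Sources are registered in the "wrong" order on purpose.
board = run_blackboard(
    {"diff": "+++ app/auth.py\n+++ README.md"}, [risk_source, parse_source]
)
```

Registration order does not matter here: `risk_source` simply waits until `files` appears on the board.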
Traits: parallelism, explicit roles, still triggered from the outside — job in, result out, then idle. Think freelance squad: engagement scoped, then disbands.
Frameworks in this band include LangGraph, CrewAI, and AutoGen / AG2 — they give topology and messaging primitives; you still supply policy and evaluation.
Organism — autonomous system
Organs wait for a ticket. Organisms keep running.
The shift: continuous observe → think → act even when no human ping arrives. They notice drift, accumulate hypotheses, and sometimes act without a fresh prompt.
Concrete contrast:
- Organ — open PR → review agents run → comment → done.
- Organism — always watching the repo; notices three similar PRs touching the same module, flags emerging tech debt, files a suggestion, tracks adoption, adapts the next nudge based on what landed.
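The contrast can be sketched as a watcher that keeps state across PRs instead of handling each one in isolation. `RepoWatcher`, its threshold of three, and the "top-level directory equals module" rule are all assumptions made for illustration.

```python
from collections import Counter

class RepoWatcher:
    """Keeps state across events instead of reviewing each PR in isolation."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.touches = Counter()   # module name -> number of PRs that touched it
        self.flagged = set()       # modules already reported, to avoid repeat nudges

    def observe(self, pr_files: list[str]) -> list[str]:
        suggestions = []
        # Assumption: the top-level directory names the module.
        modules = {path.split("/")[0] for path in pr_files}
        for module in sorted(modules):
            self.touches[module] += 1
            if self.touches[module] >= self.threshold and module not in self.flagged:
                self.flagged.add(module)
                suggestions.append(
                    f"{module}: {self.touches[module]} recent PRs, possible tech debt"
                )
        return suggestions

watcher = RepoWatcher()
prs = (["billing/a.py"], ["billing/b.py", "docs/x.md"], ["billing/c.py"])
out = [watcher.observe(files) for files in prs]
```

No single PR triggers the suggestion; it only appears because the watcher remembered the first two.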
Three layers that turn an organ into an organism:
Layer 1 — nervous system
More than routing. Orchestration connects boxes; a nervous system interprets intent, does light cross-domain synthesis, prioritizes what matters now, and keeps thread continuity across sessions.
Operator switchboard vs brain: the brain models *why* you called and sometimes moves before you ask.
Layer 2 — memory and awareness
Tools without durable memory reset to naive every Monday. Persistent memory — scoped, governed, queryable — is what makes behavior compound over calendar time instead of only over context length.
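A minimal sketch of "scoped, governed, queryable" memory using Python's standard `sqlite3` module. The `Memory` class, the `notes` table, and the scope names are hypothetical; a real system would add retention rules and access control on top.

```python
import sqlite3

class Memory:
    def __init__(self, path: str = ":memory:"):
        # A file path would survive restarts; ":memory:" keeps the sketch self-contained.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS notes "
            "(scope TEXT, key TEXT, value TEXT, PRIMARY KEY (scope, key))"
        )

    def remember(self, scope: str, key: str, value: str) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO notes VALUES (?, ?, ?)", (scope, key, value)
        )
        self.db.commit()

    def recall(self, scope: str, key: str):
        # Reads are always filtered by scope, so one team's notes never leak into another's.
        row = self.db.execute(
            "SELECT value FROM notes WHERE scope = ? AND key = ?", (scope, key)
        ).fetchone()
        return row[0] if row else None

memory = Memory()
memory.remember("team-a", "preferred_reviewer", "dana")
```

Because the store lives outside the context window, what it holds is bounded by policy rather than by token count.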
Layer 3 — self-initiative
Schedules, monitors, anomaly triggers. “Deadline is tight and burn-down looks wrong — want a prioritized cut plan?” or “Spend pattern shifted 20% — review?” Humans stay in the loop for risky sends, but the impulse can originate inside the system.
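The spend example can be sketched as an anomaly trigger that produces a proposal instead of acting. The 20% threshold comes from the example above; `Proposal`, `check_spend`, and the relative-change rule are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    message: str
    needs_human_approval: bool

def check_spend(baseline: float, current: float, threshold: float = 0.20):
    """Return a Proposal when spend moves by at least `threshold` relative to baseline."""
    if baseline <= 0:
        return None
    change = (current - baseline) / baseline
    if abs(change) < threshold:
        return None
    # The impulse originates here, but the risky step stays gated on a human.
    return Proposal(
        f"Spend pattern shifted {change:+.0%}. Review?", needs_human_approval=True
    )

quiet = check_spend(1000.0, 1100.0)
alert = check_spend(1000.0, 1250.0)
```

A scheduler or monitor would call `check_spend` on its own cadence; no user prompt is involved until the proposal surfaces.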
Feedback loop
The three layers should reinforce each other: richer memory improves prioritization; better prioritization improves when to ping; proactive probes write new memory. That compounding loop is what makes the thing feel less like a static tool on day 100 than on day 1.
Traits: self-initiative within policy, long-horizon memory, self-regulation from outcomes, continuous existence rather than “runs when invoked.” Product analogues are thin on the ground; most “autonomous” stacks are still organs with cron — honest labeling matters.
Ecosystem — multi-organism network
One organism can be formidable. An ecosystem is many organisms across ownership boundaries, negotiating.
Company A’s supply agent talks to Company B’s procurement agent; both consult Company C’s market monitor; quotes, constraints, fulfillment — agent-to-agent, not only human-mediated SaaS clicks.
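A toy version of such a negotiation, assuming a simple alternating-concession rule that the article does not specify. All names and numbers are hypothetical; real cross-company protocols would also need identity, contracts, and dispute handling.

```python
def negotiate(seller_floor: float, seller_ask: float,
              buyer_cap: float, buyer_bid: float,
              step: float = 5.0, max_rounds: int = 20):
    """Alternating concessions between two agents owned by different companies.
    Neither side sees the other's private limit (floor or cap)."""
    for round_no in range(1, max_rounds + 1):
        if buyer_bid >= seller_ask:      # offers crossed: settle at the midpoint
            return round((buyer_bid + seller_ask) / 2, 2), round_no
        # Each side concedes one step but never past its own private limit.
        seller_ask = max(seller_floor, seller_ask - step)
        buyer_bid = min(buyer_cap, buyer_bid + step)
    return None, max_rounds              # no overlap found: escalate to humans

deal = negotiate(seller_floor=80, seller_ask=120, buyer_cap=100, buyer_bid=70)
no_deal = negotiate(seller_floor=100, seller_ask=120, buyer_cap=90, buyer_bid=70)
```

The settled price is not written down anywhere in advance; it falls out of two independently owned policies interacting.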
New phenomena at this tier:
- Emergent behavior — nobody authored the macro pattern line-by-line; it shows up from interaction (the usual caution: emergence is not magic; it still needs measurement and audit).
- Economy and incentives — cooperation needs reputation, contracts, pricing, dispute paths — the boring sociology of open multi-party automation.
- No single owner — each organism has a landlord; the mesh does not.
Today’s internet is a proto-ecosystem of APIs, but who calls whom, when, and on what terms is still mostly hardcoded. A real agent ecosystem is when those decisions are themselves delegated to agents with governance, not only to static integration tests.
Patterns across the ladder
Three things repeat every time you climb a rung:
More autonomy — human moves from operator → director → manager → supervisor → governor as the system carries more initiative and policy surface.
Longer-lived processes — atoms last milliseconds to seconds; organisms run for weeks; ecosystems outlive any single component.
More emergence — low tiers are predictable from the spec; high tiers produce interaction effects you cannot fully read off each part in isolation.
Practical implications
Climb in order. Cells need reliable atoms. Organs need trustworthy cells. Organisms need organs that do not lie in their logs.
Each jump needs new foundations. Molecule → cell needs real decision policy and evaluation. Cell → organ needs coordination contracts and shared metrics. Organ → organism needs nervous-system judgment, durable memory, and governed initiative — not only more agents.
Sweet spot (2026) — organism-shaped behavior with humans as supervisors on high-impact actions is the realistic frontier for most teams: continuous sensing, compounding memory, narrow autonomy, hard gates on spend and customer-facing changes.
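The "hard gates" part can be as small as one function. This is a sketch under assumptions: `Action`, `gate`, and the default spend limit of zero are hypothetical, and a real policy would likely be richer.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    spend_usd: float = 0.0
    customer_facing: bool = False

def gate(action: Action, spend_limit_usd: float = 0.0) -> str:
    """Return 'auto' when the system may act alone, else 'needs_approval'."""
    if action.customer_facing or action.spend_usd > spend_limit_usd:
        return "needs_approval"
    return "auto"

internal = gate(Action("update internal dashboard"))
outbound = gate(Action("email customers", customer_facing=True))
purchase = gate(Action("buy ads", spend_usd=50.0))
```

Keeping the gate outside the agent's own reasoning means autonomy can widen by changing one limit rather than by rewriting prompts.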
Do not level up without a reason. If a pipeline solves it, skip the multi-agent theater. If an organ solves it, do not bolt on fake “always-on” complexity. Complexity without a job is just cost.
The emergence axis is not about building the most intricate system. It is about naming the level where your problem actually lives — then building there, not one floor above for vanity.
Further reading
- ReAct — reasoning and acting in language models
- Blackboard systems (Wikipedia)
- LangGraph
- CrewAI
- AutoGen / AG2
- DeepMind — levels of AGI (a different, benchmark-oriented ladder — useful contrast)
- Capability-axis levels on this site