Inside The Machine

Authored by Neal Lloyd · Daily AI Series

Day 23

AGI · Intelligence · The Question That Changes Everything

What Is AGI and Are We Close?
The Most Important Question in Artificial Intelligence Is Also the One With the Least Settled Answer. Here Is What We Actually Know.

Artificial General Intelligence — the hypothetical point at which AI systems can perform any intellectual task a human can perform — is either five years away, fifty years away, or already here depending on who you ask and how they define it. The definition matters more than most people realise. And the question of whether we are close is less important than the question of what we do while we are finding out.

Neal Lloyd

Author · Inside The Machine · June 2026

10 min read

“We are tantalizingly close to AGI.” Dario Amodei said this in March 2026, in the same paper in which he defined AGI as a system equivalent to a “virtual employee” able to perform most white-collar tasks at or above PhD level. Sam Altman says OpenAI is on a path to superintelligence. Yann LeCun of Meta says current AI architectures cannot produce AGI at all. All three are serious people with serious arguments. The fact that they cannot agree on whether AGI is imminent, impossible with current approaches, or already within reach is not a sign of confusion. It is a sign of how hard the question actually is.

Neal Lloyd · Inside The Machine, Day 23

In March 2026, Anthropic CEO Dario Amodei published a paper titled “The Urgency of Interpretability” in which he described the present moment as one in which we are “tantalizingly close to AGI.” He defined what he meant: a system roughly equivalent to a “virtual employee” able to complete most cognitive tasks at or above the level of a highly skilled PhD-level human worker. He also said, in the same paper, that we do not yet know how to verify whether a system has achieved this and that our inability to inspect what is actually happening inside frontier models is one of the most urgent problems in science.

Sam Altman has said repeatedly that OpenAI is on a path to superintelligence, meaning a system that exceeds human performance across all domains, and that the path is shorter than most people expect. Yann LeCun, Chief AI Scientist at Meta and one of the three recipients of the 2018 Turing Award for foundational AI work, says that current large language model architectures cannot produce AGI and that the field needs a fundamentally different approach to achieve human-level general intelligence.

This is Day 23 of Inside The Machine, and today we try to make sense of this disagreement: who is right, what they mean by the terms they are using, and what the answer tells us about where AI actually goes from here.

Section I — The Definition Problem

AGI Means Different Things to Different People. The Differences Are Not Trivial.

The AGI debate is partly a scientific disagreement and partly a definitional disagreement, and the two are entangled in ways that make the debate harder to resolve than it needs to be. Before asking whether AGI is close, it is worth being precise about what AGI means — because the answer to “are we close?” depends almost entirely on the definition.

Definition 1: Task-based AGI. A system that can perform any cognitive task that a human can perform, at human level or above. Taken one task at a time, this bar has arguably been cleared already: AI systems outperform humans at chess, Go, protein structure prediction, specific medical diagnoses, code generation, and many other individual tasks. But a single system that excels at all of these simultaneously, across the full breadth of human cognitive capability, does not yet exist.

Definition 2: Economic AGI. Dario Amodei’s formulation: a “virtual employee” capable of completing most white-collar cognitive tasks at or above PhD level. This is a practical, deployment-focused definition that asks not whether the system is philosophically general but whether it is economically substitutable for a highly skilled human worker across most knowledge work domains. By this definition, frontier models like Claude Fable 5 and GPT-5.5 are close — they can perform many such tasks — but not yet reliably general across the full range that “most tasks” implies, and their performance in novel, unpredictable situations falls short of what a skilled human can do.

Definition 3: Cognitive AGI. A system that reasons the way humans reason — with genuine understanding, causal models of the world, common sense, and the ability to transfer learning across domains in the way humans do. This is Yann LeCun’s target definition, and his argument is that current LLM architectures cannot achieve it because they lack the world models and causal reasoning capabilities that biological intelligence possesses. On this definition, we are not close — and getting close would require architectural innovation of the kind that has not yet been demonstrated at scale.

Definition 4: Superintelligence. A system that exceeds human cognitive performance across all domains, potentially by a significant margin. Sam Altman’s framing. By this definition, we are not there — no current system is at human level across all domains, let alone above it. The path to superintelligence runs through AGI as an intermediate milestone, and the timeline depends on which definition of AGI you use and how quickly the path from AGI to superintelligence can be traversed once AGI is achieved.

⚡ The AGI Definition Spectrum

Task-specific superhuman AI: Already exists. AI beats humans at chess, Go, protein folding, specific image diagnosis, specific coding tasks.

Economic AGI (Amodei): Virtual employee performing most white-collar tasks at PhD level. Current frontier models: getting close in specific domains, not yet reliably general.

Cognitive AGI (LeCun): Human-style reasoning with world models, causal understanding, cross-domain transfer. Current LLMs: missing foundational capabilities by most accounts.

Strong AGI: Full human-level capability across all domains. Not demonstrated. Timeline: contested from “5-10 years” (Altman) to “requires new architecture” (LeCun).

Superintelligence: Exceeds human capability across all domains, potentially by large margin. Not demonstrated. OpenAI considers this the explicit target of its research programme.

Section II — The Case That AGI Is Close

Why Dario, Sam, and the Scaling Hypothesis Believers Think the Finish Line Is Visible

The argument that AGI, in the economic definition, is close rests primarily on an empirical observation from the past several years: AI capabilities have improved dramatically and apparently predictably as the scale of models and training data has increased. The scaling hypothesis holds that intelligence is an emergent property of scale: given enough compute, data, and parameters, AI systems will continue to improve across all cognitive dimensions in ways that eventually produce human-level and then superhuman-level performance. The evidence for this hypothesis over the period from 2017 to 2026 is impressive. GPT-3 could barely hold a conversation. GPT-4 passed a simulated bar exam. GPT-5.5 writes production-quality code, conducts scientific literature reviews, and reasons through complex multi-step problems. Claude Fable 5, before its recall, was benchmarked as the most capable AI ever publicly released. Each generation has substantially surpassed the previous one in ways that were apparent in the benchmarks before deployment and confirmed after it.

The extrapolation from this trajectory to AGI is not unreasonable. The jump from GPT-3 to Fable 5 took approximately five years. If scaling continues to produce capability improvements at a comparable rate, the argument runs, economic AGI is within reach over the next few years of development. Dario Amodei’s “tantalizingly close” language reflects this extrapolation. His definition, a virtual employee at PhD level, is arguably achievable through continued refinement of current approaches rather than architectural revolution.

The specific capability areas where frontier models are most conspicuously approaching human performance are also the areas most relevant to economic AGI: reasoning, coding, writing, research synthesis, mathematical problem-solving, and complex analysis. These are the core competencies of knowledge work. A system that performs them reliably at PhD level, with appropriate context management and task decomposition, would represent economic AGI by the Amodei definition regardless of whether it has human-level understanding in the deeper cognitive sense.

The empirical record of AI capability improvement over the past five years is genuinely extraordinary. The question is whether the next five years of scaling produce comparable gains or whether we are approaching the limits of what current architectures can achieve through scale alone. Nobody knows the answer to this question with certainty. Dario and Sam think the gains continue. Yann thinks they plateau. The difference between those views is the difference between AGI in 2030 and AGI requiring a new paradigm entirely.

Neal Lloyd · Inside The Machine, Day 23

Section III — The Case That AGI Requires Something New

Why LeCun and the Architectural Sceptics Think Current LLMs Hit a Ceiling

Yann LeCun’s critique of current AI architectures as a path to AGI is technically substantive and deserves careful attention. His argument, developed across multiple papers and public statements, holds that LLMs are fundamentally limited by what they can learn from text alone — that language, however large the training corpus, is a thin representation of the embodied, causal, perceptual world that biological intelligence navigates. Humans learn about the world not primarily through reading about it but through interacting with it, manipulating objects, experiencing cause and effect directly, and building rich internal models of how the world works. These world models — which cognitive scientists sometimes call intuitive physics and intuitive psychology — underlie the common sense reasoning that humans apply so effortlessly and that AI systems still struggle with in unpredictable edge cases.

The specific failure modes that LeCun points to as evidence for his position are consistent with the hallucination problem covered in Day 16 and the task drift problem covered in Day 18: AI systems that are impressively capable in the centre of their training distribution and conspicuously unreliable at the edges. A human with PhD-level expertise makes mistakes, but the nature of their mistakes reflects deep structural understanding — they can usually identify where they went wrong and why. An LLM’s failures are often structurally different — confident, fluent, and disconnected from the kind of causal reasoning that would have prevented the error. LeCun’s proposed alternative — a world-model-based architecture that learns from multimodal interaction with the world rather than from text prediction — is the direction that Meta’s AI research is pursuing. Whether it produces the results he predicts is not yet demonstrated at scale.

The third position — held by a significant number of researchers who are neither as optimistic as Altman nor as sceptical as LeCun — is that current architectures will continue to produce impressive capability improvements but will reach a ceiling before economic AGI, and that the combination of LLMs with additional architectural innovations (world models, memory systems, embodied learning, multi-agent coordination) will be required to cross that ceiling. This is the “LLMs plus more” position, and it is arguably the most empirically grounded of the three major camps.

Section IV — Why the Answer Matters Now

The AGI Question Is Not Just a Technical Question. It Is a Governance Question.

If AGI is close, the governance gap is urgent. The Fable 5 recall — covered in Episodes 08, 09, 11, and the update in this episode — is partly a story about what happens when a government realises a capability threshold has been crossed faster than it anticipated. The US government’s response to Fable 5 can be read as a preview of how governments will respond to AGI: with regulation, recall authority, export controls, and equity stakes. The institutional frameworks for governing AI are already inadequate for 2026 capabilities. If economic AGI arrives within the five-year horizon that Dario and Sam are projecting, the governance gap will be crisis-level.

If AGI requires new architectures, we have more time but cannot afford complacency. The LeCun position implies that the transformative moment — the crossing of the human-level threshold — is further away than the scaling enthusiasts claim. That additional time could be used to develop the governance frameworks, accountability systems, and institutional capacity that the rapid deployment of powerful but not-yet-AGI systems has already outpaced. The history of technology regulation suggests that this time will not be used well unless there is sustained, serious institutional attention to the problem. There is not yet that kind of attention at the required scale.

The interpretability problem is the most urgent regardless of timeline. Dario Amodei’s “Urgency of Interpretability” paper applies whichever AGI timeline turns out to be correct. We do not know what is happening inside the most powerful AI systems we have built. We cannot verify whether their objectives align with ours. We cannot reliably predict their behaviour in novel situations. We cannot audit their reasoning for the specific failure modes that matter most in high-stakes deployments. These limitations apply to current systems, before AGI, and they become more urgent the closer we get to systems that are more capable, more autonomous, and more consequential. The interpretability problem is not a future risk. It is a current gap, and it needs to be closed whenever AGI arrives.

The most important thing to understand about the AGI debate is not which timeline is correct. It is that the question of when AGI arrives is less important than the question of whether we are building the governance frameworks, interpretability tools, and institutional capacity to manage it responsibly before it arrives. On that question, there is broad agreement: we are behind. The disagreement is only about how far behind.

Neal Lloyd · Inside The Machine, Day 23

— Neal Lloyd
Inside The Machine, Day 23 · June 25 2026

About The Author

Neal Lloyd

Author · Series Creator

Neal Lloyd writes about technology, human adaptation, and the uncomfortable questions nobody wants to answer at dinner. Inside The Machine is his ongoing daily series on AI.

By The Numbers

3

Major positions in the AGI debate: Altman (imminent, via scaling), Amodei (tantalizingly close, economic definition), LeCun (requires new architecture). All three are serious, well-evidenced, and incompatible with each other.

5yr

Sam Altman’s approximate timeline for AGI, and the same timeline that Dario Amodei’s “tantalizingly close” implies. If it is correct, current governance frameworks are demonstrably inadequate for it.

0

AI systems that have been independently verified to have achieved human-level cognitive performance across all domains. The definition debate exists partly because this verification problem has not been solved.

Key Concepts

Economic AGI

Dario Amodei’s definition: a system equivalent to a virtual employee capable of performing most white-collar tasks at or above PhD level. A practical, deployment-focused definition that avoids the deeper philosophical question of machine understanding.

The Scaling Hypothesis

Intelligence is an emergent property of scale: more compute, more data, and more parameters produce more capability across all cognitive domains. The empirical foundation for the AGI-is-close position. Contested by LeCun and others as insufficient for cognitive AGI.

World Models

Internal causal representations of how the world works, underlying the common sense reasoning humans apply automatically. LeCun’s argument: LLMs lack these. His proposed path: architectures that learn world models from multimodal interaction, not text prediction.

The Interpretability Gap

We cannot inspect what is happening inside frontier AI models. We cannot verify alignment. We cannot predict behaviour in novel situations. This gap is urgent at current capability levels and becomes critical as capabilities increase toward AGI.

Superintelligence

A system exceeding human cognitive performance across all domains, potentially by a large margin. OpenAI’s stated research target. Requires AGI as an intermediate step. Timeline: from “within decades” (Altman) to “requires architectural revolution first” (LeCun).
