Emergent capabilities arising solely from scaling are largely a myth. Scaling does extend capacity, as with "one-shot" or "few-shot" learning, but this is primarily due to a larger base of patterns to match against, inductive mechanisms (not deductive reasoning), and combinatorial effects. Currently, LLM builders are compensating for significant shortcomings with human intervention (hordes of poorly paid science students and some PhDs) to create the illusion of progress (patching AI). This is not a viable path toward Artificial General Intelligence (AGI).
Historically speaking, progress is slow, but it adds up.
There are likely many mechanisms still to understand and much modeling still to do. Each time we make an advance, it becomes clearer what problems remain and what to do next.
AI agents calling third-party logic looks like a very promising direction. As much as possible, AI should be glue connecting components that have already been shown to work well.
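To make the "AI as glue" idea concrete, here is a minimal sketch; the tool names and the routing stub are hypothetical, not any particular framework's API. The model's only job (simulated here by a trivial keyword router) is to decide which proven, deterministic tool handles a request; the tool itself does the real work.

```python
# Minimal sketch of an agent as glue: the routing step is where an LLM
# would sit; the registered tools are deterministic code already shown
# to work. All names here are hypothetical illustration.

from typing import Callable

# Registry of components that have already been shown to work well.
TOOLS: dict[str, Callable[[str], str]] = {
    "reverse": lambda text: text[::-1],
    "upper": lambda text: text.upper(),
}

def route(request: str) -> tuple[str, str]:
    """Stand-in for an LLM routing decision: map a request to (tool, argument).

    A real agent would ask the model to emit this pair (e.g. as JSON);
    here simple string parsing plays that role.
    """
    name, _, arg = request.partition(":")
    return name.strip(), arg.strip()

def run_agent(request: str) -> str:
    tool_name, argument = route(request)
    tool = TOOLS[tool_name]   # delegate to verified third-party logic
    return tool(argument)     # the AI never re-implements the logic itself

print(run_agent("reverse: hello"))  # -> "olleh"
print(run_agent("upper: hello"))    # -> "HELLO"
```

The design point is that correctness lives in the tools, which can be tested independently; the model is confined to the one decision it is good at, picking the right tool.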
AGI is not a faster horse.
Really appreciated this breakdown of the shifting narratives around model scaling—it highlights how much of AI forecasting is shaped by commercial incentives rather than pure technical reality. The discussion on inference scaling was especially interesting, as it underscores how improvements may now come from efficiency and application rather than just brute-force scaling.
It also makes me wonder: with model scaling hitting practical limits, are we about to see a shift where RAG, multi-modal learning, and domain-specific reasoning take center stage? Instead of ever-larger models, will the next breakthroughs come from better integration with structured data and real-world applications?
Are we entering a phase where business strategy and applied AI matter more than raw research breakthroughs?
Incredible article - very level-headed and thorough. A joy to read. Thanks for sharing.
Nakasone ...
"Another potential reason to give more weight to insiders is their technical expertise. We don’t think this is a strong reason: there is just as much AI expertise in academia as in industry."
What's the basis for this claim?
The appalling lack of accurate predictions by techno-nerds within the industry probably plays a part.
The AIME graph conclusively demonstrates the plateauing of capability with compute.
Replot the graph with a traditional linear x-axis and you will see that capability has plateaued: each additional order of magnitude of compute yields much, much smaller improvements.
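To illustrate the replotting point, here is a minimal sketch with made-up numbers (not the actual AIME data): a score that grows roughly linearly in log(compute) looks like steady progress on a log axis but visibly flattens on a linear one.

```python
# Illustrative only: synthetic data showing how a log x-axis can hide a
# plateau. Score is modeled as linear in log10(compute), so on a linear
# axis each extra order of magnitude of compute buys visibly less.

import matplotlib.pyplot as plt
import numpy as np

compute = np.logspace(0, 5, 50)        # 1x to 100,000x compute (hypothetical)
score = 20 + 12 * np.log10(compute)    # toy "benchmark score" curve

fig, (ax_log, ax_lin) = plt.subplots(1, 2, figsize=(9, 3.5))
ax_log.plot(compute, score)
ax_log.set_xscale("log")               # usual presentation: looks like steady gains
ax_log.set_title("log x-axis")
ax_lin.plot(compute, score)            # same data on a linear axis: flattens out
ax_lin.set_title("linear x-axis")
for ax in (ax_log, ax_lin):
    ax.set_xlabel("compute")
    ax.set_ylabel("score")
plt.tight_layout()
plt.show()
```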