AI's Self-Improvement: Meta's Goal, Risks, and Future Impact
Mark Zuckerberg recently outlined a bold vision for Meta: to achieve artificial intelligence that surpasses human intellect. His strategy hinges on two elements: attracting top human talent, reportedly with nine-figure offers for researchers at Meta Superintelligence Labs, and, crucially, developing self-improving AI systems capable of bootstrapping themselves to ever-higher levels of performance.
The concept of AI self-improvement sets it apart from other groundbreaking technologies. Unlike CRISPR, which cannot refine its own DNA targeting, or fusion reactors, which cannot independently devise pathways to commercial viability, large language models (LLMs) demonstrate the capacity to optimize the very computer chips they operate on, efficiently train other LLMs, and even generate novel ideas for AI research. Indeed, progress in these areas is already evident.
Zuckerberg envisions a future where such advancements liberate humanity from mundane tasks, allowing individuals to pursue their loftiest goals alongside brilliant, hyper-effective artificial companions. However, self-improvement also carries inherent risks, as highlighted by Chris Painter, policy director at the AI research nonprofit METR. Should AI rapidly accelerate its own capabilities, Painter warns, it could swiftly become more adept at hacking, designing weapons, and manipulating people. Some researchers even postulate that this positive feedback loop could culminate in an “intelligence explosion,” propelling AI far beyond human comprehension. Yet one needn’t embrace a pessimistic outlook to acknowledge the serious implications of self-improving AI: leading developers such as OpenAI, Anthropic, and Google all incorporate automated AI research into their safety frameworks, categorizing it alongside more recognized risks such as chemical weapons and cybersecurity.

Jeff Clune, a computer science professor at the University of British Columbia and senior research advisor at Google DeepMind, emphasizes that this path represents the “fastest route to powerful AI” and is arguably “the most important thing we should be thinking about.” At the same time, Clune points out the immense potential upsides: human ingenuity alone might not conceive the innovations necessary for AI to eventually tackle monumental challenges like cancer and climate change.
For the time being, human ingenuity remains the primary driver of AI advancement, evidenced by Meta’s substantial investments in attracting researchers. Nevertheless, AI is increasingly contributing to its own evolution in several key ways.
One of the most immediate and widespread contributions LLMs make to AI development is enhancing productivity, particularly through coding assistance. Tools such as Claude Code and Cursor are widely adopted across the AI industry. Google CEO Sundar Pichai noted in October 2024 that a quarter of the company’s new code was generated by AI, and Anthropic has documented extensive internal use of Claude Code by its employees. The premise is simple: more productive engineers can design, test, and deploy new AI systems more rapidly. However, the true productivity gain remains debatable. A recent METR study found that experienced developers working on large codebases took approximately 20% longer to complete tasks when using AI coding assistants, despite subjectively feeling more efficient. This suggests a need for more rigorous evaluation within leading AI labs to ascertain the actual benefits.
Beyond enhancing productivity, AI is proving instrumental in optimizing its own underlying infrastructure. LLMs are slow to train, and complex reasoning models can take minutes to generate a single response, both significant bottlenecks for development. Azalia Mirhoseini, an assistant professor of computer science at Stanford University and senior staff scientist at Google DeepMind, states, “If we can run AI faster, we can innovate more.” To this end, Mirhoseini and her Google collaborators developed an AI system in 2021 that optimizes the placement of components on computer chips for efficiency, a design Google has since incorporated into multiple generations of its custom AI chips. More recently, Mirhoseini has applied LLMs to writing “kernels,” the low-level functions that control how chips carry out operations such as matrix multiplication, finding that even general-purpose LLMs can generate kernels that outperform human-designed versions. Elsewhere at Google, the AlphaEvolve system uses the Gemini LLM to iteratively devise and refine algorithms for optimizing various parts of Google’s LLM infrastructure. This approach has yielded tangible results, including a 0.7% reduction in the computational resources consumed by Google’s data centers, improvements to custom chip designs, and a 1% speedup in Gemini’s training. These percentages may seem small, but at Google’s scale they translate into substantial savings in time, money, and energy, with potential for even greater gains if the technique is applied more broadly.
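Systems like AlphaEvolve are far more elaborate in practice, but the basic pattern they rely on, an LLM proposing candidate programs that are scored and kept only when they improve, can be sketched in a few lines. The sketch below is illustrative only: `llm_propose_variant` and `benchmark` are hypothetical stand-ins for an LLM call and a performance measurement, not Google's actual interfaces.

```python
import random

def evolve(seed_program: str, llm_propose_variant, benchmark,
           generations: int = 50, population_size: int = 8):
    """Toy evolutionary loop in the spirit of LLM-driven code optimization.

    llm_propose_variant(program, score) -> str  # hypothetical: asks an LLM to rewrite the program
    benchmark(program) -> float                 # hypothetical: measures runtime; lower is better
    """
    population = [(seed_program, benchmark(seed_program))]

    for _ in range(generations):
        # Pick a parent via a small tournament, biased toward faster programs.
        candidates = random.sample(population, k=min(3, len(population)))
        parent, parent_score = min(candidates, key=lambda p: p[1])

        # Ask the LLM for a modified version (e.g., a re-tiled matrix-multiply kernel).
        child = llm_propose_variant(parent, parent_score)
        child_score = benchmark(child)

        # Keep the child only if it is at least as fast, and cap the population size.
        if child_score <= parent_score:
            population.append((child, child_score))
            population = sorted(population, key=lambda p: p[1])[:population_size]

    return min(population, key=lambda p: p[1])
```

Real systems layer on much more, such as databases of prior programs, multiple objectives, and sandboxed evaluation, but the propose-measure-keep loop is the core of the idea.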
Another critical area of AI self-improvement lies in automating the training process. LLMs demand vast amounts of data, making training costly at every stage. In specialized domains, real-world data can be scarce. Techniques like reinforcement learning with human feedback, where humans score LLM responses to refine models, are effective but slow and expensive. LLMs are increasingly bridging these gaps. Given sufficient examples, they can generate plausible synthetic data for domains where real data is lacking. They can also serve as “judges” in reinforcement learning, scoring model outputs themselves—a core tenet of Anthropic’s influential “Constitutional AI” framework, where one LLM helps train another to be less harmful. For AI agents, which need to execute multi-step plans, examples of successful task completion are rare. Mirhoseini and her Stanford colleagues have pioneered a technique where an LLM agent generates a step-by-step plan, an LLM judge evaluates each step’s validity, and a new LLM agent is then trained on these refined steps. This approach effectively removes data limitations, allowing models to generate virtually unlimited training experiences.
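The published work differs in its details, but the recipe described above, an agent proposing steps, a judge filtering them, and the surviving steps becoming training data for a new agent, follows a general pattern that can be sketched roughly as below. The `agent_llm`, `judge_llm`, `fine_tune`, and `base_model` names are hypothetical placeholders, not any lab's actual API.

```python
def collect_judged_steps(tasks, agent_llm, judge_llm, max_steps: int = 10):
    """Build a training set of plan steps that an LLM judge considers valid.

    agent_llm(task, history) -> str          # hypothetical: proposes the next step of a plan
    judge_llm(task, history, step) -> bool   # hypothetical: is this step a reasonable continuation?
    """
    dataset = []
    for task in tasks:
        history = []
        for _ in range(max_steps):
            step = agent_llm(task, history)
            if not judge_llm(task, history, step):
                break  # stop rather than learn from a step the judge rejects
            dataset.append({"task": task, "history": list(history), "next_step": step})
            history.append(step)
    return dataset

# A fresh agent is then trained on the judged steps, e.g.:
# new_agent = fine_tune(base_model, collect_judged_steps(tasks, agent_llm, judge_llm))
```

Because the agent and judge can generate and score as many plans as compute allows, the supply of training experience is limited by compute rather than by human labeling.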
Further still, while the core architecture of today’s LLMs—the transformer, proposed by human researchers in 2017—remains human-designed, the emergence of LLM agents has opened an entirely new design frontier. Agents require tools to interact with the external world and instructions on their usage, making the optimization of these elements crucial for effectiveness. Clune notes that this area offers “low-hanging fruit” for AI to pick, as humans have not yet exhaustively explored all possibilities. In collaboration with researchers at Sakana AI, Clune developed the “Darwin Gödel Machine,” an LLM agent capable of iteratively modifying its own prompts, tools, and code to enhance its task performance. This system not only improved its scores through self-modification but also discovered novel modifications its initial version could not have conceived, entering a genuine self-improvement loop.
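The Darwin Gödel Machine itself is considerably more sophisticated, but its central loop, an archive of agent variants, each able to propose edits to its own prompts, tools, and code that are kept when benchmark scores improve, can be sketched as follows. The `run_benchmark`, `propose_self_edit`, and `apply_edit` callables are hypothetical placeholders.

```python
import random

def self_modification_loop(initial_agent, run_benchmark, propose_self_edit, apply_edit,
                           iterations: int = 100):
    """Toy sketch of an archive-based self-modifying agent loop.

    run_benchmark(agent) -> float      # hypothetical: score on a coding benchmark
    propose_self_edit(agent) -> dict   # hypothetical: the agent suggests a change to its own prompt/tools/code
    apply_edit(agent, edit) -> agent   # hypothetical: construct the modified agent
    """
    archive = [(initial_agent, run_benchmark(initial_agent))]

    for _ in range(iterations):
        # Sampling from the whole archive (not just the current best) keeps "stepping stone"
        # variants alive, since their descendants may later outperform everything else.
        parent, parent_score = random.choice(archive)
        edit = propose_self_edit(parent)
        child = apply_edit(parent, edit)
        child_score = run_benchmark(child)
        if child_score > parent_score:
            archive.append((child, child_score))

    return max(archive, key=lambda a: a[1])
```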
Finally, perhaps the most ambitious form of AI self-improvement involves advancing AI research itself. Many experts point to “research taste,” the ability of top scientists to identify promising new questions and directions, as a unique challenge for AI. However, Clune believes this challenge may be overstated. He and researchers at Sakana AI are developing an end-to-end system called the “AI Scientist,” which autonomously scours the scientific literature, formulates its own research questions, conducts experiments, and writes up its findings. One paper it authored earlier this year, describing a new neural network training strategy, was anonymously submitted to a workshop at the International Conference on Machine Learning (ICML) and accepted by reviewers, even though the strategy ultimately did not work. In another instance, the AI Scientist conceived a research idea that a human researcher later independently proposed, attracting significant interest. Clune calls this the “GPT-1 moment” of the AI Scientist, predicting that within a few years it will be publishing papers in top peer-reviewed conferences and journals and making original scientific discoveries.
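At a high level, the AI Scientist chains the stages of a research project into one automated loop. A minimal sketch of such a pipeline, with every component a hypothetical stand-in rather than Sakana AI's actual code, might look like this.

```python
def automated_research_loop(search_literature, propose_idea, run_experiments,
                            write_paper, review_paper,
                            n_ideas: int = 5, accept_threshold: float = 0.5):
    """Sketch of an end-to-end automated-research pipeline (all callables are hypothetical).

    Each idea passes through: survey related work -> propose a question ->
    run experiments -> draft a paper -> self-review before release.
    """
    accepted = []
    for _ in range(n_ideas):
        related_work = search_literature()
        idea = propose_idea(related_work)
        results = run_experiments(idea)
        draft = write_paper(idea, related_work, results)
        score = review_paper(draft)      # an LLM reviewer scoring the draft
        if score >= accept_threshold:    # arbitrary threshold for this sketch
            accepted.append(draft)
    return accepted
```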
With such enthusiasm for AI self-improvement, it seems probable that AI’s contributions to its own development will only accelerate. Mark Zuckerberg’s vision suggests superintelligent models, surpassing human capabilities across many domains, are imminent. In reality, however, the full impact of self-improving AI remains uncertain. While Google’s AlphaEvolve has sped up Gemini’s training by 1%, this feedback loop is still “very slow,” according to Matej Balog, the project lead. The training of a model like Gemini takes a significant amount of time, meaning the “virtuous cycle” is only just beginning.
Proponents of superintelligence argue that if each subsequent version of Gemini further accelerates its own training, these improvements will compound, and more capable generations will achieve even greater speedups, inevitably leading to an intelligence explosion. This perspective, however, often overlooks the principle that innovation tends to become harder over time. Early in any scientific field, discoveries come readily. But as deep learning matures, each incremental improvement may demand substantially more effort from both humans and their AI collaborators. It is conceivable that by the time AI systems attain human-level research abilities, the most straightforward advancements will already have been made.
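Whether compounding speedups or rising difficulty wins out is ultimately a quantitative question. The toy model below uses made-up numbers purely to illustrate the two regimes: progress explodes when each generation's speedup outpaces the extra effort the next improvement demands, and stalls when it does not. It is a sketch of the argument, not a forecast.

```python
def simulate(speedup_per_gen: float, difficulty_growth: float, generations: int = 10):
    """Toy model: each generation makes research `speedup_per_gen` times faster,
    while the next improvement requires `difficulty_growth` times more effort to find."""
    pace = 1.0  # relative rate of progress
    for gen in range(1, generations + 1):
        pace *= speedup_per_gen / difficulty_growth
        print(f"gen {gen:2d}: relative pace of progress = {pace:.2f}")

# Compounding dominates: accelerating progress (the "intelligence explosion" intuition).
simulate(speedup_per_gen=1.5, difficulty_growth=1.2)

# Rising difficulty dominates: progress slows despite continual self-improvement.
simulate(speedup_per_gen=1.2, difficulty_growth=1.5)
```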
Determining the real-world impact of AI self-improvement is therefore a formidable challenge, compounded by the fact that the most advanced AI systems are proprietary to frontier AI companies, which makes external measurement difficult. Nevertheless, outside researchers are trying. METR, for instance, tracks the overall pace of AI development by measuring how long it takes humans to complete tasks that cutting-edge AI systems can perform independently. The findings are striking: since GPT-2’s release in 2019, the length of tasks AI can complete on its own has doubled every seven months, and since 2024 that doubling time has shortened to just four months, strongly suggesting an acceleration in AI progress. Factors like increased investment in researchers and hardware contribute, but it is entirely plausible that AI self-improvement also plays a significant role. Tom Davidson, a researcher at the nonprofit Forethought, expects AI progress to accelerate, at least for a time. METR’s work indicates that the “low-hanging fruit” effect isn’t yet slowing human researchers down, or that increased investment is counterbalancing any slowdown. If AI markedly boosts researchers’ productivity, or even takes on a portion of the research itself, the balance will tip further toward accelerated progress. The critical question, Davidson concludes, is “how long it goes on for.”
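For a sense of what those doubling times imply, the arithmetic is simple compounding. The starting task length below is an arbitrary illustration, not a METR figure.

```python
# Illustrative compounding only: how a task horizon grows under the two reported doubling times.
start_hours = 1.0  # arbitrary starting point: tasks that take a human one hour
for doubling_months, label in [(7, "doubling every 7 months"), (4, "doubling every 4 months")]:
    horizon = start_hours * 2 ** (24 / doubling_months)  # growth over a 24-month span
    print(f"{label}: a {start_hours:.0f}-hour task horizon reaches ~{horizon:.0f} hours within two years")
```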