GPT-5: Incremental AI Progress Amidst Exaggerated Hype

MIT Technology Review

OpenAI’s highly anticipated GPT-5, its new flagship model, has finally arrived, and the consensus among observers is clear: while it offers a refined user experience, it falls short of being a revolutionary breakthrough. As one colleague succinctly put it, GPT-5 is “above all else, a refined product.” This assessment aligns with a broader sentiment that recent model releases across the industry increasingly resemble incremental smartphone updates, designed to enhance existing features rather than introduce entirely new paradigms. OpenAI CEO Sam Altman himself drew a parallel to Apple’s introduction of the Retina display, a comparison that, while apt for iterative improvement, raises the question: where is the transformative leap from a BlackBerry keyboard to a touch-screen iPhone? Where is the fundamental shift that unlocks entirely new applications and industries, akin to assisted GPS enabling real-time navigation and giving rise to companies like Uber?

Indeed, GPT-5’s launch was met with an unexpected user backlash. Customers, accustomed to the distinct “personality” of GPT-4o, successfully lobbied OpenAI to reinstate it as an option for Plus subscribers. This episode further underscores that the GPT-5 release prioritized user experience fine-tuning over significant performance enhancements.

Yet, despite this reality, the hype surrounding GPT-5 was immense. Hours before the announcement, Altman teased the release with an image of an emerging Death Star. On launch day, he touted its “PhD-level intelligence,” later claiming on a morning news show that it would “save a lot of lives.” While such pronouncements are often met with skepticism, Altman is far from alone in this grandstanding. Just last week, Meta CEO Mark Zuckerberg penned a lengthy memo about the imminent arrival of AI superintelligence, and earlier this year, Anthropic CEO Dario Amodei sparked widespread alarm with his prediction that AI could eliminate half of all entry-level jobs within a year. These industry leaders frequently discuss the existential risks posed by their creations, even as their advanced models still struggle with basic queries, like counting the number of ‘b’s in “blueberry.”

This is not to diminish the impressive capabilities of products from OpenAI, Anthropic, and other developers. They are undoubtedly powerful tools with considerable utility. However, the hype cycle surrounding these model releases has become excessive. As a frequent user of ChatGPT and Google Gemini, often multiple times a day, I’ve experienced their utility firsthand. Recently, my wife encountered a whale repeatedly slapping its tail on the water, a behavior she had never witnessed despite extensive experience with marine life. Curious, I turned to ChatGPT, asking, “Why do whales slap their tails repeatedly on the water?” The chatbot confidently identified the behavior as “lobtailing” and provided a list of potential reasons. While impressive, a standard Google search would have yielded similar information. More importantly, ChatGPT’s explanation, while concise, was overly definitive. In reality, while theories abound, the precise reasons behind lobtailing remain a scientific mystery.

My awareness of this mystery stems from delving deeper into traditional search results, which led me to an insightful essay by Emily Boring. She eloquently describes her observations of a humpback whale lobtailing and explores the scientific uncertainty surrounding this energy-intensive behavior. Is it for feeding, communication, or posturing? As biologist Hal Whitehead suggests, “Breaches and lob-tails make good signals precisely because they are energetically expensive and thus indicative of the importance of the message and the physical status of the signaler.” A tail-slap, in this context, becomes a powerful declaration: “Pay attention! I am important! Notice me!”

In many ways, the current AI hype cycle is a necessary byproduct of the ferocious level of investment. The untold billions of dollars in sunk costs and the massive data center buildouts, with their significant environmental consequences, all demand justification. So much capital is at stake that the industry is compelled to generate a constant stream of grand promises.

This isn’t to say that genuinely cool things aren’t happening in AI. I’ve been genuinely floored by certain AI releases, such as GPT-3.5, DALL-E, NotebookLM, Veo 3, and Synthesia. And just this week, Google DeepMind’s Genie 3, which can transform a simple text prompt into an immersive, navigable 3D world, was truly mind-blowing. Yet Genie 3 itself makes a compelling case that the most interesting advancements in AI are often occurring outside the realm of chatbots.

One might even argue that at this stage, the most consistently amazed observers of new large language model chatbot releases are often those who stand to profit most directly from their promotion. Perhaps this perspective is cynical, but I believe it is less cynical than promising a Death Star and delivering a chatbot whose primary new appeal is automatic model selection. It’s less cynical than promising superintelligence and delivering what amounts to an overhyped, often definitive, but ultimately limited tool. It’s all just a lot of lobtailing: “Pay attention! I am important! Notice me!”