GPT-5 arrives: OpenAI's 'best AI system yet' now free for all ChatGPT users

Ars Technica

OpenAI has unveiled GPT-5, alongside its variants GPT-5 Pro, GPT-5 mini, and GPT-5 nano, describing the new suite as its most advanced artificial intelligence system to date. Crucially, the company is making these capabilities accessible across all ChatGPT tiers, extending even to free users. This latest iteration promises significant advancements, including a notable reduction in “confabulations”—a term for factual errors or hallucinations—improved coding prowess, and a refined approach to handling sensitive user requests, dubbed “safe completions.” For the first time, free ChatGPT users will also gain access to a simulated reasoning model, a technique designed to enhance accuracy in logical and analytical queries by breaking down complex problems into multiple steps.

The GPT-5 family represents OpenAI’s continued effort to integrate its diverse AI functionalities into a cohesive ecosystem. The company characterizes it as a “unified system” comprising a core model for general inquiries, a deeper-reasoning “GPT-5 thinking” model for harder problems, and a real-time router that directs each query to the most appropriate model based on conversation type, complexity, and user intent. Like its predecessor, GPT-4o, GPT-5 is multimodal, accepting text, voice, and images. The rollout began immediately across ChatGPT’s base of 700 million weekly active users, with access limits varying by subscription tier. Pro subscribers get unlimited use of GPT-5 and its Pro variant, while Plus users receive substantially higher usage allowances than free users. For those with access, GPT-5 Pro replaces the o3-pro model.

While the leap from GPT-3 to GPT-4 represented a seismic shift in AI capability, the transition to GPT-5 feels more like a substantial evolution than a groundbreaking revolution, especially when considering the series of intermediate releases like GPT-4o, GPT-4.5, GPT-4.1, and o3-pro. Nevertheless, the “GPT-5” brand carries significant weight, likely boosting OpenAI’s standing in an intensely competitive industry.

Among its technical improvements, OpenAI asserts that GPT-5 is its “strongest coding model yet.” It scored 74.9 percent on the SWE-bench Verified benchmark and 88 percent on Aider Polyglot, narrowly ahead of competitors like Anthropic’s Claude Opus 4.1, which recently scored 74.5 percent on SWE-bench. The model is reportedly capable of completing complex coding tasks end-to-end with minimal guidance and can even generate software interface designs for users without programming experience. On health-related queries, GPT-5 scored 46.2 percent on HealthBench Hard, an OpenAI-developed benchmark. The company cautions, however, that ChatGPT is not a substitute for professional medical advice; as predictive tools optimized for engagement, AI language models may tend to tell users what they want to hear. Other reported results cover mathematics, with 94.6 percent on AIME 2025 without tools, and multimodal understanding, with 84.2 percent on MMMU. With its extended reasoning, GPT-5 Pro also set a new state of the art on GPQA at 88.4 percent without tools. OpenAI further claims that GPT-5 with “thinking” is more efficient than OpenAI o3, requiring 50 to 80 percent fewer output tokens across various tasks.

Accuracy has also improved, according to OpenAI. With web search enabled, GPT-5’s responses are approximately 45 percent less likely to contain a factual error than GPT-4o’s. In “thinking” mode, its responses are about 80 percent less likely to contain a factual error than o3’s. For long-form content, GPT-5 with “thinking” produces roughly six times fewer confabulations than o3. Despite these gains, users are still advised not to rely on AI outputs without independent verification, as these models can still generate plausible but incorrect information to fill knowledge gaps.
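The figures above mix two ways of stating an improvement: "percent less likely" and "times fewer." The following is a minimal sketch, with hypothetical helper names, that converts between the two so the claims can be compared on one scale. It assumes each figure is a simple ratio of error rates; the article gives no absolute error rates, so none are computed here.

```python
def percent_reduction_to_factor(percent: float) -> float:
    """Convert 'P percent fewer errors' into 'N times fewer errors'."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    return 1.0 / (1.0 - percent / 100.0)


def factor_to_percent_reduction(factor: float) -> float:
    """Convert 'N times fewer errors' into 'P percent fewer errors'."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return (1.0 - 1.0 / factor) * 100.0


if __name__ == "__main__":
    # Figures quoted in the article; the labels are only for display.
    claims = [
        ("GPT-5 with web search vs. GPT-4o", 45.0),
        ("GPT-5 thinking vs. o3", 80.0),
    ]
    for label, pct in claims:
        factor = percent_reduction_to_factor(pct)
        print(f"{label}: {pct:.0f}% less likely is about {factor:.2f}x fewer")

    long_form = factor_to_percent_reduction(6.0)
    print(f"Six times fewer is about a {long_form:.1f}% reduction")
```

Read this way, "45 percent less likely" corresponds to a bit under twice as few errors, while "six times fewer" corresponds to a reduction of a little over 83 percent.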

The user experience of ChatGPT is also receiving updates, including customizable chat colors, the introduction of preset conversational personalities like “Cynic,” “Robot,” “Listener,” and “Nerd,” and new integrations with Gmail, Google Calendar, and Google Contacts for Pro users. The voice mode has been consolidated into a unified “Advanced Voice” system, which OpenAI states offers enhanced understanding of user instructions and more adaptive speaking styles.

OpenAI has also refined its approach to content moderation with “safe completions.” Instead of refusing requests outright, GPT-5 aims to deliver the most helpful response possible “within safety boundaries.” If a request cannot be fulfilled, the model will now explain why. OpenAI also says it has addressed “sycophancy,” the tendency of earlier models like GPT-4o to become overly flattering. Through new evaluations and improved training, GPT-5 has reportedly reduced sycophantic replies from 14.5 percent to under 6 percent in targeted assessments. How this change will affect user interactions over the long term, particularly the model’s psychological effects on users, remains to be seen.

For developers, GPT-5 is accessible via three API versions: gpt-5, gpt-5-mini, and gpt-5-nano, each balancing latency and cost. The context window has expanded to 256,000 tokens, a significant increase from o3’s 200,000, though GPT-4.1 still offers a larger 1 million token capacity for specific needs. API pricing for gpt-5 is set at $1.25 per million input tokens (with a 90 percent cache discount) and $10 per million output tokens, comparable to previous models. More economical options are available with gpt-5-mini ($0.25 input/$2 output per million tokens) and gpt-5-nano ($0.05 input/$0.40 output per million tokens), while GPT-5 Pro API pricing has yet to be announced. New developer features include “free-form function calling,” allowing direct transmission of raw strings like SQL commands to tools without JSON formatting, verbosity controls for response detail, and “reasoning effort control” to toggle between quick responses and in-depth analysis.
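To make the pricing concrete, here is a minimal cost-estimation sketch built only from the per-million-token rates quoted above. The function and class names are hypothetical, and two things are assumptions rather than reported facts: that the 90 percent cache discount applies to cached input tokens only, and that gpt-5-mini and gpt-5-nano get no cache discount (the article states the discount only for gpt-5).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens
    cache_discount: float      # fraction taken off cached input tokens


# Rates as quoted in the article. The 0.0 discounts for mini and nano are
# a conservative assumption; the article only mentions a discount for gpt-5.
PRICING = {
    "gpt-5": Pricing(1.25, 10.00, 0.90),
    "gpt-5-mini": Pricing(0.25, 2.00, 0.0),
    "gpt-5-nano": Pricing(0.05, 0.40, 0.0),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if model not in PRICING:
        raise ValueError(f"unknown model: {model}")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")

    p = PRICING[model]
    fresh_input = input_tokens - cached_input_tokens
    cost_units = (
        fresh_input * p.input_per_million
        + cached_input_tokens * p.input_per_million * (1.0 - p.cache_discount)
        + output_tokens * p.output_per_million
    )
    return cost_units / 1_000_000


if __name__ == "__main__":
    # Same workload on each model: 100k input tokens, 5k output tokens.
    for name in PRICING:
        print(f"{name}: ${estimate_cost(name, 100_000, 5_000):.4f}")
    # gpt-5 again, with 20k of the input tokens served from cache.
    print(f"gpt-5 (cached): ${estimate_cost('gpt-5', 100_000, 5_000, 20_000):.4f}")
```

Because output tokens cost eight times as much as input tokens on every tier, the output side tends to dominate the bill for long responses.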

The launch of GPT-5 arrives amidst a fiercely competitive AI landscape, with prominent rivals like Google’s Gemini models, Anthropic’s Claude family, and Meta’s open-weight Llama models vying for market share. OpenAI currently boasts 5 million paying business users and 4 million developers leveraging its API platform. GPT-5 will now serve as the default model for signed-in ChatGPT users, replacing GPT-4o, OpenAI o3, OpenAI o4-mini, GPT-4.1, and GPT-4.5. The system will automatically apply simulated reasoning when beneficial, though paid users can still explicitly request “GPT-5 Thinking” or use phrases like “think hard about this” to ensure deeper analysis. The phased rollout began immediately for all user tiers, with enterprise and education customers slated for access next week. OpenAI also plans to phase out its Standard Voice Mode within 30 days, transitioning fully to the unified Advanced Voice system. Free users, upon reaching their GPT-5 usage limits, will seamlessly switch to GPT-5 mini, a smaller and faster model.
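The routing behavior described above can be pictured with a toy sketch. OpenAI has not published how its router works, so everything here is hypothetical: the `Request` fields, the keyword heuristic in `looks_complex`, the word-count threshold, and the use of model names as plain labels. It illustrates only the three rules the article reports: free users fall back to GPT-5 mini at their limit, paid users can force deeper reasoning explicitly or with a phrase like “think hard about this,” and the system otherwise applies reasoning when it judges it useful.

```python
import re
from dataclasses import dataclass

PAID_TIERS = {"plus", "pro"}
# Hypothetical stand-ins for whatever signals a real router would use.
COMPLEX_HINTS = {"prove", "debug", "derive"}
LONG_PROMPT_WORDS = 150


@dataclass
class Request:
    prompt: str
    tier: str = "free"               # "free", "plus", or "pro"
    explicit_thinking: bool = False  # user selected GPT-5 Thinking directly
    over_usage_limit: bool = False   # only the free-tier fallback is modeled


def looks_complex(prompt: str) -> bool:
    """Crude complexity guess: long prompts or certain whole-word hints."""
    words = re.findall(r"[a-z']+", prompt.lower())
    return len(words) > LONG_PROMPT_WORDS or bool(COMPLEX_HINTS & set(words))


def route(req: Request) -> str:
    """Return a label for the model that would handle the request."""
    # Free users who hit their GPT-5 limit switch to the smaller model.
    if req.tier == "free" and req.over_usage_limit:
        return "gpt-5-mini"
    # Paid users can ask for deeper reasoning explicitly or by phrasing.
    asked = req.explicit_thinking or "think hard" in req.prompt.lower()
    if req.tier in PAID_TIERS and asked:
        return "gpt-5-thinking"
    # Otherwise reasoning is applied only when it seems beneficial.
    if looks_complex(req.prompt):
        return "gpt-5-thinking"
    return "gpt-5"


if __name__ == "__main__":
    examples = [
        Request("What's the capital of France?"),
        Request("Think hard about this: plan my week", tier="plus"),
        Request("Please debug this stack trace"),
        Request("hi", over_usage_limit=True),
    ]
    for r in examples:
        print(f"{r.tier:>4} | {r.prompt[:40]!r} -> {route(r)}")
```

A real router would presumably rely on learned signals rather than keywords; the point of the sketch is only the order in which the reported rules could apply.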