Google launches Gemini 2.5 Deep Think AI, but it's a 'bronze' version
Google has officially launched Gemini 2.5 Deep Think, a new iteration of its AI model engineered for enhanced reasoning and complex problem-solving. The public release follows an advanced version of the model making headlines last month for achieving a gold medal at the International Mathematical Olympiad (IMO), a first for an AI. However, the version now available to the public is not identical to that gold medal-winning model.
According to Google’s blog post and Logan Kilpatrick, Product Lead for Google AI Studio, the publicly released model is a “less powerful ‘bronze’ version.” Kilpatrick clarified on social media that this variant is “faster and more optimized for daily use,” while the full IMO gold model is being provided to a select group of mathematicians for further testing of its capabilities.
Understanding Gemini 2.5 Deep Think’s Capabilities
Built upon the Gemini family of large language models (LLMs), Deep Think introduces new capabilities for tackling sophisticated problems. It employs “parallel thinking” techniques to explore multiple ideas simultaneously and utilizes reinforcement learning to strengthen its step-by-step problem-solving ability over time.
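Google has not disclosed how "parallel thinking" is implemented. As a loose, illustrative analogy only, the pattern resembles best-of-N sampling: spawn several independent reasoning paths, score each resulting candidate, and keep the best one. The toy Python sketch below illustrates that pattern; every function name and the scoring rule are hypothetical stand-ins, not Google's actual method.

```python
import concurrent.futures
import random

def propose_solution(problem: str, seed: int) -> str:
    """Toy stand-in for one reasoning path: each path derives a
    candidate answer with some seeded variation."""
    rng = random.Random(seed)
    # Pretend each path tries a different decomposition of the problem.
    steps = rng.randint(1, 5)
    return f"candidate for {problem!r} via {steps}-step decomposition"

def score_solution(candidate: str) -> float:
    """Toy verifier: prefers candidates that used more reasoning steps.
    A real system would use a learned verifier or reward model."""
    return float(candidate.split("-step")[0].rsplit(" ", 1)[-1])

def parallel_think(problem: str, n_paths: int = 8) -> str:
    """Explore n_paths candidate lines of reasoning concurrently,
    then keep the highest-scoring one (best-of-N selection)."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        candidates = list(pool.map(lambda s: propose_solution(problem, s),
                                   range(n_paths)))
    return max(candidates, key=score_solution)

best = parallel_think("sum the first 100 integers")
```

In a production system the proposer would be the LLM itself and the scorer a learned verifier; the reinforcement learning mentioned above would, over time, tune how each individual path reasons step by step.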
The model is designed for use cases that benefit from extended deliberation, such as testing mathematical conjectures, conducting scientific research, designing algorithms, and refining creative tasks like code and design. Early testers, including mathematician Michel van Garrel, have used it to investigate unsolved problems and generate potential proofs. Ethan Mollick, a professor at the Wharton School of the University of Pennsylvania and an AI expert, noted on social media that Deep Think was the first model to successfully generate a 3D graphic in response to a complex prompt he uses to test AI capabilities, demonstrating its advanced interpretive and creative abilities.
Performance and Benchmarks
Deep Think exhibits strong performance across several key application areas:
Mathematics and Science: It can simulate reasoning for complex proofs, explore conjectures, and interpret dense scientific literature.
Coding and Algorithm Design: The model performs well on tasks involving performance tradeoffs, time complexity, and multi-step logic.
Creative Development: In design scenarios such as voxel art or user interface builds, Deep Think shows stronger iterative refinement and greater attention to detail.
The model leads in benchmark evaluations, including LiveCodeBench V6 (coding ability) and Humanity’s Last Exam (math, science, and reasoning). It outscored Gemini 2.5 Pro and competing models such as OpenAI’s GPT-4 and xAI’s Grok 4 by double-digit margins in categories including Reasoning & Knowledge, code generation, and IMO 2025 mathematics.
Deep Think vs. Gemini 2.5 Pro: A Comparison
While both Deep Think and Gemini 2.5 Pro are part of the Gemini 2.5 model family, Google positions Deep Think as a more capable and analytically skilled variant, particularly for complex reasoning and multi-step problem-solving. This improvement stems from its use of parallel thinking and reinforcement learning techniques, which enable the model to simulate deeper cognitive deliberation.
Google states that Deep Think is better at handling nuanced prompts, exploring multiple hypotheses, and producing more refined outputs. This is supported by side-by-side comparisons in tasks like voxel art generation, where Deep Think adds more texture, structural fidelity, and compositional diversity than 2.5 Pro.
Although Deep Think outperforms Gemini 2.5 Pro on multiple technical benchmarks related to reasoning and code generation, these gains come with tradeoffs. Deep Think is slower, requiring extended “thinking time,” and exhibits a higher refusal rate for benign prompts—an area Google is actively investigating. In contrast, 2.5 Pro remains better suited for users who prioritize speed and responsiveness, especially for lighter, general-purpose tasks. This differentiation allows users to choose based on their priorities: 2.5 Pro for speed and fluidity, or Deep Think for rigor and reflection.
The IMO Gold Medal Achievement
In July, a more advanced version of the Gemini Deep Think model achieved official gold-medal status at the 2025 IMO, the world’s most prestigious mathematics competition for high school students. The system solved five of six challenging problems, becoming the first AI to receive gold-level scoring from the IMO. Demis Hassabis, CEO of Google DeepMind, announced the achievement, noting that the model solved the problems end-to-end in natural language, without translation into formal programming syntax.

The IMO board confirmed the model scored 35 out of a possible 42 points, well above the gold threshold. Competition president Gregor Dolinar described Deep Think’s solutions as clear, precise, and in many cases easier to follow than those of human competitors. It bears repeating that the Gemini 2.5 Deep Think released to the public is a faster, lower-performing version, not the exact competition model.
Accessing Gemini 2.5 Deep Think
Currently, Gemini 2.5 Deep Think is available exclusively in the Google Gemini mobile app for iOS and Android to subscribers of the Google AI Ultra plan. The plan, part of the Google One subscription lineup, costs $249.99 per month, with a promotional rate of $124.99 per month for the first three months for new subscribers. AI Ultra includes 30 TB of storage, access to the Gemini app with Deep Think and Veo 3, tools such as Flow and Whisk, and 12,500 monthly AI credits.

Subscribers can activate Deep Think within the Gemini app by selecting the 2.5 Pro model and toggling the “Deep Think” option. It supports a fixed number of prompts per day and is integrated with capabilities such as code execution and Google Search, generating longer and more detailed outputs than the standard models. Neither the lower-tier Google AI Pro plan, priced at $19.99 per month, nor the free Gemini service includes access to Deep Think. In the coming weeks, Deep Think will also be made available to “trusted testers” through the Gemini application programming interface (API).
Significance for Enterprise Technical Decision-Makers
The release of Gemini 2.5 Deep Think represents the practical application of a major research milestone. While currently accessible through individual user accounts, it offers enterprises and organizations a glimpse into the capabilities of an AI model that has achieved a Math Olympiad medal. For researchers receiving the full IMO-grade model, it offers insight into the future of collaborative AI in mathematics. For AI Ultra subscribers, Deep Think provides a powerful step toward more capable and context-aware AI assistance, now running on mobile devices.