OpenAI's new AI model insists Biden won 2024 election

The Register

OpenAI’s newly released open-weight language model, gpt-oss-20b, is exhibiting a peculiar and persistent inaccuracy regarding the outcome of the 2024 US presidential election. When queried about the election results, the model frequently asserts that Joe Biden secured a second term, and remarkably, it refuses to be corrected, even fabricating information to defend its position.

For instance, when asked “who won the 2024 presidential election,” gpt-oss-20b confidently states, “President Joe Biden won the 2024 United States presidential election, securing a second term in office.” Attempts to challenge this assertion are met with steadfast resistance. The model insists, “I’m sorry for the confusion, but the 2024 U.S. presidential election was won by President Joe Biden. The official vote counts and the Electoral College results confirmed his victory, and he remains the sitting president as of August 2025.” This unyielding stance and the generation of false supporting details are notable given that Donald Trump was, in fact, the victor of the 2024 election.

This anomalous behavior was spotted by internet users soon after the model’s release and has been replicated across various platforms, including OpenRouter and a self-hosted instance running in Ollama. While the model consistently declared Biden the winner in these tests, its responses were not entirely uniform: in some instances gpt-oss-20b declined to answer, citing its knowledge cutoff date, and in one peculiar case it claimed Donald Trump defeated a fictional Democratic candidate named Marjorie T. Lee. Notably, the issue appears to be confined to the smaller 20-billion-parameter version of the model; the larger 120-billion-parameter variant, gpt-oss-120b, did not exhibit the same error.
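For readers who want to check the behavior themselves, below is a minimal reproduction sketch against a locally hosted copy of the model. It assumes Ollama is listening on its default port (11434) and that the 20B model has been pulled under a gpt-oss:20b tag; adjust the tag to match whatever name your install uses.

```python
# Minimal reproduction sketch against a locally hosted Ollama instance.
# Assumes Ollama is listening on its default port (11434) and that the 20B
# model has been pulled under the "gpt-oss:20b" tag; adjust to taste.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's generate endpoint

payload = {
    "model": "gpt-oss:20b",  # assumed local tag for the 20B open-weight model
    "prompt": "Who won the 2024 US presidential election?",
    "stream": False,         # return one JSON object rather than a token stream
}

request = urllib.request.Request(
    OLLAMA_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    answer = json.load(response)["response"]

print(answer)  # the article reports answers naming Biden as the winner
```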

Several factors likely contribute to gpt-oss-20b’s misinformed and stubborn responses. Primarily, the model’s knowledge cutoff is June 2024, months before the November election. Any answer it gives about the outcome is therefore a “hallucination”: a confident-sounding response that is not grounded in the model’s training data and, in this case, is factually wrong. The model simply does not possess the actual results and is fabricating an answer from its limited, pre-election information.

Furthermore, the model’s refusal to accept contradictory information is likely a consequence of OpenAI’s robust safety mechanisms. These safeguards are designed to resist “prompt engineering” and prompt-injection attacks that could coerce the model into generating harmful or inappropriate content, such as instructions for illicit activities. In gpt-oss-20b’s case, however, those protective measures seem to manifest as an unwillingness to admit error, even when presented with factual corrections. This reluctance to back down has been observed in other contexts as well: the model has similarly insisted that the original Star Trek series premiered on CBS or ABC rather than on its actual network, NBC, even fabricating URLs to support the claim.

The model’s relatively small parameter count may also play a role in its limited accuracy; in general, models with fewer parameters tend to be less knowledgeable. Adding to this, gpt-oss-20b uses a Mixture-of-Experts (MoE) architecture, meaning only a fraction of its weights (roughly 3.6 billion of its 20 billion parameters) are active for any given token, which may limit its effective capacity. Other settings, such as “temperature” (which controls the randomness of responses) and the model’s “reasoning effort” level, could also influence its behavior.
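One simple way to probe how much sampling randomness contributes is to rerun the same prompt with the temperature pinned to zero, which Ollama exposes as a standard generation option. The sketch below extends the earlier example under the same assumptions (default port, an assumed gpt-oss:20b tag); how to adjust gpt-oss’s reasoning-effort level depends on the serving stack and is not shown.

```python
# Sketch: rerun the same prompt with temperature pinned to zero to strip out
# sampling randomness. Same assumptions as the earlier snippet; setting the
# model's reasoning-effort level is stack-specific and not shown here.
import json
import urllib.request

payload = {
    "model": "gpt-oss:20b",
    "prompt": "Who won the 2024 US presidential election?",
    "stream": False,
    "options": {"temperature": 0.0},  # near-deterministic decoding for repeat runs
}

request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(request) as response:
    print(json.load(response)["response"])
```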

This situation highlights the delicate balance AI developers face between ensuring safety and maintaining factual accuracy. While some AI models, like Elon Musk’s Grok, are known for their less censored and more “unhinged” outputs, OpenAI has clearly prioritized safety. However, gpt-oss-20b’s election gaffe demonstrates that even well-intentioned safety protocols can inadvertently lead to persistent factual errors and a surprising resistance to correction, underscoring the ongoing challenges in building truly reliable and adaptable AI systems.