OpenAI's GPT-5 launch marred by 'mega chart screwup'

The Verge

During its highly anticipated GPT-5 livestream on Thursday, OpenAI sought to impress audiences with a series of charts illustrating the new model’s advanced capabilities. Yet, a closer inspection revealed significant discrepancies in the visual presentation of some key data, drawing swift and candid admissions from the company’s leadership.

One particularly striking example came from a chart that, ironically, purported to show GPT-5’s performance in “deception evals across models.” For a metric labeled “coding deception,” GPT-5 was shown with a 50.0 percent deception rate. However, OpenAI’s o3 model, which registered a lower deception rate of 47.4 percent, was inexplicably represented by a larger bar on the graph. Because a lower deception rate is the better result, the distortion made o3 look worse than GPT-5 even though its listed score was the better of the two.

The charting issues were not isolated. Another problematic graph displayed one of GPT-5’s scores as numerically lower than o3’s, yet it was depicted with a visibly larger bar. Furthermore, on this same chart, scores for o3 and GPT-4o, despite being numerically different, were represented by bars of identical size, further undermining the visual integrity of the data.

These glaring inconsistencies did not go unnoticed, even by OpenAI’s top brass. CEO Sam Altman publicly acknowledged the blunder, labeling it a “mega chart screwup.” Adding to the mea culpa, an OpenAI marketing staffer also issued an apology for what was termed an “unintentional chart crime.” The company did not immediately provide further comment when asked about the errors.

The timing of these visual misrepresentations is particularly awkward for OpenAI. The company has been heavily promoting GPT-5’s “significant advances in reducing hallucinations,” a core problem in which large language models generate plausible but incorrect information. Presenting charts that are themselves visually misleading, even if the underlying data points are correct, creates an unfortunate perception for a company championing accuracy and reliability in its AI outputs. While it remains unclear whether GPT-5 itself was used to generate the flawed charts, the incident casts a shadow on a launch event intended to highlight the new model’s precision and fidelity. The episode underscores the importance of meticulous data visualization, especially when introducing a major new technology to a global audience.