Humans Prevail Over AI at International Math Olympiad

Adrian Cole

At a prestigious international mathematics competition, humans managed to outperform generative AI models created by Google and OpenAI. However, for the first time, these AI programs achieved gold-level scores, indicating that their rapid development might prompt some reflection among human competitors.

Neither AI model achieved a perfect score, unlike five young participants at the International Mathematical Olympiad (IMO), which is open to contestants under 20 years old. Google announced on Monday that its advanced Gemini chatbot successfully solved five of the six math problems presented at this month’s IMO in Queensland, Australia.

Gregor Dolinar, president of the IMO, remarked, “We can confirm that Google DeepMind has reached the much-desired milestone, earning 35 out of a possible 42 points – a gold medal score. Their solutions were astonishing in many respects. IMO graders found them to be clear, precise, and most of them easy to follow.”

Approximately 10% of human participants claimed gold medals, with five individuals achieving perfect scores of 42 points. Meanwhile, OpenAI, the organization behind ChatGPT, stated that its experimental reasoning model also matched the gold-level score with 35 points.

In a social media update, OpenAI researcher Alexander Wei noted, “This achievement fulfills a longstanding grand challenge in AI at the world’s most prestigious math competition. We evaluated our models on the 2025 IMO problems under the same rules as human contestants. For each problem, three former IMO medalists independently graded the model’s submitted proof.”

Last year, Google earned a silver-medal score at the IMO in Bath, England, solving four of the six problems. That effort was much slower than this year's, in which the Gemini model completed the problems within the allotted 4.5-hour time limit.

The IMO acknowledged that tech companies had privately assessed closed-source AI models on the same problems faced by 641 participating students from 112 countries. “It is very exciting to see progress in the mathematical capabilities of AI models,” Dolinar said.

However, contest organizers could not confirm how much computing power the AI models used or whether humans played a role in their performance. Earlier this year, during an interview with CBS’ 60 Minutes, a leading AI researcher from Google predicted that computers with human-level cognitive abilities, referred to as ‘artificial general intelligence,’ could be developed within the next five to ten years.

Demis Hassabis, CEO of Google DeepMind, also projected that AI technology is on a trajectory toward understanding the world in nuanced ways. He expressed optimism that it would not only address significant problems but could also cultivate a sense of imagination within a decade due to increased investment in the field. “It’s moving incredibly fast,” Hassabis stated. “I think we are on some kind of exponential curve of improvement. The success of the field in the last few years has attracted even more attention, more resources, and more talent, adding to this exponential progress.”
