Study finds Google's AI Overviews wrong in 10% of cases

A New York Times analysis shows Google's AI Overviews, powered by Gemini, answering correctly only 90% to 91% of questions in a standard benchmark. This translates to tens of millions of incorrect responses daily across searches. Google disputes the test's relevance.

The New York Times, working with startup Oumi, tested AI Overviews using SimpleQA, a benchmark of over 4,000 questions released by OpenAI in 2024. Initial tests with Gemini 2.5 showed 85% accuracy, improving to 91% after the Gemini 3 update. Extrapolated to Google's search volume, this means tens of millions of wrong answers generated each day, or millions per hour as highlighted in reports on the findings.

Related Articles

Illustration of a smartphone screen featuring Google's AI Overviews upgraded to Gemini 3 with conversational chat interface.
Image generated by AI

Google upgrades AI overviews to Gemini 3 model

Reported by AI Image generated by AI

Google has announced upgrades to its AI Overviews in Search, now powered by the Gemini 3 model as the default. The update allows users to ask follow-up questions through a chat interface that leads into AI Mode conversations. This rollout aims to make searches more conversational and accurate globally on mobile devices.

In a comparative evaluation of leading AI models, Google's Gemini 3.2 Fast demonstrated strengths in factual accuracy over OpenAI's ChatGPT 5.2, particularly in informational tasks. The tests, prompted by Apple's partnership with Google to enhance Siri, highlight evolving capabilities in generative AI since 2023. While results were close, Gemini avoided significant errors that undermined ChatGPT's reliability.

Reported by AI

Google has released Gemini 3.1 Pro, an updated version of its flagship AI model, emphasizing improvements in problem-solving and reasoning. The model is available in preview for developers and consumers starting today. It builds on the Gemini 3 release from November.

Google has launched an experimental 'Personal Intelligence' feature for its AI Mode in Search, allowing users to connect Gmail and Google Photos for more tailored responses. The opt-in tool, powered by Gemini 3, aims to make search results more relevant by drawing on personal data without training models on full inboxes. It rolls out first to paid subscribers in the US.

Reported by AI

Google is overhauling its Workspace apps by integrating deeper Gemini AI capabilities to assist in document creation and editing. The updates allow Gemini to pull context from emails, files, and other sources to generate drafts and refine content. These features aim to streamline workflows for users across Docs, Sheets, Slides, and Drive.

Apple has selected Google's Gemini AI models to enhance its Siri virtual assistant in a forthcoming update. The decision, announced in a joint statement, marks a shift from previous integrations with OpenAI's ChatGPT. This multi-year partnership aims to deliver more capable AI experiences while upholding Apple's privacy standards.

Reported by AI

Google has announced that its experimental AI prototype, Genie 3, is now available to subscribers of its highest-tier AI plan. The tool allows users to generate and navigate interactive 3D worlds using simple text prompts. Previously limited to trusted testers, this expansion marks a step toward broader access for the 18-and-older audience.

 

 

 

This website uses cookies

We use cookies for analytics to improve our site. Read our privacy policy for more information.
Decline