Researchers from Zhejiang University have challenged claims about the Centaur AI model, arguing that it memorizes patterns rather than truly understanding tasks. Their findings, published in National Science Open, point to limits in the model's comprehension of task instructions. The work critiques a July 2025 Nature study that hailed Centaur's performance across 160 cognitive tasks.
Psychologists have long debated whether the human mind can be explained by a unified theory or whether functions such as memory and attention must be studied separately. A July 2025 Nature study introduced Centaur, an AI model built on large language models and refined with data from psychological experiments. It reportedly excelled at 160 tasks spanning decision-making and executive control, fueling interest in AI that mimics human cognition.

The new critique, led by researchers Wei Liu and Nai Ding and described in materials from Science China Press and the journal National Science Open (DOI: 10.1360/nso/20250053), attributes Centaur's performance to overfitting: the model recognizes patterns from its training data rather than grasping what a task actually asks. To test this, the authors altered the prompts, for example replacing the task description with the explicit instruction 'Please choose option A.' Centaur ignored the change and kept picking the originally 'correct' answers, suggesting it relies on statistical regularities rather than genuine comprehension. The authors likened this to a student who memorizes test formats without understanding the content.

The finding underscores the difficulty of evaluating large language models, whose black-box processes can also produce hallucinations. Genuine language understanding remains a key hurdle for AI systems that aim to model human cognition.
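The prompt-manipulation test described above can be thought of as a simple consistency check: if a model truly reads the instruction, replacing the task description with an explicit command should change its answer. The sketch below is purely illustrative and is not the authors' code; the `query_model` function is a hypothetical stand-in for whatever interface serves the model under evaluation.

```python
# Illustrative sketch of a prompt-replacement consistency check.
# `query_model` is a hypothetical placeholder, not a real Centaur API.

def query_model(prompt: str) -> str:
    """Hypothetical call to the model under test; returns its chosen option label."""
    raise NotImplementedError("Replace with the actual model interface.")

def instruction_override_check(original_prompt: str, forced_option: str = "A") -> dict:
    """Compare the model's answer to the original task prompt with its answer
    after the task description is replaced by an explicit instruction."""
    overridden_prompt = f"Please choose option {forced_option}."
    original_answer = query_model(original_prompt)
    overridden_answer = query_model(overridden_prompt)
    return {
        "original_answer": original_answer,
        "overridden_answer": overridden_answer,
        # A model that reads the instruction should now pick the forced option;
        # repeating its original answer suggests pattern matching instead.
        "follows_instruction": overridden_answer.strip().upper() == forced_option,
    }
```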