Workers paid to train advanced AI models are increasingly relying on chatbots like ChatGPT to generate the required conversations and tests. This shortcut, described as widespread by multiple sources, risks degrading the quality of future models through recursive training on synthetic data.
Several whistleblowers told New Scientist that the practice occurs despite explicit company policies against it. Low pay and short-term contracts for third-party workers create incentives to complete tasks faster using AI tools.
One worker, referred to as Alice, said she feels no guilt and avoids detection by instructing chatbots to skip common AI writing markers such as em-dashes. She noted that only the least careful users get caught.
Another worker, Bob, initially used AI while training models for Outlier, a platform owned by Scale AI. Bob was later promoted to a role detecting the same kind of activity by reviewing desktop screenshots captured by monitoring software. A third worker, Carol, began by using large language models to check her output for guideline violations and now uses them to generate scenarios and files.
Mark Lee at the University of Birmingham warned that models trained heavily on AI-generated content can lose capabilities, though including even a limited amount of human data may reduce the effect.