Researchers at UC San Francisco and Wayne State University found that generative AI can process complex medical datasets faster than traditional human teams, sometimes yielding stronger results. The study focused on predicting preterm birth using data from more than 1,000 pregnant women. In some cases, using AI reduced analysis time from months to minutes.
Scientists at UC San Francisco and Wayne State University conducted a real-world test of generative AI in health research, comparing its performance to human experts. The task involved predicting preterm birth, a leading cause of newborn death in the United States, where about 1,000 babies are born prematurely each day. The researchers used microbiome data compiled from approximately 1,200 pregnant women across nine studies, sourced from the March of Dimes Preterm Birth Data Repository.
To evaluate AI capabilities, the team drew on datasets from the DREAM crowdsourcing competition, in which more than 100 teams from around the world had developed machine learning models to predict preterm birth risk and estimate gestational age. Human participants in that competition took about three months to build their models, followed by nearly two years to consolidate and publish the findings.
In the new study, eight AI chatbots were given natural language prompts to generate analytical code without direct human programming. Only four of the systems produced usable code, but those that succeeded matched or exceeded the performance of human teams. For instance, a junior pair, UCSF master's student Reuben Sarwal and high school student Victor Tarca, developed prediction models with AI support, generating functional code in minutes rather than the hours or days that experienced programmers would need.
The entire process, from inception to journal submission, took just six months. "These AI tools could relieve one of the biggest bottlenecks in data science: building our analysis pipelines," said Marina Sirota, PhD, professor of Pediatrics at UCSF and principal investigator of the March of Dimes Prematurity Research Center. Co-senior author Adi L. Tarca, PhD, from Wayne State University, added, "Thanks to generative AI, researchers with a limited background in data science won't always need to form wide collaborations or spend hours debugging code. They can focus on answering the right biomedical questions."
The study, co-authored by Sirota and Tarca, was published in Cell Reports Medicine on February 17. It emphasizes that AI requires human oversight to avoid misleading results, and it points to the potential for faster progress in understanding preterm birth risk factors.