Researchers at Duke University have developed an artificial intelligence framework that reveals straightforward rules underlying highly complex systems in nature and technology. Described in a paper published on December 17 in npj Complexity, the tool analyzes time-series data to produce compact equations that capture a system's essential behavior. This approach could bridge gaps in scientific understanding where traditional methods fall short.
The new AI, created by a team led by Boyuan Chen, director of the General Robotics Lab at Duke University, draws inspiration from dynamicists such as Isaac Newton, who wrote down equations describing how systems change over time. It takes data on how a complex system evolves and distills thousands of variables into simpler, near-linear models that remain faithful to real-world observations.
The framework builds on a theory introduced by mathematician Bernard Koopman in the 1930s, which holds that a nonlinear system can be represented by a linear model when it is described in a suitably chosen, and typically much larger, set of variables. It addresses a key practical challenge: the sheer number of equations such representations usually require. By integrating deep learning with physics-based constraints, it identifies the pivotal patterns in experimental data, resulting in models up to 10 times smaller than those produced by prior machine-learning techniques.
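Koopman's idea can be illustrated with a small sketch. The example below is not the Duke team's framework: it uses a textbook toy system and a plain least-squares fit in place of deep learning. The decay rates `A` and `B`, the hand-picked lifted variables in `lift`, and all function names are assumptions made for illustration. The two-variable system is nonlinear, but once a third variable, the square of the first, is added, its evolution is exactly linear.

```python
import numpy as np

# Hypothetical decay rates for a toy system (not taken from the paper).
A, B = 0.9, 0.5

def step(x):
    """One step of a nonlinear map: x2 depends on the square of x1."""
    x1, x2 = x
    return np.array([A * x1, B * x2 + (A**2 - B) * x1**2])

def lift(x):
    """Hand-picked lifted variables (x1, x2, x1^2) in which the map is linear."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])

def simulate(x0, n_steps):
    """Iterate the nonlinear map and return all n_steps + 1 states."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        traj.append(step(traj[-1]))
    return np.array(traj)

def fit_linear_model(trajectories):
    """Least-squares fit of K such that lift(x_next) ~= K @ lift(x_now)."""
    now, nxt = [], []
    for traj in trajectories:
        lifted = np.array([lift(x) for x in traj])
        now.append(lifted[:-1])
        nxt.append(lifted[1:])
    now, nxt = np.vstack(now), np.vstack(nxt)
    # lstsq solves now @ X = nxt; the model matrix is the transpose of X.
    X, *_ = np.linalg.lstsq(now, nxt, rcond=None)
    return X.T

rng = np.random.default_rng(0)
trajectories = [simulate(rng.uniform(-1, 1, size=2), 20) for _ in range(10)]
K = fit_linear_model(trajectories)

# Predict 15 steps ahead using only matrix multiplication in lifted space.
x0 = np.array([0.7, -0.3])
y = lift(x0)
for _ in range(15):
    y = K @ y
predicted_state = y[:2]  # the first two lifted variables are the original state
```

Here one extra variable is enough because the system was chosen to make that possible. For real data the right variables are not known in advance and may be very numerous, which is the gap the article says the framework targets by learning a compact set of them.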
Tests across diverse applications, including pendulum swings, electrical circuits, climate models, and neural signals, showed that the AI can uncover a handful of governing variables that support reliable long-term predictions. "What stands out is not just the accuracy, but the interpretability," Chen noted. "When a linear model is compact, the scientific discovery process can be naturally connected to existing theories and methods that human scientists have developed over millennia."
Beyond predictions, the system detects stable states, or attractors, helping scientists gauge system health and impending changes. Lead author Sam Moore, a PhD candidate in Chen's lab, explained: "For a dynamicist, finding these structures is like finding the landmarks of a new landscape." He added, "This is not about replacing physics. It's about extending our ability to reason using data when the physics is unknown, hidden, or too cumbersome to write down."
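One standard way a linear model exposes a stable state is through its eigenvalues, and the sketch below shows that general idea rather than the authors' specific procedure. For a discrete-time linear model that maps the current state to the next one, the origin is a stable fixed point when every eigenvalue of the model matrix has magnitude below 1. The function name `classify_fixed_point`, the tolerance, and the example matrices are all hypothetical.

```python
import numpy as np

def classify_fixed_point(K, tol=1e-9):
    """Classify the origin of y_next = K @ y by the spectral radius of K."""
    spectral_radius = np.max(np.abs(np.linalg.eigvals(K)))
    if spectral_radius < 1 - tol:
        return "stable"      # every mode decays, so trajectories approach the origin
    if spectral_radius > 1 + tol:
        return "unstable"    # at least one mode grows without bound
    return "marginal"        # eigenvalues alone do not settle the question

# Hypothetical model matrices, chosen only to exercise each case.
K_decaying = np.array([[0.9, 0.0], [0.2, 0.5]])
K_growing = np.array([[1.1, 0.0], [0.0, 0.5]])
K_rotation = np.array([[0.0, -1.0], [1.0, 0.0]])

labels = [classify_fixed_point(K) for K in (K_decaying, K_growing, K_rotation)]
```

A spectral radius drifting toward 1 as new data arrives is one signal a scientist might watch for when looking for impending change, although that use is an assumption here, not something the article attributes to the framework.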
Chen emphasized the broader impact: "Scientific discovery has always depended on finding simplified representations of complicated processes. We increasingly have the raw data needed to understand complex systems, but not the tools to turn that information into the kinds of simplified rules scientists rely on. Bridging that gap is essential."
Funded by the National Science Foundation, the Army Research Office, and DARPA, the work is a step toward "machine scientists" capable of automated discovery. Future plans include optimizing how data is collected in experiments and extending the method to multimedia data, such as video and audio, from biological systems.