Late 2024 — Active·Runtime
BenchFlow
Models are the cars. We build the track. The agent simulation runtime: one Scene-based lifecycle for single-agent, multi-agent, and multi-round evals. Sandboxed, hardened against reward hacking, with full trajectory capture.
Xiangyi Li, Yimin Liu, Han-chung Lee