Team

Xiangyi Li

Founder

Terminal Bench contributor. Red Hatter. Founded BenchFlow and raised $1M+ from Jeff Dean, Arash Ferdowsi, Eugene Yan, Founders Inc, and the a16z scout fund within a year of arriving in the US.

Built a code generation evaluation pipeline at Tesla. 12k+ GitHub stars.

Demos

  • Turn real-world TypeScript repos into verifiable environments with PR mirroring. Data preview notebook
  • We trained our own agents on our data and saw Qwen 3 32B hill-climb within 50 steps. Twitter post

Previous Work at BenchFlow

  • PokemonGym (150k views on Twitter, RT'd by Ak, Jim Fan, Harrison Chase)
  • PaperBench Ext (first full run of PaperBench and the first open-source adapter for it)
  • Environment Hub (started in mid-2024 with 60+ benchmarks adapted, a year before PrimeIntellect's hub)