Team
Xiangyi Li
Founder
Terminal Bench contributor. Red Hatter. Founded BenchFlow and raised $1M+ from Jeff Dean, Arash Ferdowsi, Eugene Yan, Founders Inc, and the a16z scout fund within a year of arriving in the US.
Built a code generation evaluation pipeline at Tesla. 12k+ GitHub stars.
Demos
- Turn real-world TypeScript repos into verifiable environments with PR mirroring. Data preview notebook
- We trained our own agents on our data and hill-climbed with Qwen 3 32B within 50 steps. Twitter post
Previous Work at BenchFlow
- PokemonGym (150k views on Twitter, retweeted by AK, Jim Fan, and Harrison Chase)
- PaperBench Ext (the first full run of PaperBench and the first open-source adapter for it)
- Environment Hub (started in mid-2024 with 60+ benchmarks adapted, a year before PrimeIntellect's hub)