Frontier Agent Evaluation

High-signal environments for agents

Curated by human experts from diverse, high-value professional domains. Every task is verifiable, expert-verified, and built on real data: the kind of work humans get paid to do.

Science / Finance / Healthcare / Cybersecurity / Energy / Mathematics / Robotics / Media / Software Eng. / Insurance / Office

Jeff Dean

Google

Arash Ferdowsi

Dropbox

Eugene Yan

Amazon

Founders, Inc.

A16z

Scout Fund