BenchFlow

RL Environments for Coding Agents

Real-world coding tasks from production TypeScript repositories. Train agents on actual engineering problems via PR Mirroring.

1.2M+

Combined Stars

30

Repositories

100%

From Real PRs

Validated Task Instances

Human-reviewed training tasks with verified fail-to-pass test coverage

8 tasks · 4 repositories
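As a rough illustration of what "fail-to-pass" coverage means, the sketch below compares test results from before and after a fix. The type and function names are hypothetical, and the extra "no regressions" condition is an assumption made for the example; none of this is BenchFlow's actual schema or validation logic.

```typescript
// Hypothetical types for illustration only; not BenchFlow's actual schema.
type TestStatus = "pass" | "fail";
type TestRun = Record<string, TestStatus>;

// Tests that fail before the fix is applied and pass after it.
function failToPassTests(before: TestRun, after: TestRun): string[] {
  return Object.keys(after)
    .filter((name) => before[name] === "fail" && after[name] === "pass")
    .sort();
}

// Tests that passed before the fix but fail after it (regressions).
function passToFailTests(before: TestRun, after: TestRun): string[] {
  return Object.keys(after)
    .filter((name) => before[name] === "pass" && after[name] === "fail")
    .sort();
}

// Assumed acceptance rule: at least one fail-to-pass test and no regressions.
function isValidTask(before: TestRun, after: TestRun): boolean {
  return (
    failToPassTests(before, after).length > 0 &&
    passToFailTests(before, after).length === 0
  );
}
```

With a `before` run in which one test fails and an `after` run in which every test passes, `failToPassTests` returns that one test name and `isValidTask` returns `true`.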

Source Repositories

Production codebases with active development and test coverage

30 repositories

calcom/cal.com

39k

Open-source scheduling infrastructure. The Calendly alternative.

SAAS · TypeScript

twentyhq/twenty

37k

Open-source CRM. Modern alternative to Salesforce.

SAAS · TypeScript

makeplane/plane

40k

Open-source project tracking. Jira/Linear alternative.

SAAS · TypeScript

Infisical/infisical

24k

Open-source secret management platform for teams.

SAAS · TypeScript

dubinc/dub

23k

Open-source link management. Bitly alternative with analytics.

SAAS · TypeScript

activepieces/activepieces

19k

Open-source workflow automation. Zapier alternative.

SAAS · TypeScript

documenso/documenso

12k

Open-source document signing. DocuSign alternative.

SAAS · TypeScript

formbricks/formbricks

12k

Open-source survey platform. Qualtrics alternative.

SAAS · TypeScript

midday-ai/midday

13k

Financial tools for freelancers. Invoicing and tracking.

SAAS · TypeScript

openstatusHQ/openstatus

8k

Synthetic monitoring and status pages. Open-source.

SAAS · TypeScript

langgenius/dify

120k

LLM app orchestration platform with RAG and agents.

AI/LLM · TypeScript

lobehub/lobe-chat

69k

AI Agent Workspace with multi-provider support and RAG.

AI/LLM · TypeScript

FlowiseAI/Flowise

47k

Visual AI agent builder with drag-and-drop interface.

AI/LLM · TypeScript

mckaywrigley/chatbot-ui

33k

AI chat interface for any model. ChatGPT-style UI.

AI/LLM · TypeScript

excalidraw/excalidraw

112k

Virtual whiteboard for hand-drawn diagrams.

PRODUCTIVITY · TypeScript

toeverything/AFFiNE

60k

Knowledge base with docs and canvas. Notion + Miro alternative.

PRODUCTIVITY · TypeScript

tldraw/tldraw

44k

Infinite canvas whiteboard SDK for developers.

PRODUCTIVITY · TypeScript

steven-tey/novel

15k

Notion-style WYSIWYG editor with AI autocompletion.

PRODUCTIVITY · TypeScript

supabase/supabase

94k

Open-source Firebase alternative with Postgres.

DEV TOOLS · TypeScript

hoppscotch/hoppscotch

77k

API development ecosystem. Postman alternative.

DEV TOOLS · TypeScript

strapi/strapi

71k

Leading open-source headless CMS. REST + GraphQL.

DEV TOOLS · TypeScript

payloadcms/payload

39k

Fullstack Next.js CMS framework.

DEV TOOLS · TypeScript

refinedev/refine

34k

React framework for admin panels and internal tools.

DEV TOOLS · TypeScript

unkeyed/unkey

5k

API key management and rate limiting platform.

DEV TOOLS · TypeScript

appsmithorg/appsmith

39k

Low-code platform for internal tool building.

INTERNAL TOOLS · TypeScript

ToolJet/ToolJet

37k

Open-source internal tool builder with AI.

INTERNAL TOOLS · TypeScript

PostHog/posthog

30k

Product analytics, feature flags, and experimentation.

INTERNAL TOOLS · TypeScript

medusajs/medusa

31k

Composable headless commerce platform.

COMMERCE · TypeScript

vercel/commerce

14k

High-performance Next.js e-commerce starter.

COMMERCE · TypeScript

sadmann7/skateshop

6k

Next.js 14 e-commerce with Server Actions demo.

COMMERCE · TypeScript

How It Works

PR Mirroring creates realistic coding tasks from real engineering work

1. Real GitHub PR
2. LM Reverses Changes
3. Tests Fail
4. Human Review
5. Training Task
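One way to picture the five stages above is as an ordered pipeline in which a candidate only moves forward when the check for its current stage succeeds. The sketch below is a minimal model under that assumption; the stage identifiers, the `Candidate` shape, and the `advance` function are hypothetical names chosen for illustration, not part of any BenchFlow API.

```typescript
// Hypothetical stage names mirroring the five labels listed above.
const STAGES = [
  "real_pr",
  "changes_reversed",
  "tests_fail",
  "human_review",
  "training_task",
] as const;

type Stage = (typeof STAGES)[number];

// Hypothetical record for a task candidate moving through the pipeline.
interface Candidate {
  repo: string;
  stage: Stage;
}

// Move a candidate to the next stage if the check for that transition passed.
// Returns null when the check fails, meaning the candidate is dropped.
// A candidate already at the final stage is returned unchanged.
function advance(candidate: Candidate, checkPassed: boolean): Candidate | null {
  if (!checkPassed) return null;
  const index = STAGES.indexOf(candidate.stage);
  if (index === STAGES.length - 1) return candidate;
  return { ...candidate, stage: STAGES[index + 1] };
}

// Example: a candidate sourced from one of the listed repositories.
const candidate: Candidate = { repo: "calcom/cal.com", stage: "real_pr" };
const reversed = advance(candidate, true);
```

In this model, the transition into `tests_fail` would pass only if the test suite actually fails once the PR's changes have been reversed, which is what makes the original PR a meaningful fix to reproduce.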