BenchFlow / MMLU-PRO
mirrored 15 minutes ago
  • .gitignore                133 B
  • Dockerfile                200 B
  • LICENSE                   11.4 kB
  • README.md                 4.94 kB
  • benchflow_interface.py    1.91 kB
  • compute_accuracy.py       2.09 kB
  • cot_prompt_lib            -
  • evaluate_from_api.py      8.76 kB
  • main.py                   89 B
  • pyproject.toml            354 B
  • requirements.txt          140 B
  • scripts                   -
  • test_agent.py             1.53 kB
  • uv.lock                   258 kB
update
a year ago
Update README.md
8 months ago
Updated the regex pattern in extract_final to use [A-J] between word boundaries as an answer.
a year ago
fix: change to inteligence url
3 months ago
fix: log should be a dict
3 months ago
Initial commit
a year ago
fix: fix for benchflow 0.1.12
3 months ago
feat: MMLU bench client
4 months ago
feat: MMLU bench client
4 months ago
feat: MMLU bench client
4 months ago
fix: intelligence url
3 months ago
add tested_24_prompt_styles
8 months ago
kirk: fix: change to inteligence url (7ddc41a)
feat: step 2 dockernize
4 months ago
feat: step 2 dockernize
4 months ago