BenchFlow / MMLU-PRO
mirrored 18 minutes ago
Latest commit: kirk · fix: change to inteligence url · 7ddc41a

Files (size): last commit message, age
  • .gitignore (133 B): update, a year ago
  • Dockerfile (200 B): fix: change to inteligence url, 5 months ago
  • LICENSE (11.4 kB): Initial commit, a year ago
  • README.md (4.94 kB): Update README.md, 9 months ago
  • benchflow_interface.py (1.91 kB): fix: log should be a dict, 5 months ago
  • compute_accuracy.py (2.09 kB): Updated the regex pattern in extract_final to use [A-J] between word boundaries as an answer., a year ago
  • cot_prompt_lib (-): fix: intelligence url, 5 months ago
  • evaluate_from_api.py (8.76 kB): feat: MMLU bench client, 5 months ago
  • main.py (89 B): feat: MMLU bench client, 5 months ago
  • pyproject.toml (354 B): feat: MMLU bench client, 5 months ago
  • requirements.txt (140 B): fix: fix for benchflow 0.1.12, 5 months ago
  • scripts (-): add tested_24_prompt_styles, 10 months ago
  • test_agent.py (1.53 kB): feat: step 2 dockernize, 5 months ago
  • uv.lock (258 kB): feat: step 2 dockernize, 5 months ago
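
One commit message above says the regex in extract_final was changed to match [A-J] between word boundaries. The repository's actual code is not shown on this page, so the following is only a minimal sketch of what such a pattern could look like. The names ANSWER_PATTERN and extract_final_sketch are hypothetical, and returning the last match is an assumption, not something the commit message states.

```python
import re
from typing import Optional

# Hypothetical pattern: one capital letter A-J standing alone as a word.
# \b is a word boundary, so letters inside longer words do not match.
ANSWER_PATTERN = re.compile(r"\b[A-J]\b")


def extract_final_sketch(text: str) -> Optional[str]:
    """Return the last standalone A-J letter in text, or None if none is found.

    Assumption: the final standalone letter is treated as the answer.
    """
    matches = ANSWER_PATTERN.findall(text)
    return matches[-1] if matches else None
```

Note that the English words "I" and "A" also satisfy this pattern, which is one reason the sketch takes the last match rather than the first. For example, extract_final_sketch("I think B, but finally D") returns "D".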