# Versioning System

## Overview

SWE-bench assigns each task instance a specific `version` with respect to its repository. This version information is crucial for ensuring reproducible execution-based evaluation, as it determines the exact installation instructions needed for that repository.

## How Versioning Works

When task instances are created, they are assigned a version based on the repository's state at the time of issue creation. This version information enables the evaluation harness to set up the correct environment for testing the proposed patch.

## Tools for Version Management

### General Purpose Tool: `get_versions.py`

The `get_versions.py` script is a general-purpose tool for retrieving version information. It can obtain versions through two methods:

1. Reading directly from the GitHub repository
2. Building the repository locally and locating appropriate version files

The script assigns each task instance a new `version` key/value pair in its metadata.

### Usage

The script can be invoked via the `run_get_version.sh` wrapper script:

```bash
python get_versions.py \
    --instances_path   [Required] [folder] Path to candidate task instances \
    --retrieval_method [Required] [choice] Method to retrieve versions ("build", "mix", or "github") \
    --cleanup          [Required] [bool]   Remove testbed and conda environments upon task completion \
    --conda_env        [Required] [str]    Name of conda environment to run task installation within \
    --num_workers      [Required] [int]    Number of processes to parallelize on \
    --path_conda       [Required] [folder] Path to miniconda or anaconda installation \
    --output_dir       [Required] [folder] Path to directory to write versioned task instances to \
    --testbed          [Required] [folder] Path to testbed directory, for cloning GitHub repos to
```

### Repository-Specific Version Extraction

For certain repositories, SWE-bench provides specialized scripts in the `extract_web/` directory that crawl the package's website to find versions and their cutoff dates.
These scripts (like `get_versions_*.py`) can be adapted to other repositories to check task instances' `creation_date` against the version dates.

## Integration with Evaluation

The version information is used by the evaluation harness to:

1. Set up the correct Docker environment for each task
2. Install the proper dependencies based on the version
3. Ensure consistent evaluation conditions across runs

This versioning system is a key component in making SWE-bench evaluations reproducible and reliable.
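
The date-based check used by the repository-specific scripts can be illustrated with a minimal sketch. This is not the code from `extract_web/`. It assumes each version is paired with a release date and that a task instance receives the most recent version released on or before its `creation_date`. The `VERSION_DATES` table, the `assign_version` helper, and the example instance are hypothetical and contain no real data.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical (version, release date) pairs, sorted by date; not real data.
VERSION_DATES = [
    ("1.0", date(2020, 1, 15)),
    ("1.1", date(2020, 7, 1)),
    ("2.0", date(2021, 3, 10)),
]


def assign_version(instance, version_dates=VERSION_DATES):
    """Return a copy of `instance` with a `version` key.

    Returns None when the instance predates every known version.
    """
    # Keep only the YYYY-MM-DD prefix so full timestamps also parse.
    created = date.fromisoformat(instance["creation_date"][:10])
    dates = [released for _, released in version_dates]
    # Index of the first release date strictly after `created`.
    idx = bisect_right(dates, created)
    if idx == 0:
        return None
    return {**instance, "version": version_dates[idx - 1][0]}


# Hypothetical task instance for illustration.
example = {"instance_id": "demo__demo-1", "creation_date": "2020-08-02T09:30:00Z"}
print(assign_version(example)["version"])  # 1.1
```

Returning `None` for instances older than every known version leaves the decision to skip or handle them manually to the caller, rather than guessing a version.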