christian-muertz committed on
Commit
568f9ab
·
verified ·
1 Parent(s): bb90457

Upload README.md with huggingface_hub

  1. README.md +54 -0
README.md ADDED
@@ -0,0 +1,54 @@
+ # SWE-Bench Verified (Compressed)
+
+ <picture>
+ <img src="./plot.png" alt="SWE-Bench Verified Total Image Size" style="width:100%">
+ </picture>
+
+ Setting up all the SWE-Bench Verified images used to take over 200 GiB of storage and 100+ GiB of transfer.
+
+ Now it’s just:
+ - 31 GiB total storage (down from 206 GiB)
+ - 5 GiB network transfer (down from 100 GiB)
+ - ~5 minutes of setup
+
+
+ ## 🚀 Getting the Images
+
+ Images follow the naming convention:
+
+ ```
+ logicstar/sweb.eval.x86_64.<repo>_1776_<instance>
+ ```
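As an illustration of the convention above, here is a minimal sketch that derives an image name from an instance ID. The helper `image_name` is hypothetical. It assumes instance IDs have the form `<repo>__<instance>` (for example `django__django-11099`), that the `__` separator is replaced by `_1776_`, and that the name is lowercased, since Docker repository names must be lowercase.

```bash
# Hypothetical helper: map a SWE-Bench instance ID to its image name.
# Assumes IDs look like "<repo>__<instance>" and that the "__" separator
# becomes "_1776_" after lowercasing, per the naming convention above.
image_name() {
  printf 'logicstar/sweb.eval.x86_64.%s\n' \
    "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/__/_1776_/g')"
}

image_name "django__django-11099"
# prints: logicstar/sweb.eval.x86_64.django_1776_django-11099
```

The helper uses only `tr` and `sed`, so it should behave the same in bash and in a plain POSIX shell.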
+
+ ### Docker
+ ```bash
+ curl -L -# "https://huggingface.co/LogicStar/SWE-Bench-Verified-Compressed/resolve/main/saved.tar.zst?download=true" | zstd -d --long=31 --stdout | docker load
+ ```
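After the load finishes, it may be worth checking how many images actually arrived. The sketch below is one way to do that. The helper `count_sweb_images` is hypothetical and simply counts the lines on stdin that match the naming convention. SWE-Bench Verified has 500 instances, so a count of 500 would be expected if the archive holds one image per instance, which is an assumption here.

```bash
# Hypothetical helper: count image names on stdin that match the
# logicstar/sweb.eval.x86_64.* convention. grep -c exits non-zero when
# nothing matches, so "|| true" keeps the helper from failing in that case.
count_sweb_images() {
  grep -c '^logicstar/sweb\.eval\.x86_64\.' || true
}

# Usage (requires a running Docker daemon):
#   docker images --format '{{.Repository}}' | count_sweb_images
```

The same helper works with Podman by swapping `docker images` for `podman images`.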
+
+ ### Podman
+ ⚠️ Podman cannot load docker-archives with manifests larger than 1 MiB, so we split the archive into two parts:
+ ```bash
+ curl -L -# "https://huggingface.co/LogicStar/SWE-Bench-Verified-Compressed/resolve/main/saved.1.tar.zst?download=true" | zstd -d --long=31 --stdout | podman load
+ curl -L -# "https://huggingface.co/LogicStar/SWE-Bench-Verified-Compressed/resolve/main/saved.2.tar.zst?download=true" | zstd -d --long=31 --stdout | podman load
+ ```
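To see how close an archive is to the 1 MiB limit mentioned above, one can measure its manifest directly. This is a rough sketch, assuming the limit applies to the top-level `manifest.json` that `docker save` writes into a docker-archive tarball. The helper `manifest_bytes` and the path `saved.tar` are hypothetical.

```bash
# Hypothetical helper: print the size in bytes of manifest.json inside an
# uncompressed docker-archive tarball (the format written by `docker save`).
manifest_bytes() {
  tar -xOf "$1" manifest.json | wc -c
}

# Usage, assuming a decompressed archive at ./saved.tar (hypothetical path):
#   [ "$(manifest_bytes saved.tar)" -le 1048576 ] && echo "manifest is at most 1 MiB"
```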
+
+ For faster downloads and parallelized loading, use the Hugging Face CLI to download the compressed OCI layout, then run our `load.py` script to load the images in parallel:
+
+ ```bash
+ # Clone the repo and cd into it
+ hf download ...
+ python3 load.py
+ ```
+
+ ## 🛠 Using the Images
+
+ Pass `--namespace logicstar` to the SWE-Bench harness. Example:
+
+ ```bash
+ python -m swebench.harness.run_evaluation \
+     --predictions_path gold \
+     --max_workers 1 \
+     --run_id validate-gold \
+     --namespace logicstar
+ ```