underthelights committed · Commit 9d96bfb · verified · 1 Parent(s): 3ad7c84

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +3 -41
  2. cliprt_libero_spatial.pt +3 -0
README.md CHANGED
@@ -1,41 +1,3 @@
- ---
- license: mit
- datasets:
- - clip-rt/modified_libero_hdf5
- language:
- - en
- tags:
- - robotics
- - vla
- - clip
- - contrastive_learning
- ---
-
- # CLIP-RT Finetuned on LIBERO-Spatial
-
- This model was produced by fine-tuning the [CLIP-RT model](https://clip-rt.github.io/) with a 0.3B parameter action decoder added to enable continuous action prediction on the LIBERO-Spatial dataset from the [LIBERO simulation benchmark](https://libero-project.github.io/main.html).
-
- ## Hyperparemeters
-
- | Category | Details |
- |----------------------|---------------------------------------------------------------------|
- | **Hardware** | 8 × H100 GPUs with 80GB memory |
- | **Model size** | 1.3B (CLIP-RT base + 0.3B action decoder) |
- | **Action dimension** | 7D per step × 8 steps (chunked) |
- | **Loss** | L1 regression |
- | **Batch size** | 256 |
- | **Epochs** | 128 |
-
- ## Usage Instructions
- To evaluate this model on the LIBERO simulator or in your own imitation learning pipeline, use the action decoder module with precomputed CLIP image and language embeddings. Refer to the original [CLIP-RT GitHub repository](https://github.com/clip-rt/clip-rt) for code and inference scripts.
-
- ## Citation
-
- ```bibtex
- @article{kang2024cliprt,
- title={CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision},
- author={Kang, Gi-Cheon and Kim, Junghyun and Shim, Kyuhwan and Lee, Jun Ki and Zhang, Byoung-Tak},
- journal={arXiv preprint arXiv:2411.00508},
- year = {2024}
- }
- ```
+ ---
+ license: mit
+ ---
cliprt_libero_spatial.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:abe1a57341269995988896c5aa4d83a6449ef647ece40aac0232375ca8b1d2e7
+ size 16203732598
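
The three added lines are a Git LFS pointer, not the checkpoint itself: they record the spec version, the SHA-256 of the real file, and its size in bytes (about 16.2 GB). The following is a minimal sketch, not part of this commit, of how such a pointer could be parsed and used to check a downloaded copy. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and introduced only for illustration.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:abe1a57341269995988896c5aa4d83a6449ef647ece40aac0232375ca8b1d2e7
size 16203732598
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash-algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Compare a local file's size and SHA-256 with a parsed pointer (hypothetical helper)."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a multi-GB file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so the whole checkpoint is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algo"], pointer["size"], f"{pointer['size'] / 1e9:.1f} GB")
```

Assuming the checkpoint has been downloaded locally, `matches_pointer("cliprt_libero_spatial.pt", pointer)` would report whether the local copy matches the size and hash recorded in this commit.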