ZhijianShu/LiteVGGT · Image-to-3D

ZhijianShu committed 9 days ago

Commit a1f7a80 (verified) · 1 parent: 382dbe5

Update README.md

README.md +33 -46

@@ -10,7 +10,7 @@ LiteVGGT is a 3D vision foundation model that significantly boosts vanilla VGGT'
 This model was presented in the paper: [LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging](https://huggingface.co/papers/2512.04939).
 - 🏠 [Project Page](https://garlicba.github.io/LiteVGGT/)
-- \ud83d\udcbb [Code](https://github.com/GarlicBa/LiteVGGT-repo)
 ## Overview
@@ -24,48 +24,35 @@ For 1000 input images, LiteVGGT achieves a **10× speedup** over VGGT while
 To quickly try out LiteVGGT for 3D reconstruction, follow these steps:
-1.  **Environment Setup:**
-    First, create a virtual environment using Conda, clone this repository to your local machine, and install the required dependencies.
-    ```bash
-    conda create -n litevggt python=3.10
-    conda activate litevggt
-    git clone git@github.com:GarlicBa/LiteVGGT-repo.git
-    cd LiteVGGT-repo
-    pip install -r requirements.txt
-    ```
-2.  **Install Transformer Engine:**
-    Install the Transformer Engine package following its official installation requirements (see https://github.com/NVIDIA/TransformerEngine):
-    ```bash
-    export CC=your/gcc/path
-    export CXX=your/g++/path
-    pip install --no-build-isolation transformer_engine[pytorch]
-    ```
-3.  **Download Checkpoint:**
-    Then, download our LiteVGGT checkpoint that has been **finetuned** and **TE-remapped**:
-    ```bash
-    wget https://huggingface.co/ZhijianShu/LiteVGGT/resolve/main/te_dict.pt
-    ```
-4.  **Run Inference:**
-    ```bash
-    python run_demo.py \
-      --ckpt_path path/to/your/te_dict.pt \
-      --img_dir path/to/your/img_dir \
-      --output_dir ./recon_result \
-    ```
-## Citation
-If you find this project helpful, citing our paper would be greatly appreciated:
-```bibtex
-@inproceedings{wang2025vggt,
-  title={VGGT: Visual Geometry Grounded Transformer},
-  author={Wang, Jianyuan and Chen, Minghao and Karaev, Nikita and Vedaldi, Andrea and Rupprecht, Christian and Novotny, David},
-  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
-  year={2025}
-}
-```

 This model was presented in the paper: [LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging](https://huggingface.co/papers/2512.04939).
 - 🏠 [Project Page](https://garlicba.github.io/LiteVGGT/)
+- [Code](https://github.com/GarlicBa/LiteVGGT-repo)
 ## Overview
 To quickly try out LiteVGGT for 3D reconstruction, follow these steps:
+First, create a virtual environment using Conda, clone this repository to your local machine, and install the required dependencies.
+```bash
+conda create -n litevggt python=3.10
+conda activate litevggt
+git clone git@github.com:GarlicBa/LiteVGGT-repo.git
+cd LiteVGGT-repo
+pip install -r requirements.txt
+```
+Install the Transformer Engine package following its official installation requirements (see https://github.com/NVIDIA/TransformerEngine):
+```bash
+export CC=your/gcc/path
+export CXX=your/g++/path
+pip install --no-build-isolation transformer_engine[pytorch]
+```
+Then, download our LiteVGGT checkpoint that has been **finetuned** and **TE-remapped**:
+```bash
+wget https://huggingface.co/ZhijianShu/LiteVGGT/resolve/main/te_dict.pt
+```
+Finally, run inference:
+```bash
+python run_demo.py \
+  --ckpt_path path/to/your/te_dict.pt \
+  --img_dir path/to/your/img_dir \
+  --output_dir ./recon_result
+```
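
The inference step assumes that the checkpoint file and the image directory already exist at the paths passed to `run_demo.py`. As a minimal sketch, a pre-flight check along the following lines could catch a missing or empty input before the model is loaded. The helper name `check_inputs` is hypothetical and is not part of the LiteVGGT repository; only the `run_demo.py` flags shown in the usage comment come from the README.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight helper (an assumption, not shipped with LiteVGGT).
# Verifies that the checkpoint file exists and that the image directory
# exists and is non-empty before run_demo.py is launched.

check_inputs() {
  local ckpt_path="$1"
  local img_dir="$2"

  # The checkpoint must be a regular file (e.g. the downloaded te_dict.pt).
  if [ ! -f "$ckpt_path" ]; then
    echo "checkpoint not found: $ckpt_path" >&2
    return 1
  fi

  # The image directory must exist.
  if [ ! -d "$img_dir" ]; then
    echo "image directory not found: $img_dir" >&2
    return 1
  fi

  # The image directory must contain at least one entry.
  if [ -z "$(ls -A "$img_dir")" ]; then
    echo "image directory is empty: $img_dir" >&2
    return 1
  fi

  return 0
}

# Usage (paths are placeholders, as in the README):
#   check_inputs path/to/your/te_dict.pt path/to/your/img_dir && \
#     python run_demo.py \
#       --ckpt_path path/to/your/te_dict.pt \
#       --img_dir path/to/your/img_dir \
#       --output_dir ./recon_result
```

The check only looks at whether the paths exist; it does not validate the checkpoint contents or the image formats, which `run_demo.py` is still expected to handle.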