Safetensors
internvl

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

                   

πŸ€– InternSVG Model

The InternSVG-8B model is available at Hugging Face. It is based on the InternVL3-8B model, incorporating SVG-specific tokens, and undergoes Supervised Fine-Tuning (SFT) under a two-stage training strategy using the massive SVG training samples from the SAgoge dataset.

Deploy

We recommend using LMDeploy for deployment. An example of launching a proxy server with 8 parallel workers (one per GPU) is provided below:

#!/bin/bash
model_path="MODEL_PATH"
model_name="InternSVG"

# proxy
lmdeploy serve proxy --server-name 0.0.0.0 --server-port 10010 --routing-strategy "min_expected_latency" &

worker_num=8
for ((i = 0; i < worker_num; i++)); do
    timestamp=$(date +"%Y-%m-%d_%H-%M-%S")
    CUDA_VISIBLE_DEVICES="${i}" lmdeploy serve api_server ${model_path} --proxy-url http://0.0.0.0:10010 \
        --model-name ${model_name} \
        --tp 1 \
        --max-batch-size 512 \
        --backend pytorch \
        --server-port $((10000 + i)) \
        --session-len 16384 \
        --chat-template "internvl2_5" \
        --log-level WARNING &>> ./logs/api_${model_name}_${timestamp}_${i}.out  &
    sleep 10s
done

Train

If you need to train your own model, please follow these steps:

  1. Prepare the Dataset: Download the SAgoge dataset. After that, update the paths for the SAgoge-related subdatasets in LLaMA-Factory/data/dataset_info.json to match your local file paths.

  2. Download InternVL3-8B: Download the InternVL3-8B from link.

  3. Add Special Tokens: Before training, you must add SVG-specific tokens to the base model. Run the utils/add_token.py script, which adds these special tokens to the original model weights and initializes their embeddings based on subwords.

  4. Start Training: We provide example configuration scripts for the two-stage training process. You can find them at:

    • Stage 1: LLaMA-Factory/examples/train_full/stage_1.yaml
    • Stage 2: LLaMA-Factory/examples/train_full/stage_2.yaml

    Then use llamafactory-cli train to start training.

πŸ“– Citation

@article{wang2025internsvg,
  title={InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models},
  author={Wang, Haomin and Yin, Jinhui and Wei, Qi and Zeng, Wenguang and Gu, Lixin and Ye, Shenglong and Gao, Zhangwei and Wang, Yaohui and Zhang, Yanting and Li, Yuanqi and others},
  journal={arXiv preprint arXiv:2510.11341},
  year={2025}
}
Downloads last month
-
Safetensors
Model size
8B params
Tensor type
BF16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for InternSVG/InternSVG-8B

Dataset used to train InternSVG/InternSVG-8B

Paper for InternSVG/InternSVG-8B