Mol-VL is a Vision-Language Model for Optical Chemical Structure Understanding (OCSU).

To take advantage of existing pretrained VLMs, we adopt the weights from Qwen2-VL. Mol-VL-7B is further finetuned on Vis-CheBI20 training set.

For technical details, please refer to OCSU. Training and evaluation scripts are available at Github.

If you find our work useful in your research, please consider citing:

@article{fan2025ocsu,
  title={OCSU: Optical Chemical Structure Understanding for Molecule-centric Scientific Discovery},
  author={Fan, Siqi and Xie, Yuguang and Cai, Bowen and Xie, Ailin and Liu, Gaochao and Qiao, Mu and Xing, Jie and Nie, Zaiqing},
  journal={arXiv preprint arXiv:2501.15415},
  year={2025}
}

Downloads last month: 5

Safetensors

Model size

8B params

Tensor type

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Paper for PharMolix/Mol-VL-7B

OCSU: Optical Chemical Structure Understanding for Molecule-centric Scientific Discovery

Paper • 2501.15415 • Published Jan 26, 2025