---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
license: mit
---

## A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text?

- Code: [DLILP](https://github.com/jusiro/DLILP)
- Paper: [IPMI 2025](https://link.springer.com/chapter/10.1007/978-3-031-96625-5_20) - [ArXiv](https://arxiv.org/abs/2504.05227)
- Docs: [Documentation](https://github.com/jusiro/DLILP)
- Tutorial: [Notebook](https://colab.research.google.com/drive/1_8Ysd8mCKuLX_Q86e-7pOAHFbSR9F4aZ?usp=sharing)

### About the "CXR_Unimodal_C" weights:

- A vision encoder for chest X-rays (CXR), pre-trained in a unimodal (vision-only) fashion, supervised by labels extracted through NER-based methods.
- Pre-trained on CheXpert data.
- A loading sketch is provided at the end of this card.

If you find this repository useful, please consider citing this paper:

```
@inproceedings{dlilp,
    title={A Reality Check of Vision-Language Pre-training in Radiology: Have We Progressed Using Text?},
    author={Julio Silva-Rodríguez and Jose Dolz and Ismail {Ben Ayed}},
    booktitle={Information Processing in Medical Imaging (IPMI)},
    year={2025}
}
```
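
### Loading the weights (sketch)

A minimal sketch of fetching the checkpoint with `huggingface_hub`, assuming the weights were pushed via `PyTorchModelHubMixin` (as the tags suggest) and stored under the mixin's default filename. The repository id below is a placeholder; the actual vision-encoder class and preprocessing are defined in the [DLILP](https://github.com/jusiro/DLILP) repository.

```python
# Minimal loading sketch (not the official API). The repo id is a placeholder
# and "model.safetensors" is assumed to be the PyTorchModelHubMixin default
# filename; the DLILP repository defines the actual encoder class.
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

ckpt_path = hf_hub_download(
    repo_id="<namespace>/CXR_Unimodal_C",  # placeholder repo id
    filename="model.safetensors",          # assumed default checkpoint name
)

state_dict = load_file(ckpt_path)

# Inspect the stored tensors before mapping them into the DLILP vision encoder.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```

Once the tensor names are known, the state dict can be loaded into the encoder class from the DLILP codebase with `model.load_state_dict(state_dict)`.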