The MHA2MLA-VLM model published in the paper "MHA2MLA-VLM: Enabling DeepSeek's Economical Multi-Head Latent Attention across Vision-Language Models"
Xiaoran Fan
cnxup
AI & ML interests
NLP, CV, LLM
Recent Activity
Updated a dataset about 12 hours ago: cnxup/LLaVA-NeXT-Data
Published a dataset about 13 hours ago: cnxup/LLaVA-NeXT-Data
Updated a model 3 days ago: cnxup/SVD-Init