Commit 6f1208d
Parent(s): 61e3c79

Add paper link to model card

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
README.md (CHANGED)
@@ -17,6 +17,8 @@ pipeline_tag: image-text-to-text
 # gemma-3-4b-it-qat-4bit-mobile
 
+> **Paper**: [On-Device Multimodal LLM Optimization: Fitting Gemma 3 into 2 GB](https://atomgradient.github.io/swift-gemma-cli/)
+
 Aggressively optimized version of [gemma-3-4b-it-qat-4bit](https://huggingface.co/mlx-community/gemma-3-4b-it-qat-4bit) for iPhone/iPad (8 GB RAM). Reduces model size from 2.8 GB to 2.1 GB with split weights for text-only lazy loading, significantly lower runtime memory, and reduced thermal output.
 
 ## Optimizations Applied