AlexWuKing and Claude Opus 4.6 committed on
Commit 6f1208d · 1 Parent(s): 61e3c79

Add paper link to model card


Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Files changed (1):
  1. README.md +2 -0
README.md CHANGED
@@ -17,6 +17,8 @@ pipeline_tag: image-text-to-text
 
 # gemma-3-4b-it-qat-4bit-mobile
 
+> **Paper**: [On-Device Multimodal LLM Optimization: Fitting Gemma 3 into 2 GB](https://atomgradient.github.io/swift-gemma-cli/)
+
 Aggressively optimized version of [gemma-3-4b-it-qat-4bit](https://huggingface.co/mlx-community/gemma-3-4b-it-qat-4bit) for iPhone/iPad (8 GB RAM). Reduces model size from 2.8 GB to 2.1 GB with split weights for text-only lazy loading, significantly lower runtime memory, and reduced thermal output.
 
 ## Optimizations Applied