Update model files by removing redundant n-gram embedding weight duplication
#3 opened by LongCat0830
No description provided.
So can we know what the model that was uploaded earlier (200GB+) is?
I'd like to know as well.
I was surprised yesterday when the 8-bit quant that mlx-community posted was only 73 GB in size.
LongCat0830 changed pull request title from "update" to "Update model files by removing redundant n-gram embedding weight duplication"
> So can we know what the model that was uploaded earlier (200GB+) is?
The previous version contained a duplicated copy of the n-gram embedding weights. This did not affect runtime performance or model correctness, but it did inflate the checkpoint's storage size.
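To illustrate the kind of cleanup this PR describes, here is a minimal sketch of deduplicating byte-identical tensors in a checkpoint's state dict. This is not the actual script used for this repository; the key names (`ngram.embed.weight`, etc.) are hypothetical, NumPy stands in for the real tensor library, and a real fix would also rewrite the safetensors shards and index.

```python
import numpy as np

def dedup_weights(state_dict):
    """Keep one copy of each byte-identical tensor; map duplicates to it.

    Returns (deduped, aliases) where `deduped` holds canonical tensors only
    and `aliases` maps each duplicate key to its canonical key.
    """
    seen = {}      # (shape, dtype, bytes) -> canonical key
    deduped = {}   # canonical key -> tensor
    aliases = {}   # duplicate key -> canonical key
    for key, tensor in state_dict.items():
        digest = (tensor.shape, tensor.dtype.str, tensor.tobytes())
        if digest in seen:
            aliases[key] = seen[digest]   # drop the extra copy
        else:
            seen[digest] = key
            deduped[key] = tensor
    return deduped, aliases

# Hypothetical checkpoint with a redundant n-gram embedding copy:
embed = np.ones((4, 8), dtype=np.float32)
sd = {
    "ngram.embed.weight": embed,
    "ngram.embed.weight_copy": embed.copy(),   # redundant duplicate
    "lm_head.weight": np.zeros((4, 8), dtype=np.float32),
}
deduped, aliases = dedup_weights(sd)
# aliases == {"ngram.embed.weight_copy": "ngram.embed.weight"}
```

At load time, the aliased keys can simply point back at the canonical tensor, so removing the duplicate changes only file size, not behavior.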