Update model files by removing redundant n-gram embedding weight duplication

#3
LongCat org
No description provided.

So can we know what the model that was uploaded earlier (200GB+) is?

I'd like to know as well.
I was surprised yesterday, when the 8bit quant that mlx-community posted was only 73 GB in size.

LongCat0830 changed pull request title from update to Update model files by removing redundant n-gram embedding weight duplication
LongCat org

So can we know what the model that was uploaded earlier (200GB+) is?

The previous version contained a duplicate copy of the n-gram embedding weights, which did not affect runtime performance or model correctness but increased the storage size.
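A cleanup like this can be sketched as follows: compare the candidate tensors in the checkpoint's state dict and drop a key whose weights are byte-identical to the canonical one. This is a minimal illustration, not the actual script used for this PR; the key names and the helper are hypothetical, and NumPy arrays stand in for the real tensors.

```python
import numpy as np

def drop_duplicate_weight(state_dict, canonical_key, duplicate_key):
    """Delete duplicate_key from state_dict if its tensor equals canonical_key's.

    Returns True if a duplicate was removed. Keys here are hypothetical
    examples, not the real LongCat checkpoint key names.
    """
    if (
        duplicate_key in state_dict
        and canonical_key in state_dict
        and np.array_equal(state_dict[canonical_key], state_dict[duplicate_key])
    ):
        del state_dict[duplicate_key]
        return True
    return False

# Toy checkpoint: the same n-gram embedding stored under two keys.
emb = np.arange(12, dtype=np.float32).reshape(4, 3)
ckpt = {
    "model.ngram_embed.weight": emb,
    "model.ngram_embed_copy.weight": emb.copy(),  # redundant duplicate
}
removed = drop_duplicate_weight(
    ckpt, "model.ngram_embed.weight", "model.ngram_embed_copy.weight"
)
print(removed, sorted(ckpt))
```

Because the duplicate is bit-identical, removing it shrinks the serialized checkpoint without changing any computation that reads the canonical key.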

