Any plans to use RMSNorm (or FlashNorm) instead of LayerNorm?
1
#12 opened about 1 year ago by graefics
Lack of digit splitting in the slow version of the tokenizer
#11 opened over 1 year ago by Forence
Adding Evaluation Results
#10 opened over 1 year ago by leaderboard-pr-bot
Why do downstream task results differ so much between the before-cooldown checkpoint and the final checkpoint?
1
#9 opened over 1 year ago by siqi-zz
Adding Evaluation Results
#8 opened over 1 year ago by leaderboard-pr-bot
Will there be a version with traditional Chinese in the future?
#5 opened almost 2 years ago by win10
Training config link is broken
11
#3 opened almost 2 years ago by davidgortega