tuxsentience-beta3

Our second open-weight model, currently in progress. For now, this page documents progress and details.

Model Information

This model will be based on Qwen3 8B.

Like the last one, it will most likely be 4-bit, but thanks to our new training methods (detailed below) we may release larger sizes.
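
As a rough guide to what 4-bit means for size, the sketch below estimates raw weight storage at a few bit widths. The parameter count of 8 billion is an approximation taken from the "8B" name, and the helper `weight_gib` is hypothetical. It ignores quantization metadata, activations, and the KV cache, so real files will be somewhat larger.

```python
# Rough weight-only size estimate.
# Assumption: ~8e9 parameters (approximated from the "8B" name).
# Ignores quantization scales, activations, and the KV cache.
PARAMS = 8_000_000_000


def weight_gib(params: int, bits_per_weight: int) -> float:
    """Return the size of the raw weights in GiB."""
    total_bytes = params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(f"{bits}-bit: {weight_gib(PARAMS, bits):.1f} GiB")
```

Under these assumptions, halving the bit width halves the weight size, which is why a 4-bit release is the easiest to fit on consumer GPUs.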

Training Information

We are attempting to train this model via distributed computing. This is how our current setup looks so far:

i9-10910, 32GB RAM, RX 7600 (8GB)
i5-13420H, 16GB RAM, RTX 3050 Mobile (6GB)
i5-12400, 32GB RAM, RTX 3060 (12GB)
Ryzen 7 9800X3D, 32GB RAM, RTX 3080 (10GB)

Together, these amount to around 98.47 TFLOPS.
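
To put the list in terms of memory, the sketch below tallies the RAM and VRAM figures given above. The `Node` class and helper names are hypothetical, made up for illustration. Per-device TFLOPS are not stated in this document, so the sketch does not try to reproduce the combined figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    cpu: str
    ram_gb: int
    gpu: str
    vram_gb: int


# Hardware exactly as listed in this document.
NODES = [
    Node("i9-10910", 32, "RX 7600", 8),
    Node("i5-13420H", 16, "RTX 3050 Mobile", 6),
    Node("i5-12400", 32, "RTX 3060", 12),
    Node("Ryzen 7 9800X3D", 32, "RTX 3080", 10),
]


def total_ram_gb(nodes: list[Node]) -> int:
    """Sum of system RAM across all machines."""
    return sum(n.ram_gb for n in nodes)


def total_vram_gb(nodes: list[Node]) -> int:
    """Sum of GPU memory across all machines."""
    return sum(n.vram_gb for n in nodes)


def smallest_gpu(nodes: list[Node]) -> Node:
    """Machine with the least VRAM."""
    return min(nodes, key=lambda n: n.vram_gb)


if __name__ == "__main__":
    print("Total RAM (GB):", total_ram_gb(NODES))
    print("Total VRAM (GB):", total_vram_gb(NODES))
    print("Smallest GPU:", smallest_gpu(NODES).gpu)
```

The smallest card is worth tracking because, in a typical data-parallel setup, each worker must fit its own copy of the model, so the machine with the least VRAM tends to set the limit.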

We are trying to acquire better hardware, and an RX 9070 XT is planned for future models. Currently we are attempting to use Unsloth + Ray for distributed computing.

Benchmarks

Coming soon to an accuracy near you

FAQ

Q: This implies the existence of beta1 and alpha versions
A: They do exist; however, they were never published and most likely never will be.

Made possible by

https://accuratelinuxgraphs.com/ - Benchmarks and data visualization


Model tree

Qwen/Qwen3-8B-Base (base model) > Qwen/Qwen3-8B (finetuned) > unsloth/Qwen3-8B-GGUF (quantized) > this model (finetuned)
