Arc-Intelligence/ATLAS-8B-Thinking

Text Generation

reinforcement-learning

teacher-student

adaptive-learning

text-generation-inference


ATLAS-8B-Thinking / checkpoint

115 GB

2 contributors

History: 31 commits

aman-jaglan

Copy to checkpoint/zero_to_fp32.py

25f9fe0 verified 3 months ago

global_step
Copy to checkpoint/global_step/zero_pp_rank_3_mp_rank_00_model_states.pt 3 months ago
added_tokens.json

707 Bytes

Copy to checkpoint/added_tokens.json 3 months ago
config.json

730 Bytes

Copy to checkpoint/config.json 3 months ago
generation_config.json

214 Bytes

Copy to checkpoint/generation_config.json 3 months ago
latest

13 Bytes

Copy to checkpoint/latest 3 months ago
merges.txt

1.67 MB

Copy to checkpoint/merges.txt 3 months ago
model-00001-of-00004.safetensors

4.9 GB
xet

Copy to checkpoint/model-00001-of-00004.safetensors 3 months ago
model-00002-of-00004.safetensors

4.92 GB
xet

Copy to checkpoint/model-00002-of-00004.safetensors 3 months ago
model-00003-of-00004.safetensors

4.98 GB
xet

Copy to checkpoint/model-00003-of-00004.safetensors 3 months ago
model-00004-of-00004.safetensors

1.58 GB
xet

Copy to checkpoint/model-00004-of-00004.safetensors 3 months ago
model.safetensors.index.json

32.9 kB

Copy to checkpoint/model.safetensors.index.json 3 months ago
rng_state_0.pth
Detected Pickle imports (7)
- "_codecs.encode",
- "torch.ByteStorage",
- "collections.OrderedDict",
- "torch._utils._rebuild_tensor_v2",
- "numpy.ndarray",
- "numpy._core.multiarray._reconstruct",
- "numpy.dtype"
15 kB
xet

Copy to checkpoint/rng_state_0.pth 3 months ago
rng_state_1.pth
Detected Pickle imports (7)
- "_codecs.encode",
- "numpy.dtype",
- "numpy._core.multiarray._reconstruct",
- "torch.ByteStorage",
- "torch._utils._rebuild_tensor_v2",
- "collections.OrderedDict",
- "numpy.ndarray"
15 kB
xet

Copy to checkpoint/rng_state_1.pth 3 months ago
rng_state_2.pth
Detected Pickle imports (7)
- "_codecs.encode",
- "numpy.dtype",
- "numpy._core.multiarray._reconstruct",
- "torch.ByteStorage",
- "torch._utils._rebuild_tensor_v2",
- "collections.OrderedDict",
- "numpy.ndarray"
15 kB
xet

Copy to checkpoint/rng_state_2.pth 3 months ago
rng_state_3.pth
Detected Pickle imports (7)
- "_codecs.encode",
- "numpy.dtype",
- "numpy._core.multiarray._reconstruct",
- "torch.ByteStorage",
- "torch._utils._rebuild_tensor_v2",
- "collections.OrderedDict",
- "numpy.ndarray"
15 kB
xet

Copy to checkpoint/rng_state_3.pth 3 months ago
special_tokens_map.json

613 Bytes

Copy to checkpoint/special_tokens_map.json 3 months ago
tokenizer.json

11.4 MB
xet

Copy to checkpoint/tokenizer.json 3 months ago
tokenizer_config.json

9.9 kB

Copy to checkpoint/tokenizer_config.json 3 months ago
trainer_state.json

22 kB

Copy to checkpoint/trainer_state.json 3 months ago
training_args.bin
Detected Pickle imports (14)
- "torch.bfloat16",
- "transformers.trainer_utils.SchedulerType",
- "transformers.integrations.deepspeed.HfDeepSpeedConfig",
- "transformers.trainer_utils.HubStrategy",
- "accelerate.state.PartialState",
- "torch.device",
- "trainers.grpo_config.GRPOConfig",
- "transformers.trainer_utils.SaveStrategy",
- "transformers.training_args.OptimizerNames",
- "transformers.trainer_utils.IntervalStrategy",
- "transformers.trainer_pt_utils.AcceleratorConfig",
- "accelerate.utils.dataclasses.DeepSpeedPlugin",
- "transformers.integrations.deepspeed.HfTrainerDeepSpeedConfig",
- "accelerate.utils.dataclasses.DistributedType"
9.59 kB
xet

Copy to checkpoint/training_args.bin 3 months ago
vocab.json

2.78 MB

Copy to checkpoint/vocab.json 3 months ago
zero_to_fp32.py

29.2 kB

Copy to checkpoint/zero_to_fp32.py 3 months ago
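
The "Detected Pickle imports" lists in this listing name the globals that each pickle file would import when loaded. The following is a minimal sketch of how such a list can be derived without executing the pickle, using the standard-library pickletools module. It is an illustration under stated assumptions, not the scanner the hosting site actually runs. The function name list_pickle_imports is hypothetical, and the STACK_GLOBAL handling is a heuristic that pairs the two most recent string operands, so it can miss names that are fetched from the pickle memo. Files written by torch.save are usually zip archives with the pickle stored inside, so the raw pickle bytes would need to be extracted first (not shown here).

```python
import pickle
import pickletools
from collections import OrderedDict


def list_pickle_imports(data: bytes) -> set:
    """Return 'module.name' for each global a pickle stream references.

    Hypothetical helper: the stream is only parsed, never executed.
    """
    found = set()
    recent_strings = []  # string operands seen so far, most recent last
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the operand is "module name" as one string.
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Protocol 4+: module and name were pushed as two strings.
            # Heuristic: assume they are the two most recent string operands.
            found.add(f"{recent_strings[-2]}.{recent_strings[-1]}")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


# Small in-memory example instead of a real checkpoint file.
payload = pickle.dumps(OrderedDict(a=1), protocol=4)
print(sorted(list_pickle_imports(payload)))
```

Parsing opcodes rather than calling pickle.load matters here because unpickling can run arbitrary code, which is why files such as rng_state_*.pth and training_args.bin are flagged with their imports in the first place.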