trl-lib (TRL)

posted an update about 18 hours ago

Post

123

The video of the other day's talk at @nerdearla about open AI is now available, in case you want to watch it! 🤠

https://www.youtube.com/watch?v=p-JLn4xAkMw

1 reply

sergiopaniego

posted an update 1 day ago

Post

1660

we've just added several example scripts to TRL showing how to train models with GRPO using some of the new OpenEnv environments

train a model to interact with a browser (🎮 BrowserGym Env), play Wordle (🎮 Wordle Env) and moooore!

TRL (GRPO + vLLM) + OpenEnv! ⚡️

📝 go play with them: https://github.com/huggingface/trl/tree/main/examples/scripts/openenv

📝 examples list: https://huggingface.co/docs/trl/main/en/example_overview#scripts

sergiopaniego

posted an update 4 days ago

Post

1624

Who wants a TRL sticker? 🙋

https://github.com/huggingface/trl

qgallouedec

updated a dataset 7 days ago

trl-lib/DeepMath-103K

Viewer • Updated 7 days ago • 103k • 199

qgallouedec

updated a collection 7 days ago

Prompt-only datasets

Collection

2 items • Updated 7 days ago

qgallouedec

published a dataset 7 days ago

trl-lib/DeepMath-103K

Viewer • Updated 7 days ago • 103k • 199

sergiopaniego

posted an update 17 days ago

Post

5308

fine-tuning a 14B model with TRL + SFT on a free Colab (T4 GPU)?
thanks to the latest TRL optimizations, you actually can!
sharing a new notebook showing how to do it 😎

colab: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_trl_lora_qlora.ipynb

notebooks in TRL: https://github.com/huggingface/trl/tree/main/examples/notebooks

2 replies

sergiopaniego

updated a dataset 17 days ago

trl-lib/documentation-images

Viewer • Updated 17 days ago • 9 • 80k

sergiopaniego

posted an update 18 days ago

Post

412

Gave a smol 🤏 intro to Agents using smolagents last Monday!
Sharing the slides in case you're curious. They serve as a gentle first step into the Agents Course we developed at @huggingface 🫶🫶

Course: https://huggingface.co/learn/agents-course/unit0/introduction

Workshop material: https://github.com/sergiopaniego/talks/tree/main/intro_to_agents

sergiopaniego

posted an update 21 days ago

Post

3099

Sharing the slides from yesterday's talk about "Fine Tuning with TRL" from the @TogetherAgent x @huggingface workshop we hosted in our Paris office 🎃!

Link: https://github.com/sergiopaniego/talks/blob/main/fine_tuning_with_trl/Fine%20tuning%20with%20TRL%20(Oct%2025).pdf

sergiopaniego

posted an update 22 days ago

Post

405

On-policy distillation is trendy, and super useful!

HuggingFaceH4/on-policy-distillation

sergiopaniego

in trl-lib/documentation-images 24 days ago

Upload text_arena_evals.png

#2 opened 24 days ago by burtenshaw

qgallouedec

updated a dataset 26 days ago

trl-lib/trackio-dataset

Updated 1 minute ago • 19.2k

qgallouedec

published a dataset 28 days ago

trl-lib/trackio-dataset

Updated 1 minute ago • 19.2k

sergiopaniego

posted an update 29 days ago

Post

2843

Meet OpenEnv 👋, an open ecosystem of environments for intelligent agents. Build, share, and test agents safely and consistently.

Ideal for training with TRL (we include examples🤓), deployment, and community collaboration via the HF Hub

Blog: https://huggingface.co/blog/openenv
Hub for Environments: openenv
OpenEnv repo: https://github.com/meta-pytorch/OpenEnv
Try it out using TRL: https://huggingface.co/docs/trl/main/en/openenv

1 reply

qgallouedec

updated a Space 29 days ago

Trackio

🚀

4

Display tracking information

sergiopaniego

posted an update about 1 month ago

Post

1953

New drop! 💥 The VLM Object Understanding Comparison Space now runs with Qwen3-VL-4B and moondream3.

You can compare how models reason about images 🧠

Bonus: thanks to @ariG23498 , you now get auto-suggested prompts to explore faster.

Let’s gooo

sergiopaniego/vlm_object_understanding

sergiopaniego

posted an update about 1 month ago

Post

2318

@Qwen released their new small and dense VLMs (Qwen3-VL).

They're incredibly capable and one of my all-time favourite VLMs.

🤗 We’ve prepared some resources to help you get started.

> Fine-tune Qwen3-VL-4B with SFT or GRPO (free Colab notebooks):
> SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_qwen_vl.ipynb
> GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_qwen3_vl.ipynb

> Compare object detection vs. Moondream3:
sergiopaniego/vlm_object_understanding

> Fine-tune from the CLI using TRL:
https://github.com/kashif/Qwen3-VL/blob/trl-sft/qwen-vl-finetune/README.md#trl-based-training-single-gpu

lvwerra

authored a paper about 1 month ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9 • 35

TRL

AI & ML interests

Team members 10
