AI & ML interests

A one-year long research workshop on large language models: the Summer of Language Models 21 🌸

Recent Activity

meg 
posted an update 12 days ago
view post
Post
3558
🤖 Did you know your voice might be cloned without your consent from just *one sentence* of audio?
That's not great. So with @frimelle , we brainstormed a new idea for developers who want to curb malicious use: ✨The Voice Consent Gate.✨
Details, code, here: https://huggingface.co/blog/voice-consent-gate
  • 3 replies
·
monsoon-nlp 
posted an update about 1 month ago
view post
Post
439
Bio LLMs train on many genomes, but can we encode differences within a species? TomatoTomato adds pangenome tokens to represent a domestic tomato and a wild tomato in one sequence 🍅 🧬
monsoon-nlp/tomatotomato-gLM2-150M-v0.1
meg 
posted an update about 2 months ago
view post
Post
2872
🤖 As AI-generated content is shared in movies/TV/across the web, there's one simple low-hanging fruit 🍇 to help know what's real: Visible watermarks. With the Gradio team, I've made sure it's trivially easy to add this disclosure to images, video, chatbot text. See how: https://huggingface.co/blog/watermarking-with-gradio
Thanks to the code collab in particular from @abidlabs and Yuvraj Sharma.
albertvillanova 
posted an update 3 months ago
view post
Post
3853
Latest smolagents release supports GPT-5: build agents that think, plan, and act.
⚡ Upgrade now and put GPT-5 to work!
meg 
posted an update 3 months ago
albertvillanova 
posted an update 3 months ago
view post
Post
587
🚀 smolagents v1.21.0 is here!
Now with improved safety in the local Python executor: dunder calls are blocked!
⚠️ Still, not fully isolated: for untrusted code, use a remote executor instead: Docker, E2B, Wasm.
✨ Many bug fixes: more reliable code.
👉 https://github.com/huggingface/smolagents/releases/tag/v1.21.0
meg 
posted an update 3 months ago
view post
Post
438
🤖 ICYMI: Yesterday, Hugging Face and OpenAI partnered to bring open source GPT to the public. This is a Big Deal in "AI world".

0. Common ground setting: OpenAI is the ChatGPT people. An “open source” model is one whose weights are available — that means the model can be “yours”.
1. You don’t have to interact with the company directly, nor give them your interactions, to use the system. The company can't "surveil" you.
2. You can evaluate the unique contributions of their SOTA model much more rigorously than you can when there are collections of models+code behind a closed API. You can find out specifically what the model can and can't do.
3. And you can directly customize it for whatever you'd like. Fine-tuning, wherein you give the model data that's tailored to your use cases and train it some more on that data, is trivial* when you have the model weights.
*Provided you have the compute.
4. You can directly benchmark whatever you'd like. Biases? Energy usage? Strengths/weaknesses? Go for it. You wants it you gots it--this transparency helps people understand SOTA *in general*, not just for this model, but points to, e.g., what's going on with closed Google models as well.
5. One of the most powerful things about "openness" that I've learned is that it cultivates ecosystems of collaborators building on top of one another's brilliance to make systems that are significantly better than they would be if created in isolation.
But, caveat wrt my own philosophy...
6. I do not take it as a given that advancing LLMs is good, and have a lot more to say wrt where I think innovation should focus more. For example, a focus on *data* -- curation, measurement, consent, credit, compensation, safety -- would deeply improve technology for everyone.
7. The transparency this release provides is massive for people who want to *learn* about LLMs. For the next generation of technologists to advance over the current, they MUST be able to learn about what's happening now. (cont...)
  • 1 reply
·
meg 
posted an update 3 months ago
view post
Post
493
🤖 👾 Thanks so much to BBC News and the stellar Suranjana Tewari for having me on to talk about US <—> China relationship in AI, and what it means for AI ethics.
albertvillanova 
posted an update 4 months ago
view post
Post
724
🚀 New in smolagents v1.20.0: Remote Python Execution via WebAssembly (Wasm)

We've just merged a major new capability into the smolagents framework: the CodeAgent can now execute Python code remotely in a secure, sandboxed WebAssembly environment!

🔧 Powered by Pyodide and Deno, this new WasmExecutor lets your agent-generated Python code run safely: without relying on Docker or local execution.

Why this matters:
✅ Isolated execution = no host access
✅ No need for Python on the user's machine
✅ Safer evaluation of arbitrary code
✅ Compatible with serverless / edge agent workloads
✅ Ideal for constrained or untrusted environments

This is just the beginning: a focused initial implementation with known limitations. A solid MVP designed for secure, sandboxed use cases. 💡

💡 We're inviting the open-source community to help evolve this executor:
• Tackle more advanced Python features
• Expand compatibility
• Add test coverage
• Shape the next-gen secure agent runtime

🔗 Check out the PR: https://github.com/huggingface/smolagents/pull/1261

Let's reimagine what agent-driven Python execution can look like: remote-first, wasm-secure, and community-built.

This feature is live in smolagents v1.20.0!
Try it out.
Break things. Extend it. Give us feedback.
Let's build safer, smarter agents; together 🧠⚙️

👉 https://github.com/huggingface/smolagents/releases/tag/v1.20.0

#smolagents #WebAssembly #Python #AIagents #Pyodide #Deno #OpenSource #HuggingFace #AgenticAI