somosnlp (SomosNLP)

pcuenq

posted an update 3 days ago

👉 What happened in AI in 2025? 👈

We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!

Play with it here:
2025-ai-timeline/2025-ai-timeline

Here's my personal quarterly TL;DR:

1️⃣ Q1 — Learning to Reason
DeepSeek not only releases a top-notch reasoning model, but also shows how to train such models and compete with closed frontier models. OpenAI debuts Deep Research.

Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)

2️⃣ Q2 — Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.

Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4

3️⃣ Q3 — "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models get gold in math olympiads and hard benchmarks. OpenAI releases strong open source models and Google releases the much-anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.

Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5

4️⃣ Q4 — Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!

Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1 🤯

Credits
🙏 NHLOCAL for the source data https://github.com/NHLOCAL/AiTimeline

🫡 @reach-vb for the original idea, design and recipe

🙌 @ariG23498 and yours truly for compiling and verifying the 2025 edition

🥳 Here's to 2026, wishing it becomes the best year ever for open releases and on-device-first use-cases! 🥂

mariagrandury

authored 4 papers 4 months ago

Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings

Paper • 2509.14405 • Published Sep 17, 2025 • 2

Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans

Paper • 2506.22439 • Published May 29, 2025 • 3

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Paper • 2509.14233 • Published Sep 17, 2025 • 14

La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America

Paper • 2507.00999 • Published Jul 1, 2025 • 1

mariagrandury

updated a dataset 4 months ago

somosnlp/recursos-pln-es

Viewer • Updated Sep 18, 2025 • 183 • 72 • 1

mariagrandury

published a dataset 4 months ago

somosnlp/recursos-pln-es

Viewer • Updated Sep 18, 2025 • 183 • 72 • 1

mariagrandury

updated a dataset 4 months ago

somosnlp/recursos-pln-es-models

Viewer • Updated Sep 16, 2025 • 22 • 26

mariagrandury

published a dataset 4 months ago

somosnlp/recursos-pln-es-models

Viewer • Updated Sep 16, 2025 • 22 • 26

mariagrandury

updated a Space 5 months ago

Leaderboard Retos Hackathon SomosNLP 2025

🏆

1

mariagrandury

published a dataset 7 months ago

somosnlp/babylm-es

Updated Jun 19, 2025 • 15

mariagrandury

authored 2 papers 8 months ago

Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation

Paper • 2504.07072 • Published Apr 9, 2025 • 9

It's the same but not the same: Do LLMs distinguish Spanish varieties?

Paper • 2504.20049 • Published Apr 8, 2025

reddrex

in somosnlp/LingComp_QA 8 months ago

How to use the dataset to train my GPT model

2

#1 opened 8 months ago by luisaarias

ouhenio

updated a Space 9 months ago

Mapa Blend-es

🌍

1

Check out the collective progress of blend-es 😊

pcuenq

authored a paper 9 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 202

gabrielmbmb

authored a paper 11 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 253

tadeodonegana

posted an update 11 months ago

At RooMix(dot)ai we’re looking for an expert in generative image models for a short consulting gig. Any recommendations?

Taylor658

posted an update about 1 year ago

🌐 The Stanford Institute for Human-Centered AI (https://aiindex.stanford.edu/vibrancy/) has released its 2024 Global AI Vibrancy Tool, a way to explore and compare AI progress across 36 countries.

📊 It measures progress across the 8 broad pillars of R&D, Responsible AI, Economy, Education, Diversity, Policy and Governance, Public Opinion and Infrastructure. (Each of these pillars has a number of sub-indices.)

📈 Overall, it is not surprising that the USA was at the top in terms of overall score as of 2023 (AI investment activity, for example, is a large part of the economic pillar, which in turn is a large part of the overall USA ranking). But drilling into more STRATEGIC macro pillars like Education, Infrastructure or R&D reveals interesting growth patterns in Asia (particularly China) and Western Europe that I suspect the 2024 metrics will bear out.

🤖 Hopefully the 2024 Global Vibrancy ranking will break out AI and ML verticals like Computer Vision or NLP and/or the AI Agent space, as that may also give indications, at a global macro level, of what is to come for AI in 2025.

Taylor658

posted an update about 1 year ago

🤖💻 Function Calling is a key component of Agent workflows. To call functions, an LLM needs a way to interact with other systems and run code. This usually means connecting it to a runtime environment that can handle function calls, data, and security.
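To make the runtime side of this concrete, here is a minimal sketch in Python of how a host program might route a model's function call to real code. Everything in it is an illustrative assumption: the `get_weather` tool, the `TOOLS` registry, the `dispatch` helper, and the JSON shape of the model output are hypothetical and not tied to any particular model or leaderboard entry.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; returns a canned string instead of calling a real API."""
    return f"Sunny in {city}"


# Registry of functions the runtime is willing to execute (an allow-list).
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call_json: str) -> str:
    """Parse a JSON function call emitted by a model and run the matching tool."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        # Refuse anything outside the allow-list rather than executing it.
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Assumed model output format: a JSON object with "name" and "arguments".
model_output = '{"name": "get_weather", "arguments": {"city": "Madrid"}}'
print(dispatch(model_output))
```

The allow-list is the part that speaks to the security concern: the model only ever names a function, and the runtime decides whether that name maps to anything executable.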

Per the Berkeley Function-Calling Leaderboard, as of 17 Nov 2024 only 2 of the top 20 models with built-in function calling are fully open source. (The other 2 in the top 20 that are not closed source have cc-by-nc-4.0 licenses.)
https://gorilla.cs.berkeley.edu/leaderboard.html

The 2 Open Source Models out of the top 20 that currently support function calling are:

meetkai/functionary-medium-v3.1
Team-ACE/ToolACE-8B

This is both a huge disadvantage AND an opportunity for the Open Source community as enterprises, small businesses, government agencies, etc. quickly adopt Agents and Agent workflows over the next few months. Open Source will have a lot of catching up to do, as enterprises that initially build their Agent workflows on closed source models will be hesitant to switch to an open source alternative later.

Hopefully more open source models will support function calling in the near future.

AI & ML interests

Team members 311
