We're thrilled to announce that the Qwen3-VL family of vision-language models is now available on Azure AI Foundry, thanks to our collaboration with Microsoft.
We're bringing open-source innovation to enterprise-grade AI infrastructure, making it easier than ever for enterprises to deploy and scale the latest and greatest models from Hugging Face securely within Azure.
✨ Highlights:
- Deploy Qwen3-VL instantly via managed endpoints
- Built-in governance, telemetry, and lifecycle management
- True multimodal reasoning: vision, language, and code understanding
- State-of-the-art performance, outperforming closed-source models like Gemini 2.5 Pro and GPT-5
- Available in both *Instruct* and *Thinking* modes, across 24 model sizes
👉 Get started today: search for Qwen3-VL in the Hugging Face Collection on Azure AI Foundry.
📝 New blog: Maintain the unmaintainable – 1M+ Python LOC, 400+ models
How do you stop a million-line library built by thousands of contributors from collapsing under its own weight? At 🤗 Transformers, we do it with explicit software-engineering tenets: principles that keep the codebase hackable at scale.
🔍 Inside the post:
✅ One Model, One File: readability first. You can still open a modeling file and read the full logic, top to bottom.
✅ Modular Transformers: visible inheritance that cuts maintenance cost by ~15× while keeping models readable.
✅ Config-Driven Performance: FlashAttention, tensor parallelism, and attention scheduling are config-level features, not rewrites.
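To illustrate the config-driven idea, here is a minimal, hypothetical sketch (not the actual Transformers code): the attention backend is chosen by a config field, so swapping in a faster kernel is a one-line config change rather than a model rewrite. The `AttentionConfig` class, the registry, and the backend names here are invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical registry of attention backends, keyed by name.
# In a real library these would wrap different kernels
# (eager, SDPA, FlashAttention, ...); here they are stubs.
ATTENTION_BACKENDS = {}

def register_backend(name):
    def decorator(fn):
        ATTENTION_BACKENDS[name] = fn
        return fn
    return decorator

@register_backend("eager")
def eager_attention(q, k, v):
    # Naive reference implementation (stub for illustration).
    return "eager_result"

@register_backend("flash")
def flash_attention(q, k, v):
    # Stand-in for an optimized fused kernel.
    return "flash_result"

@dataclass
class AttentionConfig:
    # The backend lives in the *config*: changing it never
    # requires touching the model code below.
    attn_implementation: str = "eager"

class TinyModel:
    def __init__(self, config: AttentionConfig):
        # Resolve the backend once, from config alone.
        self.attn = ATTENTION_BACKENDS[config.attn_implementation]

    def forward(self, q, k, v):
        return self.attn(q, k, v)

# Switching kernels is a config change, not a rewrite:
model = TinyModel(AttentionConfig(attn_implementation="flash"))
print(model.forward(None, None, None))  # -> flash_result
```

The model class stays readable and kernel-agnostic; performance features plug in behind a stable config surface.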
Written with @lysandre, @pcuenq, and @yonigozlan, this is a deep dive into how Transformers stays fast, open, and maintainable.