D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published Oct 7 • 137
Tri Series Collection Introducing our new series of models: Tri-7B, Tri-21B, and Tri-70B-preview-SFT • 10 items • Updated Sep 10 • 8
NemoGuard Collection Essential datasets and models for content safety, topic-following, and security guardrails • 11 items • Updated 4 days ago • 11
HyperCLOVA X SEED Collection HyperCLOVA X SEED is NAVER's lightweight open-source lineup with a strong focus on Korean language performance • 4 items • Updated Jul 22 • 28
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • Mar 12 • 468
Article Exploring Hard Negative Mining with NV-Retriever in Korean Financial Text By Albertmade and 1 other • Jan 12 • 15
InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation Paper • 2404.19427 • Published Apr 30, 2024 • 74
FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework Paper • 2408.06190 • Published Aug 12, 2024 • 18
InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation Paper • 2405.15758 • Published May 24, 2024 • 1
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models Paper • 2405.16537 • Published May 26, 2024 • 17