DeepGlint-AI/rice-vit-large-patch14-560 Image Feature Extraction • 0.3B • Updated Jul 29 • 190 • 10
lmms-lab/LLaVA-OneVision-1.5-8B-Instruct Image-Text-to-Text • 9B • Updated Oct 21 • 55.8k • 59
LLaVA-OneVision-1.5 Collection https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-1.5 • 9 items • Updated Oct 21 • 18
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers Paper • 2508.21148 • Published Aug 28 • 140
Identity-Preserving Text-to-Video Generation by Frequency Decomposition Paper • 2411.17440 • Published Nov 26, 2024 • 37
VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation Paper • 2412.02259 • Published Dec 3, 2024 • 60
SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation Paper • 2411.14525 • Published Nov 21, 2024 • 19
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline Paper • 2411.12814 • Published Nov 19, 2024 • 23
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI Paper • 2411.14522 • Published Nov 21, 2024 • 37