GigaBrain-0: A World Model-Powered Vision-Language-Action Model Paper • 2510.19430 • Published 27 days ago • 44
VIMI: Grounding Video Generation through Multi-modal Instruction Paper • 2407.06304 • Published Jul 8, 2024 • 10
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions Paper • 2407.06723 • Published Jul 9, 2024 • 11
MotionLLM: Understanding Human Behaviors from Human Motions and Videos Paper • 2405.20340 • Published May 30, 2024 • 20
aaraki/vit-base-patch16-224-in21k-finetuned-cifar10 Image Classification • Updated Mar 30, 2022 • 4.92k • 11
nateraw/vit-base-patch16-224-cifar10 Image Classification • Updated Jan 28, 2022 • 3.28k • 10