Orr Zohar PRO

orrzohar

https://orrzohar.github.io

AI & ML interests

Large Multi-Modal Models, Foundation Models, Video Understanding

Recent Activity

updated a model about 13 hours ago

orrzohar/BLIP3o-4B-Diffusion-Decoder

published a model about 13 hours ago

orrzohar/BLIP3o-4B-Diffusion-Decoder

updated a model about 13 hours ago

orrzohar/BLIP3o-4B

Organizations

upvoted 2 papers 17 days ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 18 days ago • 105

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published 18 days ago • 113

upvoted 2 papers 27 days ago

FineVision: Open Data Is All You Need

Paper • 2510.17269 • Published 28 days ago • 65

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published about 1 month ago • 87

upvoted a paper 29 days ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14 • 107

upvoted a paper about 1 month ago

SciVideoBench: Benchmarking Scientific Video Reasoning in Large Multimodal Models

Paper • 2510.08559 • Published Oct 9 • 8

upvoted a paper 3 months ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8 • 190

upvoted an article 4 months ago

TimeScope: How Long Can Your Video Large Multimodal Model Go?

Article • Published Jul 23

upvoted 3 papers 6 months ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 141

UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning

Paper • 2505.14231 • Published May 20 • 52

Scaling Law for Quantization-Aware Training

Paper • 2505.14302 • Published May 20 • 76

upvoted 9 papers 7 months ago

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Paper • 2504.21233 • Published Apr 30 • 49

ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering

Paper • 2504.05506 • Published Apr 7 • 25

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

Paper • 2504.12626 • Published Apr 17 • 51

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17 • 94

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Paper • 2504.13122 • Published Apr 17 • 20
