Min Woo Sun

minwoosun

https://cs.stanford.edu/~minwoos

AI & ML interests

Machine Learning for Health

Recent Activity

authored a paper about 1 month ago

No Tokens Wasted: Leveraging Long Context in Biomedical Vision-Language Models

upvoted a paper about 1 month ago

No Tokens Wasted: Leveraging Long Context in Biomedical Vision-Language Models

commented on a paper about 1 month ago

No Tokens Wasted: Leveraging Long Context in Biomedical Vision-Language Models

upvoted 3 papers about 1 month ago

No Tokens Wasted: Leveraging Long Context in Biomedical Vision-Language Models

Paper • 2510.03978 • Published Oct 4 • 2

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 147

ModernVBERT: Towards Smaller Visual Document Retrievers

Paper • 2510.01149 • Published Oct 1 • 30

upvoted an article about 1 month ago

Finally, a Replacement for BERT: Introducing ModernBERT

Article • Dec 19, 2024 • 705

upvoted an article 3 months ago

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

Article • Aug 11 • 75

upvoted a paper 6 months ago

MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports

Paper • 2505.11733 • Published May 16 • 7

upvoted a paper 7 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 200

upvoted a paper 8 months ago

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

Paper • 2503.13399 • Published Mar 17 • 22

upvoted a collection 8 months ago

SmolVLM2 📺 Smallest video LM ever 🤏🏻

11 items • Updated May 5 • 101

upvoted a paper 8 months ago

Video Action Differencing

Paper • 2503.07860 • Published Mar 10 • 33

upvoted a collection 9 months ago

MultiMedQA

MultiMedQA Benchmark datasets • 9 items • Updated Apr 2, 2024 • 13

upvoted an article 9 months ago

SmolVLM - small yet mighty Vision Language Model

Article • Nov 26, 2024 • 381

upvoted a collection 9 months ago

SmolVLM

State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct. Check our blog: https://huggingface.co/blog/smolvlm • 5 items • Updated May 5 • 40

upvoted a paper 10 months ago

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Paper • 2501.07171 • Published Jan 13 • 55

upvoted a paper about 1 year ago

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Paper • 2410.12628 • Published Oct 16, 2024 • 41

upvoted a collection over 1 year ago

HyenaDNA Models

HyenaDNA models usable directly with Hugging Face classes like AutoModel. • 8 items • Updated Nov 14, 2023 • 19

upvoted a paper over 1 year ago

EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records

Paper • 2406.16341 • Published Jun 24, 2024 • 14