Qifan Zhang

firefighter

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper about 6 hours ago

Virtual Width Networks

Paper • 2511.11238 • Published 3 days ago • 19

upvoted a paper 5 days ago

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 261

upvoted a paper 11 days ago

Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning

Paper • 2510.27623 • Published 17 days ago • 12

upvoted a paper 12 days ago

INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

Paper • 2510.25602 • Published 19 days ago • 69

upvoted a collection 9 months ago

FP8 LLMs for vLLM

Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 44 items • Updated Oct 17, 2024 • 76

upvoted an article 10 months ago

Finally, a Replacement for BERT: Introducing ModernBERT

Article • Dec 19, 2024 • 706

upvoted a collection about 1 year ago

Emu3

Emu3: Next-Token Prediction is All You Need • 7 items • Updated Feb 13 • 78

upvoted 2 articles over 1 year ago

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

Article • Jul 23, 2024 • 238

Training and Finetuning Embedding Models with Sentence Transformers v3

Article • May 28, 2024 • 259

upvoted a paper over 1 year ago

Automatic Assessment of Divergent Thinking in Chinese Language with TransDis: A Transformer-Based Language Model Approach

Paper • 2306.14790 • Published Jun 26, 2023 • 2

upvoted an article over 1 year ago

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Article • Jun 24, 2024 • 202

upvoted 2 collections over 1 year ago

Qwen

Qwen • 16 items • Updated Jul 21 • 21

Qwen2

Qwen2 language models, pretrained and instruction-tuned, in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Jul 21 • 372

upvoted a paper over 1 year ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6, 2024 • 189

upvoted 2 collections over 1 year ago

Table Transformer

The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images. • 5 items • Updated May 1 • 26

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Jul 21 • 210