xb-chang's collections (10)

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Paper • 2407.08296 • Published Jul 11, 2024 • 33
Inference Performance Optimization for Large Language Models on CPUs

Paper • 2407.07304 • Published Jul 10, 2024 • 53

Self-Recognition in Language Models

Paper • 2407.06946 • Published Jul 9, 2024 • 25
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models

Paper • 2407.01906 • Published Jul 2, 2024 • 44

Controlling Space and Time with Diffusion Models

Paper • 2407.07860 • Published Jul 10, 2024 • 17
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

Paper • 2407.03300 • Published Jul 3, 2024 • 14
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

Paper • 2407.01392 • Published Jul 1, 2024 • 45
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models

Paper • 2407.02687 • Published Jul 2, 2024 • 25

Video-to-Audio Generation with Hidden Alignment

Paper • 2407.07464 • Published Jul 10, 2024 • 17
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Paper • 2407.04842 • Published Jul 5, 2024 • 56
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

Paper • 2407.04051 • Published Jul 4, 2024 • 39

Associative Recurrent Memory Transformer

Paper • 2407.04841 • Published Jul 5, 2024 • 36
Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Paper • 2407.04620 • Published Jul 5, 2024 • 34

Reinforcement Learning

Gradient Boosting Reinforcement Learning

Paper • 2407.08250 • Published Jul 11, 2024 • 13

An accurate detection is not all you need to combat label noise in web-noisy datasets

Paper • 2407.05528 • Published Jul 8, 2024 • 4

vision language models (VLM)

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10, 2024 • 72
Vision language models are blind

Paper • 2407.06581 • Published Jul 9, 2024 • 84
CosmoCLIP: Generalizing Large Vision-Language Models for Astronomical Imaging

Paper • 2407.07315 • Published Jul 10, 2024 • 7
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

Paper • 2407.06189 • Published Jul 8, 2024 • 26

Data Generation

UltraEdit: Instruction-based Fine-Grained Image Editing at Scale

Paper • 2407.05282 • Published Jul 7, 2024 • 16
Training Task Experts through Retrieval Based Distillation

Paper • 2407.05463 • Published Jul 7, 2024 • 10
Learning Action and Reasoning-Centric Image Editing from Videos and Simulations

Paper • 2407.03471 • Published Jul 3, 2024 • 31

Associative Recurrent Memory Transformer

Paper • 2407.04841 • Published Jul 5, 2024 • 36
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams

Paper • 2406.08085 • Published Jun 12, 2024 • 17
