265

Slava

wertlon

slava-qw

AI & ML interests

CV, GenAI

Recent Activity

upvoted a paper 6 days ago

Don't Blind Your VLA: Aligning Visual Representations for OOD Generalization

upvoted a paper 21 days ago

WithAnyone: Towards Controllable and ID Consistent Image Generation

upvoted a paper 21 days ago

Point Prompting: Counterfactual Tracking with Video Diffusion Models

Organizations

None yet

upvoted a paper 6 days ago

Don't Blind Your VLA: Aligning Visual Representations for OOD Generalization

Paper • 2510.25616 • Published 15 days ago • 90

upvoted 9 papers 21 days ago

WithAnyone: Towards Controllable and ID Consistent Image Generation

Paper • 2510.14975 • Published 28 days ago • 80

Point Prompting: Counterfactual Tracking with Video Diffusion Models

Paper • 2510.11715 • Published Oct 13 • 2

The Role of Computing Resources in Publishing Foundation Model Research

Paper • 2510.13621 • Published 29 days ago • 15

CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving

Paper • 2510.07944 • Published Oct 9 • 24

PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning

Paper • 2510.13809 • Published 29 days ago • 36

FlashWorld: High-quality 3D Scene Generation within Seconds

Paper • 2510.13678 • Published 29 days ago • 70

Temporal Alignment Guidance: On-Manifold Sampling in Diffusion Models

Paper • 2510.11057 • Published Oct 13 • 30

FlashVSR: Towards Real-Time Diffusion-Based Streaming Video Super-Resolution

Paper • 2510.12747 • Published about 1 month ago • 36

Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training

Paper • 2510.12586 • Published about 1 month ago • 107

upvoted 10 papers about 1 month ago

LikePhys: Evaluating Intuitive Physics Understanding in Video Diffusion Models via Likelihood Preference

Paper • 2510.11512 • Published Oct 13 • 6

SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

Paper • 2510.09541 • Published Oct 10 • 14

AdaViewPlanner: Adapting Video Diffusion Models for Viewpoint Planning in 4D Scenes

Paper • 2510.10670 • Published Oct 12 • 18

DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training

Paper • 2510.11712 • Published Oct 13 • 30

Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13 • 161

Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation

Paper • 2510.08994 • Published Oct 10 • 3

Instant4D: 4D Gaussian Splatting in Minutes

Paper • 2510.01119 • Published Oct 1 • 6

Which Heads Matter for Reasoning? RL-Guided KV Cache Compression

Paper • 2510.08525 • Published Oct 9 • 22

TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling

Paper • 2510.04533 • Published Oct 6 • 47

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Paper • 2510.08673 • Published Oct 9 • 121