arXiv:2512.07806

Multi-view Pyramid Transformer: Look Coarser to See Broader

Published on Dec 8, 2025 · Submitted by Gyeongjin Kang on Dec 9, 2025
Abstract

AI-generated summary: MVP, a scalable multi-view transformer architecture, efficiently reconstructs large 3D scenes from multiple images using dual hierarchies and achieves state-of-the-art quality.

We propose Multi-view Pyramid Transformer (MVP), a scalable multi-view transformer architecture that directly reconstructs large 3D scenes from tens to hundreds of images in a single forward pass. Drawing on the idea of "looking broader to see the whole, looking finer to see the details," MVP is built on two core design principles: 1) a local-to-global inter-view hierarchy that gradually broadens the model's perspective from local views to groups and ultimately the full scene, and 2) a fine-to-coarse intra-view hierarchy that starts from detailed spatial representations and progressively aggregates them into compact, information-dense tokens. This dual hierarchy achieves both computational efficiency and representational richness, enabling fast reconstruction of large and complex scenes. We validate MVP on diverse datasets and show that, when coupled with 3D Gaussian Splatting as the underlying 3D representation, it achieves state-of-the-art generalizable reconstruction quality while maintaining high efficiency and scalability across a wide range of view configurations.
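
The dual hierarchy lends itself to a compact illustration. Below is a minimal PyTorch sketch, not the authors' implementation: the group sizes, the average-pooling scheme, and the module names (`StageBlock`, `MVPSketch`) are illustrative assumptions. Each stage lets views attend within progressively larger groups (local-to-global inter-view hierarchy), while each view's feature map is pooled 2x so later, broader stages operate on fewer, coarser tokens (fine-to-coarse intra-view hierarchy).

```python
# Minimal sketch (not the paper's code) of the dual-hierarchy idea from the abstract:
# attention is restricted to small groups of views at first, the groups grow stage by
# stage until they cover the whole scene, and per-view tokens are spatially pooled so
# the broader stages see coarser, information-dense tokens.
import torch
import torch.nn as nn


class StageBlock(nn.Module):
    """One pyramid stage: joint attention within a group of views, then 2x spatial pooling."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.pool = nn.AvgPool2d(kernel_size=2)  # fine-to-coarse intra-view reduction

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_groups, views_per_group, H, W, dim)
        g, v, h, w, d = x.shape
        tokens = x.reshape(g, v * h * w, d)  # all tokens in a group attend jointly
        q = self.norm(tokens)
        tokens = tokens + self.attn(q, q, q, need_weights=False)[0]
        x = tokens.reshape(g, v, h, w, d)
        # pool each view's feature map 2x so the next (broader) stage sees coarser tokens
        x = self.pool(x.permute(0, 1, 4, 2, 3).reshape(g * v, d, h, w))
        return x.reshape(g, v, d, h // 2, w // 2).permute(0, 1, 3, 4, 2)


class MVPSketch(nn.Module):
    """Local-to-global over views, fine-to-coarse over space (toy illustration)."""

    def __init__(self, dim: int = 64, group_sizes=(2, 4, 8)):
        super().__init__()
        self.group_sizes = group_sizes  # views per attention group, stage by stage
        self.stages = nn.ModuleList(StageBlock(dim) for _ in group_sizes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_views, H, W, dim) patch features for one scene
        for gs, stage in zip(self.group_sizes, self.stages):
            v, h, w, d = x.shape
            groups = x.reshape(v // gs, gs, h, w, d)  # partition views into groups
            x = stage(groups).reshape(v, h // 2, w // 2, d)
        return x  # compact, information-dense tokens for the whole scene


if __name__ == "__main__":
    feats = torch.randn(8, 32, 32, 64)  # 8 views of 32x32 patch tokens
    out = MVPSketch()(feats)
    print(out.shape)  # torch.Size([8, 4, 4, 64])
```

In this toy setup, 8 views of 32x32 tokens shrink to 4x4 tokens per view over three stages, so the final scene-wide attention covers roughly 512 tokens rather than the 8,192 a full-resolution all-to-all pass would need; that trade is the intuition behind the efficiency claim above.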

Community

Gyeongjin Kang (paper author and submitter):

We are excited to share our recent work "Multi-view Pyramid Transformer: Look Coarser to See Broader".

Paper: https://arxiv.org/abs/2512.07806
Project page: https://gynjn.github.io/MVP/
Code: https://github.com/Gynjn/MVP

