arxiv:2511.20253

Zoo3D: Zero-Shot 3D Object Detection at Scene Level

Published on Nov 25, 2025

AI-generated summary

Zoo3D enables training-free 3D object detection through graph clustering of 2D masks and open-vocabulary semantic labeling, achieving state-of-the-art performance in open-vocabulary 3D detection across multiple benchmarks.

Abstract

3D object detection is fundamental for spatial understanding. Real-world environments demand models capable of recognizing diverse, previously unseen objects, which remains a major limitation of closed-set methods. Existing open-vocabulary 3D detectors relax annotation requirements but still depend on training scenes, either as point clouds or images. We take this a step further by introducing Zoo3D, the first training-free 3D object detection framework. Our method constructs 3D bounding boxes via graph clustering of 2D instance masks, then assigns semantic labels using a novel open-vocabulary module with best-view selection and view-consensus mask generation. Zoo3D operates in two modes: the zero-shot Zoo3D_0, which requires no training at all, and the self-supervised Zoo3D_1, which refines 3D box prediction by training a class-agnostic detector on Zoo3D_0-generated pseudo labels. Furthermore, we extend Zoo3D beyond point clouds to work directly with posed and even unposed images. Across the ScanNet200 and ARKitScenes benchmarks, both Zoo3D_0 and Zoo3D_1 achieve state-of-the-art results in open-vocabulary 3D object detection. Remarkably, our zero-shot Zoo3D_0 outperforms all existing self-supervised methods, demonstrating the power and adaptability of training-free, off-the-shelf approaches for real-world 3D understanding. Code is available at https://github.com/col14m/zoo3d.
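The box-construction step the abstract describes can be pictured as follows: per-view 2D instance masks are lifted to 3D point sets (e.g. via depth and camera pose), masks that overlap in 3D are linked as edges of a graph, and each connected component of that graph is merged into one object box. Below is a minimal, illustrative sketch of that idea; the function names, the voxel-based IoU, and the overlap threshold are assumptions for the example, not the paper's actual implementation.

```python
# Hypothetical sketch of graph clustering of lifted 2D masks into 3D boxes.
# A "mask" here is already a list of 3D points; the voxel size and IoU
# threshold are illustrative defaults, not values from the paper.

from itertools import combinations

def overlap(a, b, voxel=0.05):
    """Approximate 3D IoU of two point sets on a coarse voxel grid."""
    va = {tuple(int(c // voxel) for c in p) for p in a}
    vb = {tuple(int(c // voxel) for c in p) for p in b}
    union = len(va | vb)
    return len(va & vb) / union if union else 0.0

def cluster_masks(point_sets, iou_thr=0.25):
    """Union-find over the mask graph: an edge iff 3D IoU >= iou_thr.
    Returns connected components as lists of mask indices."""
    parent = list(range(len(point_sets)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i, j in combinations(range(len(point_sets)), 2):
        if overlap(point_sets[i], point_sets[j]) >= iou_thr:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(point_sets)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

def boxes_from_clusters(point_sets, clusters):
    """One axis-aligned (min_corner, max_corner) box per merged object."""
    boxes = []
    for idxs in clusters:
        pts = [p for i in idxs for p in point_sets[i]]
        mins = tuple(min(p[k] for p in pts) for k in range(3))
        maxs = tuple(max(p[k] for p in pts) for k in range(3))
        boxes.append((mins, maxs))
    return boxes

# Toy scene: two views of the same object, plus one distant object.
masks = [
    [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.1, 0.1, 0.0)],
    [(0.0, 0.0, 0.0), (0.1, 0.1, 0.0)],   # same object, another view
    [(5.0, 5.0, 0.0), (5.1, 5.0, 0.0)],   # a different object
]
clusters = cluster_masks(masks)
boxes = boxes_from_clusters(masks, clusters)  # two boxes: two objects
```

Note that, unlike this class-agnostic sketch, Zoo3D then labels each box with its open-vocabulary module (best-view selection and view-consensus masks), and Zoo3D_1 further refines the boxes by training a detector on such pseudo labels.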
