Abstract
Fast3Dcache accelerates 3D diffusion model inference with minimal geometric quality degradation, using a geometry-aware caching framework that combines dynamic cache quotas with spatiotemporal stability criteria.
Diffusion models have achieved impressive generative quality across modalities such as 2D images, videos, and 3D shapes, but their inference remains computationally expensive due to the iterative denoising process. While recent caching-based methods effectively reuse redundant computations to speed up 2D and video generation, directly applying these techniques to 3D diffusion models can severely disrupt geometric consistency. In 3D synthesis, even minor numerical errors in cached latent features accumulate, causing structural artifacts and topological inconsistencies. To overcome this limitation, we propose Fast3Dcache, a training-free, geometry-aware caching framework that accelerates 3D diffusion inference while preserving geometric fidelity. Our method introduces a Predictive Caching Scheduler Constraint (PCSC) to dynamically determine cache quotas according to voxel stabilization patterns, and a Spatiotemporal Stability Criterion (SSC) to select stable features for reuse based on velocity-magnitude and acceleration criteria. Comprehensive experiments show that Fast3Dcache accelerates inference significantly, achieving up to a 27.12% speed-up and a 54.8% reduction in FLOPs, with minimal degradation in geometric quality as measured by Chamfer Distance (2.48%) and F-Score (1.95%).
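To make the caching idea concrete, the following is a minimal PyTorch sketch of how a velocity- and acceleration-based stability test, combined with a per-step cache quota, could be wired together. This is not the authors' implementation: the function names, the channel-mean reduction, the thresholds, and the fixed integer quota are illustrative assumptions standing in for the paper's SSC and PCSC.

```python
import torch


def stable_feature_mask(f_prev2, f_prev1, f_curr, v_thresh=0.05, a_thresh=0.02):
    """Mark latent positions whose features have stabilized across denoising steps.

    A position is treated as stable (safe to reuse from cache) when both the
    velocity (first finite difference between consecutive steps) and the
    acceleration (second finite difference) of its features are small.
    The channel-mean reduction and the thresholds are illustrative assumptions.
    """
    velocity = (f_curr - f_prev1).abs().mean(dim=-1)                     # ~ |df/dt|
    acceleration = (f_curr - 2 * f_prev1 + f_prev2).abs().mean(dim=-1)   # ~ |d2f/dt2|
    return (velocity < v_thresh) & (acceleration < a_thresh)


def reuse_cached_features(f_curr, f_cached, stable_mask, quota):
    """Overwrite up to `quota` stable positions with their cached features.

    `quota` is a stand-in for the paper's dynamic cache budget (PCSC); here it
    is simply a fixed integer cap on how many positions may be reused per step.
    """
    idx = stable_mask.nonzero(as_tuple=False).squeeze(-1)[:quota]
    out = f_curr.clone()
    out[idx] = f_cached[idx]
    return out


if __name__ == "__main__":
    torch.manual_seed(0)
    n_positions, dim = 1024, 64

    # Fake latent features from three consecutive denoising steps,
    # mostly stabilized (small step-to-step changes).
    base = torch.randn(n_positions, dim)
    f_prev2 = base
    f_prev1 = base + 0.01 * torch.randn(n_positions, dim)
    f_curr = f_prev1 + 0.01 * torch.randn(n_positions, dim)

    mask = stable_feature_mask(f_prev2, f_prev1, f_curr)
    blended = reuse_cached_features(f_curr, f_prev1, mask, quota=256)
    print(f"stable positions: {int(mask.sum())} / {n_positions}")
```

In the actual method the cache budget would vary across denoising steps according to voxel stabilization patterns rather than being a fixed integer as in this sketch.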
Community
The following similar papers were recommended by the Semantic Scholar API (via Librarian Bot):
- Sphinx: Efficiently Serving Novel View Synthesis using Regression-Guided Selective Refinement (2025)
- Wukong's 72 Transformations: High-fidelity Textured 3D Morphing via Flow Models (2025)
- WorldGrow: Generating Infinite 3D World (2025)
- LoG3D: Ultra-High-Resolution 3D Shape Modeling via Local-to-Global Partitioning (2025)
- ReconViaGen: Towards Accurate Multi-view 3D Object Reconstruction via Generation (2025)
- PartDiffuser: Part-wise 3D Mesh Generation via Discrete Diffusion (2025)
- Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models (2025)