---
language:
- en
license: openrail++
pipeline_tag: image-to-image
library_name: diffusers
tags:
- intrinsic decomposition
- image analysis
- computer vision
- in-the-wild
- zero-shot
pinned: true
---

<h1 align="center">Marigold Intrinsic Image Decomposition (IID) Appearance v1-1 Model Card</h1>

<p align="center">
<a title="Image IID" href="https://huggingface.co/spaces/prs-eth/marigold-iid" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97%20Image%20IID%20-Demo-yellow" alt="Image IID">
</a>
<a title="diffusers" href="https://huggingface.co/docs/diffusers/using-diffusers/marigold_usage" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/%F0%9F%A4%97%20diffusers%20-Integration%20🧨-yellow" alt="diffusers">
</a>
<a title="Github" href="https://github.com/prs-eth/marigold" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/github/stars/prs-eth/marigold?label=GitHub%20%E2%98%85&logo=github&color=C8C" alt="Github">
</a>
<a title="Website" href="https://marigoldcomputervision.github.io/" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/%E2%99%A5%20Project%20-Website-blue" alt="Website">
</a>
<a title="arXiv" href="https://arxiv.org/abs/2505.09358" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/%F0%9F%93%84%20Read%20-Paper-AF3436" alt="arXiv">
</a>
<a title="Social" href="https://twitter.com/antonobukhov1" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/twitter/follow/:?label=Subscribe%20for%20updates!" alt="Social">
</a>
<a title="License" href="https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL" target="_blank" rel="noopener noreferrer" style="display: inline-block;">
    <img src="https://img.shields.io/badge/License-OpenRAIL++-929292" alt="License">
</a>
</p>

This is a model card for the `marigold-iid-appearance-v1-1` model for single-image Intrinsic Image Decomposition (IID). 
The model is fine-tuned from the `stable-diffusion-2` [model](https://huggingface.co/stabilityai/stable-diffusion-2) as 
described in our papers:
- [CVPR'2024 paper](https://hf.co/papers/2312.02145) titled "Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation"
- [Journal extension](https://hf.co/papers/2505.09358) titled "Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis"

### Using the model
This model type (`appearance`) is trained to perform InteriorVerse decomposition into **Albedo** and two **BRDF material** properties: **roughness** and **metallicity**.
Both the input image and the output albedo are in the sRGB color space.
For an alternative model type (`lighting`) that decomposes an image into albedo, diffuse shading, and a non-diffuse residual, see 
[marigold-iid-lighting-v1-1](https://huggingface.co/prs-eth/marigold-iid-lighting-v1-1).
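
Because the albedo is returned in sRGB while roughness and metallicity are linear, a downstream renderer that expects linear inputs may need the albedo linearized first. The helper below is a minimal sketch that assumes "sRGB" here means the standard piecewise sRGB transfer function; the function name `srgb_to_linear` is illustrative and not part of the Marigold codebase.

```python
def srgb_to_linear(value: float) -> float:
    """Convert one sRGB-encoded channel value in [0, 1] to linear light.

    Assumption: the albedo uses the standard piecewise sRGB transfer function.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("expected a channel value in [0, 1]")
    if value <= 0.04045:
        # Linear segment near black.
        return value / 12.92
    # Power-law segment for the rest of the range.
    return ((value + 0.055) / 1.055) ** 2.4


# Example: linearize a single albedo pixel given as an (R, G, B) tuple.
albedo_srgb = (0.5, 0.25, 1.0)
albedo_linear = tuple(srgb_to_linear(c) for c in albedo_srgb)
```

Roughness and metallicity need no such conversion, since the model card states they are already in linear space.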

- Play with the interactive [Hugging Face Spaces demo](https://huggingface.co/spaces/prs-eth/marigold-iid): check out how the model works with example images or upload your own.
- Use it with [diffusers](https://huggingface.co/docs/diffusers/using-diffusers/marigold_usage) to compute the results with a few lines of code.
- Get to the bottom of things with our [official codebase](https://github.com/prs-eth/marigold).
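
For the diffusers route, a call might look like the sketch below. It follows the pattern in the linked diffusers documentation, but treat the details as assumptions: the repository id `prs-eth/marigold-iid-appearance-v1-1` is inferred from the model name on this card, the wrapper `decompose_appearance` is a hypothetical helper, and argument names may differ between diffusers versions.

```python
# Assumed repository id, inferred from the model name on this card.
MODEL_ID = "prs-eth/marigold-iid-appearance-v1-1"


def decompose_appearance(image_path_or_url, num_inference_steps=None, ensemble_size=1):
    """Run the appearance model and return visualizations keyed by modality.

    Hypothetical wrapper; heavy imports are deferred so that importing this
    snippet does not download anything.
    """
    import diffusers
    import torch

    use_cuda = torch.cuda.is_available()
    pipe = diffusers.MarigoldIntrinsicsPipeline.from_pretrained(
        MODEL_ID,
        # Half precision only on GPU; fall back to float32 on CPU.
        torch_dtype=torch.float16 if use_cuda else torch.float32,
    ).to("cuda" if use_cuda else "cpu")

    image = diffusers.utils.load_image(image_path_or_url)
    intrinsics = pipe(
        image,
        num_inference_steps=num_inference_steps,  # None keeps the pipeline default
        ensemble_size=ensemble_size,
    )

    # One dict per input image, with an entry per predicted modality.
    vis = pipe.image_processor.visualize_intrinsics(
        intrinsics.prediction, pipe.target_properties
    )
    return vis[0]
```

Calling `decompose_appearance("photo.jpg")` (with a placeholder path) should return images under the keys `"albedo"`, `"roughness"`, and `"metallicity"`, each of which can be saved with `.save(...)`.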

## Model Details
- **Developed by:** [Bingxin Ke](http://www.kebingxin.com/), [Kevin Qu](https://ch.linkedin.com/in/kevin-qu-b3417621b), [Tianfu Wang](https://tianfwang.github.io/), [Nando Metzger](https://nandometzger.github.io/), [Shengyu Huang](https://shengyuh.github.io/), [Bo Li](https://www.linkedin.com/in/bobboli0202), [Anton Obukhov](https://www.obukhov.ai/), [Konrad Schindler](https://scholar.google.com/citations?user=FZuNgqIAAAAJ).
- **Model type:** Generative latent diffusion-based intrinsic image decomposition (appearance: albedo, roughness, and metallicity) from a single image.
- **Language:** English.
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL).
- **Model Description:** This model can be used to generate an estimated intrinsic image decomposition of an input image. 
  - **Resolution**: Even though any resolution can be processed, the model inherits the base diffusion model's effective resolution of roughly **768** pixels. 
    This means that for optimal predictions, any larger input image should be resized to make the longer side 768 pixels before feeding it into the model.
  - **Steps and scheduler**: This model was designed for use with the **DDIM** scheduler and between **1 and 50** denoising steps.
  - **Outputs**:
    - **Albedo**: The predicted values are between 0 and 1, sRGB space.
    - **Roughness and metallicity**: The predicted values are between 0 and 1, linear space.
    - **Uncertainty maps**: Produced for each modality only when multiple predictions are ensembled with an ensemble size larger than 2.
- **Resources for more information:** [Project Website](https://marigoldcomputervision.github.io/), [Paper](https://arxiv.org/abs/2505.09358), [Code](https://github.com/prs-eth/marigold).
- **Cite as:**

```bibtex
@misc{ke2025marigold,
  title={Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis},
  author={Bingxin Ke and Kevin Qu and Tianfu Wang and Nando Metzger and Shengyu Huang and Bo Li and Anton Obukhov and Konrad Schindler},
  year={2025},
  eprint={2505.09358},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@InProceedings{ke2023repurposing,
  title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},
  author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}
```
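
The resolution guidance under Model Details (resize larger inputs so the longer side is 768 pixels) comes down to a small piece of arithmetic. The following is a minimal sketch; `fit_longer_side` is a hypothetical helper name, and it leaves smaller images untouched because the card only calls for resizing larger inputs.

```python
def fit_longer_side(width: int, height: int, target: int = 768) -> tuple[int, int]:
    """Return (width, height) scaled so the longer side equals `target`.

    Images whose longer side is already at most `target` are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longer = max(width, height)
    if longer <= target:
        return width, height
    # Scale both sides by target / longer, preserving the aspect ratio.
    new_width = max(1, round(width * target / longer))
    new_height = max(1, round(height * target / longer))
    return new_width, new_height


# Example: a 1536x1024 photo is halved; a 640x480 image is kept as is.
large = fit_longer_side(1536, 1024)
small = fit_longer_side(640, 480)
```

The resulting size can be passed to any image library's resize call before the image is fed to the model.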