Label-Efficient Semantic Segmentation with Diffusion Models

Dmitry Baranchuk

Ivan Rubachev

Andrey Voynov

Valentin Khrulkov

Artem Babenko

Yandex Research

ICLR 2022

Paper [arXiv]

Code [GitHub]

(1) Adding noise to a real image according to the forward diffusion process q.
(2) Extracting pixel-level image representations from the pretrained DDPM.
(3) Applying an ensemble of MLPs to predict a class label for each pixel representation.
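
Steps (1) and (3) above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation. It assumes the standard DDPM closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps with eps drawn from N(0, I), a linear beta schedule with commonly used default endpoints, and majority voting as the way the ensemble's per-pixel predictions are combined. The function names (linear_alpha_bar, q_sample, majority_vote) are hypothetical. Step (2) needs a pretrained DDPM and is not shown.

```python
import math
import random
from collections import Counter


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products alpha_bar_t for a linear beta schedule.

    The schedule and its endpoints are assumptions, not taken from the page.
    """
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def q_sample(x0, t, alpha_bar, rng):
    """Step (1): draw x_t ~ q(x_t | x_0) for a flat list of pixel values."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


def majority_vote(per_model_labels):
    """Step (3): merge per-pixel labels from several classifiers by voting.

    per_model_labels[m][p] is the label model m assigns to pixel p.
    Voting is an assumed combination rule for this sketch.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*per_model_labels)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bar = linear_alpha_bar(1000)
    noisy = q_sample([0.5, -0.5, 0.0], t=50, alpha_bar=alpha_bar, rng=rng)
    print(noisy)
    print(majority_vote([[0, 1, 2], [0, 1, 1], [1, 1, 2]]))
```

In a full pipeline, the noisy image from q_sample would be passed through the pretrained DDPM to obtain per-pixel features, and each classifier in the ensemble would label those features before the votes are merged.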

The paper investigates the representations learned by state-of-the-art DDPMs and shows that they capture high-level semantic information that is valuable for downstream vision tasks. We design a simple semantic segmentation approach that exploits these representations and outperforms the alternatives in the few-shot regime.

Abstract

Denoising diffusion probabilistic models have recently received much research attention since they outperform alternative approaches, such as GANs, and currently provide state-of-the-art generative performance. The superior performance of diffusion models has made them an appealing tool in several applications, including inpainting, super-resolution, and semantic editing. In this paper, we demonstrate that diffusion models can also serve as an instrument for semantic segmentation, especially in the setup when labeled data is scarce. In particular, for several pretrained diffusion models, we investigate the intermediate activations from the networks that perform the Markov step of the reverse diffusion process. We show that these activations effectively capture the semantic information from an input image and appear to be excellent pixel-level representations for the segmentation problem. Based on these observations, we describe a simple segmentation method, which can work even if only a few training images are provided. Our approach significantly outperforms the existing alternatives on several datasets for the same amount of human supervision.

Results

Examples of predicted segmentation masks along with the ground truth.

Paper

D. Baranchuk, I. Rubachev, A. Voynov, V. Khrulkov, A. Babenko.
Label-Efficient Semantic Segmentation with Diffusion Models
Correspondence to Dmitry Baranchuk.

[OpenReview] [Bibtex]
