ICCV Poster Controllable Latent Space Augmentation for Digital Pathology
Whole slide image (WSI) analysis in digital pathology presents unique challenges due to the gigapixel resolution of WSIs and the scarcity of dense supervision signals. While Multiple Instance Learning (MIL) is a natural fit for slide-level tasks, training robust models requires large and diverse datasets. Even though image augmentation techniques could be utilized to increase data variability and reduce overfitting, implementing them effectively is not a trivial task. Traditional patch-level augmentation is prohibitively expensive due to the large number of patches extracted from each WSI, and existing feature-level augmentation methods lack control over transformation semantics. We introduce HistAug, a fast and efficient generative model for controllable augmentations in the latent space for digital pathology. By conditioning on explicit patch-level transformations (e.g., hue, erosion), HistAug generates realistic augmented embeddings while preserving initial semantic information. Our method allows the processing of a large number of patches in a single forward pass efficiently, while at the same time consistently improving MIL model performance. Experiments across multiple slide-level tasks and diverse organs show that HistAug outperforms existing methods, particularly in low-data regimes. Ablation studies confirm the benefits of learned transformations over noise-based perturbations and highlight the importance of uniform WSI-wise augmentation.
 
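This paper tackles an interesting question: how can the standard augmentations used in image training be applied to pathology images?
Most prior work does not apply pixel-level augmentation when training these models. The key reason is that WSIs are typically gigapixel-scale, so augmenting the images is extremely expensive, and the conventional image augmentation step is simply dropped. HistAug's idea is simple: can we train a generative model to mimic the image augmentation operation? Then I no longer need the original image and can augment directly at the feature level. HistAug is trained much like most conditional generative models: the generator is conditioned on the explicit patch-level transformation (e.g., hue, erosion) and produces the corresponding augmented embedding.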
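To make the idea concrete, here is a minimal toy sketch of training a conditional generator in latent space. It is not the HistAug architecture or its loss. Everything in it is an assumption for illustration: the frozen encoder is a fixed random linear map, the only augmentation is a brightness shift, the generator is a single learnable direction `v`, and the objective is a squared error between the generated embedding and the embedding of the truly augmented patch.

```python
import random

random.seed(0)

PIXELS, DIM = 8, 4  # toy patch size and embedding size (hypothetical values)

# Stand-in for a frozen feature extractor: a fixed random linear map.
W = [[random.gauss(0, 1) for _ in range(PIXELS)] for _ in range(DIM)]

def encode(patch):
    """Frozen encoder: pixels -> embedding. Never updated during training."""
    return [sum(w * p for w, p in zip(row, patch)) for row in W]

def augment(patch, shift):
    """Pixel-level augmentation (assumed here: a simple brightness shift)."""
    return [p + shift for p in patch]

# Conditional generator g(z, t) = z + t * v, with v as the only learnable part.
v = [0.0] * DIM

def generate(z, shift):
    """Predict the embedding of the augmented patch from (embedding, parameter)."""
    return [zi + shift * vi for zi, vi in zip(z, v)]

def train(steps=2000, lr=0.05):
    for _ in range(steps):
        patch = [random.random() for _ in range(PIXELS)]
        shift = random.uniform(-1, 1)           # sampled transformation parameter
        target = encode(augment(patch, shift))  # needs pixels, training time only
        pred = generate(encode(patch), shift)
        for i in range(DIM):
            # Gradient of 0.5 * (pred_i - target_i)^2 with respect to v_i.
            v[i] -= lr * (pred[i] - target[i]) * shift

train()

# After training, augmentation happens on the embedding alone.
patch = [random.random() for _ in range(PIXELS)]
z_aug = generate(encode(patch), 0.3)
z_ref = encode(augment(patch, 0.3))  # reference computed from pixels
error = max(abs(a - b) for a, b in zip(z_aug, z_ref))
```

In this toy setting the encoder is linear, so the generator can match the reference exactly. With a real, nonlinear encoder the generator would need far more capacity, and the match would only be approximate. The point of the sketch is the data flow: pixels are needed only to build training targets, and at inference `generate` works purely on stored embeddings.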
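This paper reminds me of one I covered earlier, Distribution-aware Knowledge Aligning and Prototyping for Non-exemplar Lifelong Person Re-Identification. That paper trains a network to map augmented images back to their unaugmented versions, which is exactly the reverse direction.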