Foundation models for medical image segmentation, with MedSAM being the most popular, have achieved promising performance across organs and lesions. However, MedSAM still performs poorly on lesions with intricate structure and appearance, and is sensitive to perturbations of its bounding-box prompts. Although existing test-time adaptation (TTA) methods for medical image segmentation may tackle this issue, partial (e.g., batch normalization) or full parametric updates restrict their effectiveness due to limited update signals or catastrophic forgetting in large models. Moreover, these approaches ignore the computational cost of adaptation, which is particularly significant for modern foundation models. To this end, our theoretical analysis reveals that, under the MedSAM architecture, directly refining the image embedding can achieve the same goal as parametric updates, enabling high computational efficiency and strong segmentation performance without the risk of catastrophic forgetting. Within this framework, we encourage the maximization of the factorized conditional probabilities of the posterior prediction through a proposed distribution-approximated latent conditional random field (DAL-CRF) loss combined with an entropy minimization loss. Experiments show that our method improves the Dice score by about 3% across three datasets while reducing computational complexity by more than 7 times.
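As a rough illustration of the embedding-refinement idea (not the paper's actual method or losses), the sketch below minimizes prediction entropy by gradient descent on a latent vector while keeping a toy frozen decoder fixed, mirroring the contrast between updating embeddings and updating parameters. All names here (`w`, `b`, `z`, `predict`) are hypothetical stand-ins, and the closed-form gradient applies only to this one-logit toy model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def entropy(p, eps=1e-8):
    # Binary entropy of a foreground probability p
    return -(p * np.log(p + eps) + (1 - p) * np.log(1 - p + eps))

rng = np.random.default_rng(0)
w = rng.normal(size=8)              # frozen "decoder" weights (never updated)
b = 0.1                             # frozen bias
z = rng.normal(scale=0.1, size=8)   # latent image embedding to refine

def predict(z):
    return sigmoid(w @ z + b)

lr = 0.5
h0 = entropy(predict(z))            # entropy before refinement
for _ in range(50):
    p = predict(z)
    # Chain rule: dH/dz = dH/dp * dp/dz, with dp/dz = p(1-p) * w
    grad = np.log((1 - p) / p) * p * (1 - p) * w
    z -= lr * grad                  # update the embedding only; w, b stay frozen
h1 = entropy(predict(z))            # entropy after refinement is lower
```

The key design point mirrored here is that adaptation touches only the latent input, so the frozen decoder cannot drift from its pretrained weights, which is what removes the catastrophic-forgetting risk.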
Visualized process of latent refinement across iterations. The first, second, and third rows show the segmentation result, the channel-averaged visualization of the refined latent embedding, and the corresponding DAL-CRF loss at each spatial position, respectively.
@inproceedings{kecheng2025ICCV,
  title={Test-time Adaptation for Foundation Medical Segmentation Model without Parametric Updates},
  author={Kecheng Chen and Xinyu Luo and Tiexin Qin and Jie Liu and Hui Liu and Victor Ho Fun Lee and Hong Yan and Haoliang Li},
  booktitle={International Conference on Computer Vision},
  year={2025},
}