Diffusion Language Models (DLMs) show strong capabilities in any-order generation, but current decoding strategies rely heavily on single-step confidence or entropy. This often results in locally optimal yet globally inconsistent sampling trajectories. We introduce Coherent Contextual Decoding (CCD), a framework that leverages predictive consistency across diffusion steps by approximating a target distribution through context marginalization. We further propose CCD-DS, an adaptive decoding strategy that dynamically adjusts the unmasking budget based on token-level context sensitivity. Across Dream and LLaDA models, CCD achieves up to a 3.48× speedup and a 3.91% accuracy improvement, enhancing efficiency and generation quality simultaneously.
Current DLM decoding methods select tokens based on single-step confidence; this approach produces locally optimal yet globally inconsistent sampling trajectories.
CCD uses historical context predictions to build a more stable and globally coherent decoding trajectory.
CCD forms a more stable predictive distribution by averaging predictions across multiple diffusion steps.
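Below is a minimal sketch of this step-averaging idea, assuming a PyTorch-style interface; the function names (`update_averaged_dist`, `select_unmask`), the incremental running mean, and the confidence-based selection are illustrative assumptions rather than the paper's exact estimator.

```python
import torch

def update_averaged_dist(history, step_logits, step):
    """Running mean of per-token predictive distributions across diffusion steps.

    history:     [seq_len, vocab] mean of the previous `step` distributions, or None.
    step_logits: [seq_len, vocab] logits from the current denoising step.
    Returns the updated mean, used as a more stable (context-marginalized)
    predictive distribution for token selection.
    """
    probs = torch.softmax(step_logits, dim=-1)
    if history is None:
        return probs
    # Incremental mean: avoids storing every earlier step's distribution.
    return history + (probs - history) / (step + 1)

def select_unmask(avg_probs, masked_positions, k):
    """Unmask the k masked positions whose averaged distribution is most confident."""
    conf, token_ids = avg_probs[masked_positions].max(dim=-1)
    top = conf.topk(min(k, conf.numel())).indices
    return masked_positions[top], token_ids[top]
```

The running mean keeps memory constant over the trajectory while still averaging over the contexts observed at earlier steps.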
We show that CCD optimizes an entropy-based objective involving the conditional mutual information I(x; c | s), grounding the framework theoretically.
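For reference, the conditional mutual information term can be written as an entropy difference via the standard identity below; the precise objective CCD optimizes is stated in the paper, so treat this only as the textbook decomposition.

```latex
% Information the context c carries about token x, given the decoded state s,
% expressed as the entropy reduction that conditioning on c provides:
I(x; c \mid s) = H(x \mid s) - H(x \mid c, s)
```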
The unmasking budget increases for easy contexts and decreases for difficult ones.
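A hedged sketch of how such an adaptive budget could be computed, assuming context sensitivity is measured as the divergence between the current step's predictions and the step-averaged ones from the sketch above; the KL-based proxy, the exponential scaling, and the budget bounds are illustrative assumptions, not the schedule used by CCD-DS.

```python
import torch

def adaptive_unmask_budget(avg_probs, step_probs, masked_positions,
                           base_budget=1, max_budget=8, eps=1e-9):
    """Scale the per-step unmasking budget by the context sensitivity of the
    remaining masked tokens.

    Low divergence between the current step's predictions and the historical
    average signals an "easy" context, so more tokens are unmasked this step;
    high divergence signals a "difficult" context, so the budget shrinks.
    """
    p = step_probs[masked_positions]   # [m, vocab] current-step distribution
    q = avg_probs[masked_positions]    # [m, vocab] step-averaged distribution
    # Mean KL(step || averaged) over masked positions as a sensitivity proxy.
    sensitivity = (p * ((p + eps).log() - (q + eps).log())).sum(dim=-1).mean()
    scale = torch.exp(-sensitivity).item()   # in (0, 1]; ~1 when predictions agree
    budget = round(base_budget + (max_budget - base_budget) * scale)
    return max(1, min(budget, masked_positions.numel()))
```

In a decoding loop, the returned budget would play the role of `k` in the selection step sketched earlier.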
@article{chen2025ccd,
title={Beyond Confidence: Adaptive and Coherent Decoding for Diffusion Language Models},
author={Chen, Kecheng and Liu, Ziru and Tao, Xijia and Liu, Hui and Fu, Xinyu and Zhang, Suiyun and Tu, Dandan and Kong, Lingpeng and Liu, Rui and Li, Haoliang},
journal={arXiv preprint arXiv:2025.xxxxx},
year={2025}
}