What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models

Published in ICCV, 2025

We present DICE, a framework that detects and evaluates instruction-guided image edits by identifying the differences between the original and edited images and assessing whether those differences are coherent with the editing prompt.

Recommended citation: L. Baraldi, D. Bucciarelli, F. Betti, M. Cornia, L. Baraldi, et al. (2025). "What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models." arXiv preprint 2505.20405.
Download Paper

Lorenzo Baraldi