Diffusion-based models have recently transformed conditional image generation, achieving unprecedented fidelity in generating photorealistic and semantically accurate images. However, consistently generating high-quality images remains challenging, partly due to the lack of mechanisms for conditioning outputs on perceptual quality. In this work, we propose methods to integrate image quality assessment (IQA) models into diffusion-based generators, enabling quality-aware image generation. First, we experiment with gradient-based guidance to optimize image quality directly and show this approach has limited generalizability. To address this, we introduce IQA-Adapter, a novel architecture that conditions generation on target quality levels by learning the relationship between images and quality scores. When conditioned on high target quality, IQA-Adapter shifts the distribution of generated images towards a higher-quality subdomain. This approach achieves up to a 10% improvement across multiple objective metrics, as confirmed by a subjective study, while preserving generative diversity and content. Additionally, IQA-Adapter can be used inversely as a degradation model, generating progressively more distorted images when conditioned on lower quality scores. Our quality-aware methods also provide insights into the adversarial robustness of IQA models, underscoring the potential of quality conditioning in generative modeling and the importance of robust IQA methods.
IQA-Adapter is a tool that combines Image Quality/Aesthetics Assessment (IQA/IAA) models with image generation, enabling quality-aware generation with diffusion-based models. It conditions image generators on target quality/aesthetics scores and is based on the IP-Adapter architecture.
Following the IP-Adapter technique, IQA-Adapter conditions the generative model on image quality by injecting visual quality scores without altering the model's core weights. It projects these scores into tokens that are processed through decoupled cross-attention layers, so generation can be adjusted toward a specified quality level. The adapter can combine multiple image fidelity aspects and exposes a scaling parameter that controls the strength of quality conditioning at inference. A minimal sketch of this mechanism appears below.
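Here is a minimal PyTorch sketch of that mechanism, assuming an IP-Adapter-style design: a small projector maps a vector of IQA/IAA scores to a few adapter tokens, and a decoupled cross-attention block runs a separate attention pass over those tokens alongside the usual text cross-attention, blended by a runtime scale. All module and parameter names (`QualityProjector`, `DecoupledCrossAttention`, `num_tokens`, `scale`) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class QualityProjector(nn.Module):
    """Maps a vector of IQA/IAA scores to a small set of adapter tokens.
    (Hypothetical module; names and sizes are illustrative.)"""
    def __init__(self, num_scores: int, dim: int, num_tokens: int = 4):
        super().__init__()
        self.num_tokens = num_tokens
        self.proj = nn.Sequential(
            nn.Linear(num_scores, dim * num_tokens),
            nn.GELU(),
            nn.Linear(dim * num_tokens, dim * num_tokens),
        )

    def forward(self, scores: torch.Tensor) -> torch.Tensor:
        # scores: (batch, num_scores) -> tokens: (batch, num_tokens, dim)
        b = scores.shape[0]
        return self.proj(scores).view(b, self.num_tokens, -1)

class DecoupledCrossAttention(nn.Module):
    """Text cross-attention plus a second, decoupled attention pass over
    quality tokens, blended via a runtime `scale` (IP-Adapter style)."""
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn_text = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attn_iqa = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, hidden, text_tokens, iqa_tokens, scale: float = 1.0):
        out_text, _ = self.attn_text(hidden, text_tokens, text_tokens)
        out_iqa, _ = self.attn_iqa(hidden, iqa_tokens, iqa_tokens)
        # `scale` controls the impact of quality conditioning at inference.
        return out_text + scale * out_iqa
```

In IP-Adapter-style designs, typically only the new key/value projections of the adapter branch are trained while the base diffusion model stays frozen; the sketch uses two full attention modules for brevity.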
The results in Figure (a) show that IQA-Adapters trained on various IQA/IAA models consistently improve image quality over the base model, with average relative gains of 4-6%. Conditioning on high target quality (the 99th percentile) yields improvements across multiple metrics, demonstrating cross-metric transferability. The subjective study in Figure (b) confirms these results: participants preferred images generated with quality-conditioned IQA-Adapters over those from the base model, especially at higher quality levels. Pairwise win rates in Figure (c) show that images conditioned on the highest quality levels outperform those conditioned on medium or low levels. IQA-Adapters thus improve both objective and perceived image quality.
The figure illustrates the impact of varying target quality on generated images, using the IQA-Adapter conditioned on different percentiles (1st to 99th) of the target quality metric. The distributions in (a) show progressively higher IQA scores as the target quality increases, while (b) provides example images, with sharper and more detailed visuals at higher quality levels. These results show that the adapter aligns generated image quality with the specified input condition.
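As a usage illustration, the sketch below sweeps the target-quality condition across percentiles with a fixed prompt and seed. The `IQAAdapterPipeline` class, the checkpoint names, and the `quality_percentile`/`adapter_scale` arguments are hypothetical stand-ins for whatever interface the released code actually exposes.

```python
import torch
# Hypothetical wrapper around a Stable Diffusion pipeline with IQA-Adapter
# attached; the class name, checkpoint paths, and arguments are assumptions.
from iqa_adapter import IQAAdapterPipeline

pipe = IQAAdapterPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    iqa_adapter_path="iqa_adapter_topiq.bin",  # assumed adapter checkpoint
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a photo of a lighthouse at sunset"

# Sweep the target quality from the 1st to the 99th percentile of the
# training-set score distribution; higher targets should yield sharper,
# more detailed images, as in the figure above.
for percentile in (1, 25, 50, 75, 99):
    image = pipe(
        prompt,
        quality_percentile=percentile,  # assumed conditioning argument
        adapter_scale=0.8,              # assumed conditioning strength
        generator=torch.Generator("cuda").manual_seed(42),  # fixed seed
    ).images[0]
    image.save(f"lighthouse_q{percentile:02d}.png")
```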
Our quality-aware methods also push IQA models to their limits, revealing hidden vulnerabilities and biases. Under high guidance scales, gradient-based guidance drives the generator toward adversarial patterns unique to each IQA model, while IQA-Adapters expose subtler preferences, such as TOPIQ's bias toward sharpness or LAION-AES's affinity for vibrant colors. Pushed further with negative-quality guidance, generated images inflate scores by exploiting these biases, producing over-stylized but hollow "improvements." IQA-Adapters can thus serve as tools to probe and challenge the robustness of IQA models.
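For intuition, here is a schematic sketch of the gradient-based guidance idea, assuming a diffusers-style UNet and scheduler: at each denoising step the predicted clean sample is scored by a differentiable IQA model, and the noise prediction is corrected along the score gradient, classifier-guidance style. At high `guidance_scale` values this tends to produce the model-specific adversarial patterns described above. The function and its signature are assumptions, not the paper's exact procedure.

```python
import torch

def iqa_guided_step(latents, t, unet, scheduler, iqa_model, text_emb,
                    guidance_scale=100.0):
    """One denoising step with gradient-based IQA guidance (schematic).

    `iqa_model` is any differentiable quality predictor mapping a sample
    batch to scalar scores. For latent diffusion, `x0_pred` would first be
    decoded through the VAE before scoring; that step is omitted here.
    """
    latents = latents.detach().requires_grad_(True)
    noise_pred = unet(latents, t, encoder_hidden_states=text_emb).sample

    # Estimate the clean sample x0 from the current noisy latent.
    alpha_bar = scheduler.alphas_cumprod[t]
    x0_pred = (latents - (1 - alpha_bar).sqrt() * noise_pred) / alpha_bar.sqrt()

    # Differentiable quality score of the predicted clean sample.
    score = iqa_model(x0_pred).mean()

    # Ascend the score: classifier-guidance-style epsilon correction.
    grad = torch.autograd.grad(score, latents)[0]
    with torch.no_grad():
        guided_eps = noise_pred - guidance_scale * (1 - alpha_bar).sqrt() * grad
        latents = scheduler.step(guided_eps, t, latents).prev_sample
    return latents
```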
@misc{iqaadapter,
      title={IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models},
      author={Khaled Abud and Sergey Lavrushkin and Alexey Kirillov and Dmitriy Vatolin},
      year={2024},
      eprint={2412.01794},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.01794},
}