We present an empirical study of properties of the remarkable
Normalized Compression Distance (NCD), a mathematical formulation by
M. Li et al. that quantifies similarity between two binary strings as
the additional amount of algorithmic information required to transform
the description of one into the other. In particular, we are interested
in the NCD values between an image and versions of it modified by common
image processing techniques. The experimental data indicate that
the NCD is symmetric and transitive, and that it can be used as a
reasonable perceptual measure of similarity, but only in the spatial
domain. Further, the NCD clusters the common image processing
techniques into three groups in a manner consistent with the human
perception of similarity.
We also introduce two independent modifications to the NCD and study
their properties. The first applies a median filter followed by
thresholding to the input images before the NCD is computed, which
yields a more uniform distribution of NCD values. The second alters
the NCD formula to reflect the running time, as well as the size, of
the shortest program that transforms one input string into the other.
The resulting data show that, using this modified NCD, it is possible
to classify subtle changes
(e.g., watermarking) to a font character image as similar to the
original and drastic changes (e.g., rotation by 90 degrees) as
different.
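The first modification's preprocessing step can be sketched in pure Python; the 3x3 median window and the threshold value of 128 below are illustrative assumptions, not parameters taken from the experiments:

```python
from statistics import median

def median_threshold(img, thresh=128):
    """Apply a 3x3 median filter, then binarize each pixel.

    `img` is a list of rows of gray-scale values in 0..255; the window
    is clamped at image borders. Window size and threshold are assumed.
    """
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            # Gather the 3x3 neighborhood, clamped at the borders.
            vals = [img[a][b]
                    for a in range(max(i - 1, 0), min(i + 2, h))
                    for b in range(max(j - 1, 0), min(j + 2, w))]
            row.append(255 if median(vals) >= thresh else 0)
        out.append(row)
    return out
```

The median filter suppresses isolated noise pixels, and thresholding reduces the image to a binary string, which tends to spread the resulting NCD values more uniformly.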
We use an information-theoretic distortion measure called the Normalized Compression Distance (NCD), first
proposed by M. Li et al., to determine whether two rectangular gray-scale images are visually distinguishable to
a human observer. Image distinguishability is a fundamental constraint on operations carried out by all players
in an image watermarking system.
The NCD between two binary strings is defined in terms of compressed sizes of the two strings and of their
concatenation; it is designed to be an effective approximation of the noncomputable but universal Kolmogorov
distance between two strings. We compare the effectiveness of different types of compression algorithms in
predicting image distinguishability when they are used to compute the NCD between a sample of images and
their watermarked counterparts. Our experiment shows that, as predicted by Li's theory, the NCD is largely
independent of the underlying compression algorithm.
However, in some cases the NCD fails as a predictor of image distinguishability, since it is designed to measure
the more general notion of similarity. We propose and study a modified version of the NCD to model
distinguishability, which requires not only that the change be small but also that it be, in some sense, random
with respect to the original image.