The robust design and adaptation of multimedia networks rely on the study of the influence of potential network impairments on the perceived quality. Video quality may be affected by network impairments, such as delay, jitter, packet loss, and bandwidth limitations, and the perceptual impact of these impairments may vary with the video content. The effects of packet loss and encoding artifacts on the perceived quality have been widely addressed in the literature. However, the relationship between video content and network impairments in determining the perceived video quality has not been deeply investigated. A detailed analysis of the ReTRiEVED test video dataset, designed by considering a set of potential network impairments, is presented, and the effects of transmission impairments on perceived quality are analyzed. Furthermore, the impact of the video content on the perceived quality in the presence of transmission impairments is studied by using video content descriptors. Finally, the performance of well-known quality metrics is tested on the proposed dataset.
Face processing techniques for automatic recognition in broadcast video attract research interest because of their value in applications such as video indexing, retrieval, and summarization. In multimedia press review, the automatic annotation of broadcast news programs is a challenging task because people can appear with large appearance variations, such as different hair styles, illumination conditions, and poses, which make the comparison between similar faces more difficult. In this paper a technique for automatic face identification in TV broadcasting programs, based on a gallery of faces downloaded from the Web, is proposed. The approach relies on the joint use of the Scale Invariant Feature Transform (SIFT) descriptor and Eigenfaces-based algorithms, and it has been tested on video sequences using a database of images acquired through a web search. Experimental results show that the joint use of these two approaches improves the recognition rate for both Standard Definition (SD) and High Definition (HD) content.
The use of 3D video is growing in several fields, such as entertainment, military simulations, and medical applications. However, the process of recording, transmitting, and processing 3D video is prone to errors, thus producing artifacts that may affect the perceived quality. A challenging task today is the definition of a metric able to predict the perceived quality with low computational complexity, so that it can be used in real-time applications. Research in this field is very active owing to the complexity of analyzing the influence of stereoscopic cues. In this paper we present a novel stereoscopic metric based on the combination of relevant features, able to predict the subjective quality rating more accurately.
KEYWORDS: Video, Video compression, Databases, Video processing, Motion measurement, Video coding, Data processing, Computer programming, Image quality
In this article the effects of video content on Quality of Experience (QoE) are presented. Delivering video content with a high level of QoE over bandwidth-limited and error-prone networks is of crucial importance for service providers. Therefore, it is of fundamental importance to analyse the impact of network impairments and video content on perceived quality during QoE metric design. The major contributions of the article are the study of i) the impact of network impairments together with the video content, ii) the impact of the video content alone, and iii) the impact of content-related parameters, namely spatial-temporal perceptual information and frame size, on QoE. The results show that when the impact of impairments on perceived quality is low, the quality is significantly influenced by the video content, and the video content itself has a significant impact on QoE. Finally, the results strengthen the need for new parameter characterization for better QoE metric design.
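As a concrete illustration of the spatial-temporal perceptual information descriptors mentioned above, the sketch below computes spatial (SI) and temporal (TI) information indices in the spirit of ITU-T P.910; whether the article uses exactly this formulation is an assumption, and the function name is illustrative.

```python
# Hedged sketch: P.910-style spatial (SI) and temporal (TI) information
# indices computed on a sequence of grayscale frames (2-D numpy arrays).
import numpy as np
from scipy import ndimage

def si_ti(frames):
    """frames: iterable of 2-D luminance arrays; assumes at least two frames."""
    si_values, ti_values = [], []
    prev = None
    for f in frames:
        f = f.astype(np.float64)
        gx = ndimage.sobel(f, axis=1)
        gy = ndimage.sobel(f, axis=0)
        si_values.append(np.sqrt(gx ** 2 + gy ** 2).std())  # spatial detail per frame
        if prev is not None:
            ti_values.append((f - prev).std())               # temporal change per frame pair
        prev = f
    return max(si_values), (max(ti_values) if ti_values else 0.0)
```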
The automatic labeling of faces in TV broadcasts is still a challenging problem. The high variability in view points, facial expressions, general appearance, and lighting conditions, as well as occlusions, rapid shot changes, and camera motion, produces significant variations in image appearance. The application of automatic tools for face recognition is not yet fully established, and human intervention is still needed. In this paper, we deal with automatic face recognition in TV broadcasting programs. The target of the proposed method is to identify the presence of a specific person in a video by means of a set of images downloaded from the Web using a specific search key.
A new technique for texture segmentation is presented. The method is based on the use of Laguerre-Gauss (LG) functions, which allow an efficient representation of textures. In particular, the marginal densities of the LG expansion coefficients are approximated by generalized Gaussian densities, which are completely described by two parameters. The classification and segmentation steps are performed with a modified k-means algorithm that exploits the Kullback-Leibler divergence as a similarity metric. This clustering method provides a more effective means of texture comparison, thus resulting in a more accurate segmentation. The effectiveness of the proposed method is evaluated using mosaic images created from the Brodatz dataset, as well as real images.
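For reference, when the marginal densities are modeled as generalized Gaussians \(p(x;\alpha,\beta) = \frac{\beta}{2\alpha\Gamma(1/\beta)} e^{-(\lvert x\rvert/\alpha)^{\beta}}\), the Kullback-Leibler divergence admits a known closed form, which is what makes divergence-based clustering inexpensive; the symmetric form shown last is a common choice and is stated here as an assumption rather than as the paper's exact formulation.

\[ D(p_1 \,\|\, p_2) = \log\!\left(\frac{\beta_1\,\alpha_2\,\Gamma(1/\beta_2)}{\beta_2\,\alpha_1\,\Gamma(1/\beta_1)}\right) + \left(\frac{\alpha_1}{\alpha_2}\right)^{\!\beta_2} \frac{\Gamma\!\bigl((\beta_2+1)/\beta_1\bigr)}{\Gamma(1/\beta_1)} - \frac{1}{\beta_1}, \qquad J(p_1,p_2) = D(p_1\|p_2) + D(p_2\|p_1). \]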
In this paper we present a novel image quality assessment technique for evaluating virtual synthesized views in the context of multi-view video. In particular, Free Viewpoint Videos are generated from uncompressed color views and their compressed associated depth maps by means of the View Synthesis Reference Software provided by MPEG. Prior to the synthesis step, the original depth maps are encoded with different coding algorithms, thus leading to the creation of additional artifacts in the synthesized views. The core of the proposed wavelet-based metric lies in the registration procedure performed to align the synthesized view with the original one, and in the skin detection applied considering that the same distortion is more annoying when visible on human subjects than on other parts of the scene. The effectiveness of the metric is evaluated by analyzing the correlation of the scores obtained with the proposed metric with Mean Opinion Scores collected by means of subjective tests. The achieved results are also compared against those of well-known objective quality metrics. The experimental results confirm the effectiveness of the proposed metric.
Person re-identification through a camera network deals with finding the correct link between consecutive observations of the same target across different cameras, in order to choose the most probable correspondence among a set of possible matches. This task is particularly challenging in the presence of low-resolution camera networks. In this work, a method for people re-identification in a low-resolution camera network is presented. The proposed approach can be divided into two parts. First, the illumination changes of a target while crossing the network are analyzed. The color structure is evaluated using a novel color descriptor, the Color Structure Descriptor, which describes the differences of dominant colors between two regions of interest. Afterwards, a new pruning system for the links, the Target Color Structure, is proposed. Results show that the improvements achieved by applying the Target Color Structure control are up to 4% for the top rank and up to 16% when considering the first eleven most similar candidates.
Reversible data hiding deals with the insertion of auxiliary information into host data without causing any permanent degradation to the original signal. In this contribution a high-capacity reversible data hiding scheme, based on the classical difference expansion insertion algorithm, is presented. The method exploits a prediction stage, followed by prediction error modification, both in the spatial domain and in the S-transform domain. This two-step embedding allows us to achieve high embedding capacity while preserving high image quality, as demonstrated by the experimental results.
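A minimal sketch of the classical difference-expansion step referenced above (Tian's pixel-pair scheme) may help clarify how one payload bit is carried reversibly; the paper's prediction stage and S-transform embedding are not reproduced, and overflow/underflow handling is omitted.

```python
# Hedged sketch of classical difference expansion on a pixel pair (x, y).
def de_embed(x, y, bit):
    l = (x + y) // 2          # integer average of the pair
    h = x - y                 # difference
    h2 = 2 * h + bit          # expanded difference carries one payload bit
    return l + (h2 + 1) // 2, l - h2 // 2

def de_extract(x2, y2):
    h2 = x2 - y2
    bit, h = h2 & 1, h2 >> 1  # recover payload bit and original difference
    l = (x2 + y2) // 2
    return l + (h + 1) // 2, l - h // 2, bit

# Example: embedding bit 1 into the pair (10, 7) gives (12, 5); extraction
# returns the original pair and the bit.
assert de_extract(*de_embed(10, 7, 1)) == (10, 7, 1)
```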
In this paper a methodology for digital image forgery detection by means of an unconventional use of image quality assessment is addressed. In particular, the presence of differences in the quality degradations impairing the images is used to reveal the mixture of patches from different sources. The rationale behind this work is the hypothesis that any image may be affected by artifacts, visible or not, caused by the processing steps: acquisition (i.e., lens distortion, acquisition sensor imperfections, analog-to-digital conversion, single-sensor to color-pattern interpolation), processing (i.e., quantization, storage, JPEG compression, sharpening, deblurring, enhancement), and rendering (i.e., image decoding, color/size adjustment). These defects are generally spatially localized and their strength strictly depends on the content. For these reasons they can be considered as a fingerprint of each digital image. The proposed approach relies on a combination of image quality assessment systems. The adopted no-reference metric does not require any information about the original image, thus allowing an efficient, stand-alone, blind system for image forgery detection. The experimental results show the effectiveness of the proposed scheme.
The use of ear information for people identification has been under testing for at least 100 years. However, it is still an open issue whether ears can be considered unique, or unique enough, to be used as a biometric feature. In this paper a biometric system for human identification based on ear recognition is presented. The ear is modeled as a set of contours extracted from the ear image with an edge potential function. The matching algorithm has been tested in the presence of several image modifications. Two human ear databases have been used for the tests. The experimental results show the effectiveness of the proposed scheme.
In this work a novel technique for detecting and segmenting textured areas in natural images is presented. The method is based on circular harmonic functions and, in particular, on the Laguerre-Gauss functions. The detection of the textured areas is performed by analyzing the mean, the mode, and the skewness of the marginal densities of the Laguerre-Gauss coefficients. Using these parameters, a classification of the patches and of the pixels is performed. The feature vectors representing the textures are built using the parameters of the generalized Gaussian densities that approximate the marginal densities of the Laguerre-Gauss coefficients computed at three different resolutions. The feature vectors are clustered using the K-means algorithm, in which the symmetric Kullback-Leibler distance is adopted. The experimental results, obtained using a set of natural images, show the effectiveness of the proposed technique.
Sport practice can take advantage of the quantitative assessment of task execution, which is strictly connected to the implementation of optimized training procedures. To this aim, it is interesting to explore the effectiveness of biofeedback training techniques. This implies a complete information-extraction chain comprising instrumented devices, processing algorithms, and graphical user interfaces (GUIs) to extract valuable information (i.e. kinematics, dynamics, and electrophysiology) to be presented in real time to the athlete. In cycling, performance indexes displayed in a simple and perceivable way can help the cyclist optimize the pedaling. To this purpose, in this study four different GUIs have been designed and used in order to understand if and how a graphical biofeedback can influence cycling performance. In particular, information related to the mechanical efficiency of pedaling is represented in each of the designed interfaces and displayed to the user. This index is calculated in real time on the basis of the force signals exerted on the pedals during cycling. Instrumented bike pedals, already designed and implemented in our laboratory, have been used to measure those force components. A group of subjects underwent an experimental protocol and pedaled with (the interfaces being used in randomized order) and without graphical biofeedback. Preliminary results show how the effective perception of the biofeedback influences the motor performance.
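As an example of the kind of pedaling efficiency index that can be computed in real time from the pedal force signals, one common choice in the cycling literature is the index of force effectiveness over a crank revolution of period T; whether the designed GUIs display exactly this quantity is an assumption.

\[ IE = \frac{\int_0^T F_{\mathrm{eff}}(t)\,dt}{\int_0^T \lVert \mathbf{F}(t) \rVert\,dt}, \]

where \(F_{\mathrm{eff}}\) is the force component perpendicular to the crank arm (the only component generating propulsive torque) and \(\lVert \mathbf{F} \rVert\) is the magnitude of the total force applied to the pedal.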
In this paper a Multi-view Distributed Video Coding scheme for mobile applications is presented. Specifically, a new fusion technique between temporal and spatial side information in the Zernike moments domain is proposed. Distributed video coding introduces a flexible architecture that enables the design of video encoders with very low complexity compared to their traditional counterparts. The main goal of our work is to generate at the decoder the side information that optimally blends temporal and inter-view data. Multi-view distributed coding performance strongly depends on the quality of the side information built at the decoder. To this aim, for improving its quality, a spatial view compensation/prediction in the Zernike moments domain is applied. Spatial and temporal motion activity are fused together to obtain the overall side information. The proposed method has been evaluated through rate-distortion performance for different inter-view and temporal estimation quality conditions.
KEYWORDS: Video, Video processing, Cameras, Video compression, 3D video compression, Algorithm development, Detection and tracking algorithms, Image quality, Image compression, Gaussian filters
'View plus depth' is an attractive, compact representation format for 3D video compression and transmission. It combines 2D video with a depth map sequence aligned in a per-pixel manner to represent the moving 3D scene of interest. Any different-perspective view can be synthesized out of this representation through Depth-Image Based Rendering (DIBR). However, such rendering is prone to disocclusion errors: regions originally covered by foreground objects become visible in the synthesized view and have to be filled with perceptually meaningful data.
In this work, a technique for reducing the perceived artifacts by inpainting the disoccluded areas is proposed. Based on Criminisi's exemplar-based inpainting algorithm, the developed technique recovers the disoccluded areas by using pixels of similar blocks surrounding them. In the original work, a moving window is centered on the boundaries between known and unknown parts (the 'target window'). The known pixels are used to select the windows that are most similar to the target one. When this process is completed, the unknown region of the target patch is filled with a weighted combination of pixels from the selected windows.
In the proposed scheme, the priority map, which defines the rule for selecting the order of pixels to be filled, has been modified to meet the requirements of disocclusion hole filling, and a better non-local mean estimate has been suggested accordingly. Furthermore, the search for similar patches has also been extended to previous and following frames of the video under processing, thus improving both computational efficiency and resulting quality.
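For context, in Criminisi's original exemplar-based algorithm the filling order is driven by a priority assigned to each patch \(\Psi_p\) centered on a point \(p\) of the fill front; the modified priority map mentioned above replaces this rule for disocclusion holes, so the expressions below describe only the baseline scheme.

\[ P(p) = C(p)\,D(p), \qquad C(p) = \frac{\sum_{q \in \Psi_p \cap (\mathcal{I}\setminus\Omega)} C(q)}{\lvert \Psi_p \rvert}, \qquad D(p) = \frac{\lvert \nabla I_p^{\perp} \cdot \mathbf{n}_p \rvert}{\alpha}, \]

where \(\Omega\) is the missing region, \(C\) the confidence term, \(D\) the data term, \(\mathbf{n}_p\) the unit normal to the fill front at \(p\), and \(\alpha\) a normalization factor (255 for 8-bit images).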
The increasing use of digital image-based applications is resulting in huge databases that are often difficult to use and prone to misuse and privacy concerns. These issues are especially crucial in medical applications. The most commonly adopted solution is the encryption of both the image and the patient data in separate files that are then linked. This practice turns out to be inefficient since, in order to retrieve patient data or analysis details, it is necessary to decrypt both files.
In this contribution, an alternative solution for secure medical image annotation is presented. The proposed framework is based on the joint use of a key-dependent wavelet transform (the Integer Fibonacci-Haar transform), a secure cryptographic scheme, and a reversible watermarking scheme.
The system allows: i) the insertion of the patient data into the encrypted image without requiring knowledge of the original image, ii) the encryption of annotated images without causing loss of the embedded information, and iii) thanks to the complete reversibility of the process, the recovery of the original image after mark removal. Experimental results show the effectiveness of the proposed scheme.
In this paper a novel scheme for extracting global features from an image is presented. Usually, features are extracted from the whole image; in the proposed approach, only the image regions conveying information are considered. The two-step procedure is based on the evaluation of Fisher's information, computed as a linear combination of Zernike expansion coefficients. Then, by using a region growing algorithm, only regions with a high information rate are retained. The considered features are texture, edges, and color. The performance of the proposed scheme has been evaluated in terms of retrieval rate. Experimental results show an increase in the retrieval rate with respect to using the same features computed on the whole image.
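For reference, the Zernike expansion coefficients mentioned above are the projections of the image onto the Zernike basis over the unit disk; the standard definition is recalled below (the specific linear combination used to evaluate Fisher's information is not reproduced here).

\[ Z_{nm}(\rho,\theta) = R_{nm}(\rho)\,e^{jm\theta}, \qquad R_{nm}(\rho) = \sum_{s=0}^{(n-\lvert m\rvert)/2} \frac{(-1)^{s}\,(n-s)!}{s!\,\bigl(\tfrac{n+\lvert m\rvert}{2}-s\bigr)!\,\bigl(\tfrac{n-\lvert m\rvert}{2}-s\bigr)!}\;\rho^{\,n-2s}, \]

\[ A_{nm} = \frac{n+1}{\pi} \iint_{x^2+y^2 \le 1} f(x,y)\, Z_{nm}^{*}(\rho,\theta)\, dx\, dy, \]

with \(\lvert m\rvert \le n\) and \(n-\lvert m\rvert\) even.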
A novel technique for searching for complex patterns in large multimedia databases is presented, based on rotation-independent template matching. To handle objects of arbitrary shape while reducing the computational workload, the pattern to be localized is partitioned into small square blocks of sizes adapted to the local image content using quadtree decomposition. The use of Zernike polynomials for representing each block allows the design of a fast and effective maximum likelihood matching procedure to sequentially verify whether the target image contains each block of the quadtree. State-of-the-art methods usually represent the whole pattern by using an orthogonal basis and extracting an invariant feature vector from the representation coefficients. In the proposed scheme, the use of the quadtree decomposition allows us to bound the number of terms of the truncated expansions while still guaranteeing a precise image representation.
KEYWORDS: Digital watermarking, Image quality, Data hiding, Quantization, Signal to noise ratio, Image processing, Modulation, Visual system, Image compression
This paper presents an innovative watermarking scheme that allows the insertion of information in the Discrete Cosine Transform (DCT) domain while increasing the perceptual quality of the watermarked images by exploiting the masking effect of the DCT coefficients. Indeed, we propose to make the strength of the embedded data adaptive, following the characteristics of the Human Visual System (HVS) with respect to image fruition. Improvements in the perceived quality of the modified data are evaluated by means of various perceptual quality metrics, as demonstrated by the experimental results.
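A minimal sketch of HVS-driven, multiplicative embedding in mid-frequency DCT coefficients is given below; the block-activity mask, the coefficient positions, and the strength rule are illustrative stand-ins for the paper's masking model, not its actual scheme.

```python
# Hedged sketch: multiplicative watermark embedding in an 8x8 DCT block,
# with the embedding strength crudely adapted to local block activity.
import numpy as np
from scipy.fft import dctn, idctn

def embed_block(block, wm_bits, base_alpha=0.05):
    """block: 8x8 float array; wm_bits: sequence of +/-1 watermark values."""
    coeffs = dctn(block, norm="ortho")
    # crude masking surrogate: more AC energy -> stronger embedding
    activity = np.abs(coeffs).sum() - np.abs(coeffs[0, 0])
    alpha = base_alpha * (1.0 + activity / (activity + 1e3))
    positions = [(2, 3), (3, 2), (3, 3), (2, 4), (4, 2)]  # illustrative mid frequencies
    for (u, v), w in zip(positions, wm_bits):
        coeffs[u, v] *= (1.0 + alpha * w)
    return idctn(coeffs, norm="ortho")
```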
In this contribution a novel reversible data hiding scheme for digital images is presented. The proposed
technique allows the exact recovery of the original image upon extraction of the embedded information. Lossless
recovery of the original is achieved by adopting the histogram shifting technique in a novel wavelet domain: the
Integer Fibonacci-Haar Transform, which is based on a parameterized subband decomposition of the image. In
particular, the parametrization depends on a selected Fibonacci sequence. The use of this transform increases
the security of the proposed method. Experimental results show the effectiveness of the proposed scheme.
In this contribution a Multiple Description Coding scheme for video transmission over unreliable channels is presented. The method is based on an integer wavelet transform and on a data hiding scheme for exploiting the spatial redundancy and for reducing the scheme overhead. Experimental results show the effectiveness of the proposed scheme.
In this contribution the robustness against Chi-square attacks of a novel steganographic scheme based on the generalized Fibonacci sequence is investigated. In essence, an image is first represented in a basis defined by a generalized Fibonacci sequence. Then the secret data are inserted by a substitution technique into selected bit planes, preserving the first-order distributions, and finally the inverse Fibonacci decomposition is applied to obtain the stego-image. Secret data are scrambled before the embedding to improve the security of the whole system. In order to perform the Chi-square attacks, knowledge of both parameters determining the binary Fibonacci representation of an image is assumed. Experimental results show that no visual impairments are introduced and the probability of detecting the presence of hidden data is small, even though a modest capacity loss is present.
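A minimal sketch of the binary Fibonacci (Zeckendorf) representation underlying this family of schemes is shown below; the generalized sequence parameters and the first-order-distribution-preserving substitution rules of the paper are not reproduced.

```python
# Hedged sketch: Zeckendorf (Fibonacci) decomposition of an 8-bit pixel value.
FIB = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]  # enough digits for values 0..255

def to_fib(value):
    """Greedy Zeckendorf decomposition: list of 0/1 digits, most significant first."""
    digits = []
    for f in reversed(FIB):
        if f <= value:
            digits.append(1)
            value -= f
        else:
            digits.append(0)
    return digits

def from_fib(digits):
    return sum(f for d, f in zip(digits, reversed(FIB)) if d)

# A stego scheme of this kind substitutes payload bits into a chosen Fibonacci
# bit plane of these digits instead of an ordinary power-of-two bit plane.
assert from_fib(to_fib(255)) == 255
```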
In this contribution, a novel method for distributed video coding for stereo sequences is proposed. The system
encodes independently the left and right frames of the stereoscopic sequence. The decoder exploits the side
information to achieve the best reconstruction of the correlated video streams. In particular, a syndrome coder
approach based on a lifted Tree Structured Haar wavelet scheme has been adopted. The experimental results
show the effectiveness of the proposed scheme.
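For context, the lifting implementation of an integer Haar step, the building block on which tree-structured Haar schemes are usually constructed, can be sketched as follows; the paper's Tree-Structured Haar wavelet and syndrome coder are not reproduced here.

```python
# Hedged sketch: one integer Haar lifting step (S-transform) and its inverse.
import numpy as np

def lifted_haar(x):
    """x: 1-D integer array of even length. Returns (approximation, detail)."""
    even, odd = x[0::2].astype(np.int64), x[1::2].astype(np.int64)
    d = odd - even              # predict step: difference
    s = even + (d >> 1)         # update step: integer average
    return s, d

def inverse_lifted_haar(s, d):
    even = s - (d >> 1)
    odd = d + even
    out = np.empty(2 * len(s), dtype=np.int64)
    out[0::2], out[1::2] = even, odd
    return out
```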
In this paper a novel technique for rotation-independent template matching via quadtree Zernike decomposition is presented. Both the template and the target image are decomposed using a complex polynomial basis. The template is analyzed in a block-based manner using a quadtree decomposition, which allows the system to better identify the object features.
Searching for a complex pattern in a large multimedia database is then based on a sequential procedure that verifies whether the candidate image contains each square of the ranked quadtree list, refining the location and orientation estimates step by step.
KEYWORDS: Digital watermarking, Data hiding, Binary data, Image quality, Computer security, Visibility, Signal to noise ratio, Visualization, Multimedia
This paper presents a novel spatial data hiding scheme based on Least Significant Bit insertion. The bit-plane decomposition is obtained by using the (p, r) Fibonacci sequences and thus depends on two parameters, p and r. These values increase the security of the whole system: without their knowledge it is not possible to perform the same decomposition used in the embedding process and to extract the embedded information. Experimental results show the effectiveness of the proposed method.
KEYWORDS: Digital watermarking, Image encryption, Sensors, Computer security, Analog electronics, Multimedia, Cryptography, Quantization, Image processing, Signal to noise ratio
In this paper a joint watermarking and ciphering scheme for digital images is presented. Both operations are performed in a key-dependent transform domain. The commutative property of the proposed method allows ciphering a watermarked image without interfering with the embedded signal, or watermarking an encrypted image while still allowing perfect deciphering. Furthermore, the key dependence of the transform domain increases the security of the overall system. Experimental results show the effectiveness of the proposed scheme.
This paper proposes a novel data hiding scheme in which a payload is embedded into the discrete cosine transform domain. The characteristics of the Human Visual System (HVS) with respect to image fruition have been exploited to adapt the strength of the embedded data and integrated into the design of a digital image watermarking system. By using an HVS-inspired image quality metric, we study the relation between the amount of data that can be embedded and the resulting perceived quality. This study allows one to increase the robustness of the watermarked image without damaging the perceived quality or, as an alternative, to reduce the impairments produced by the watermarking process given a fixed embedding strength. Experimental results show the effectiveness and the robustness of the proposed solution.
KEYWORDS: Digital watermarking, Image encryption, Symmetric-key encryption, Computer security, Medical imaging, Sensors, RGB color model, Data hiding, Wavelet transforms, Multimedia
In this paper a novel method for watermarking and ciphering color images is presented. The aim of the system is
to allow the watermarking of encrypted data without requiring the knowledge of the original data. By using this
method, it is also possible to cipher watermarked data without damaging the embedded signal. Furthermore, the
extraction of the hidden information can be performed without deciphering the cover data and it is also possible
to decipher watermarked data without removing the watermark. The transform domain adopted in this work is the Fibonacci-Haar wavelet transform. The experimental results show the effectiveness of the proposed scheme.
In this paper the use of Internet Protocol version 6 (IPv6) packets to convey hidden information is explored. The possibility of hiding information in commonly used data transport mechanisms allows one to send extra information in a transparent way. The hidden message does not affect the routing mechanism, nor does it interfere with the security mechanisms implemented on IP-based networks, such as firewalls, intrusion detection systems, and authentication tools.
KEYWORDS: Digital watermarking, Data hiding, Sensors, Wavelets, Multimedia, Image quality, Visualization, Signal to noise ratio, Telecommunications, Mobile communications
In this contribution, we present a novel technique for imperceptible and robust watermarking of digital images. It is based on a two-level decomposition of the host image using the Fibonacci-Haar Transform (FHT) and on the Singular Value Decomposition (SVD) of the transformed subbands. The main contributions of this approach are the use of the FHT for hiding purposes, the flexibility in data hiding capacity, and the key-dependent secrecy of the adopted transform. The experimental results show the effectiveness of the proposed approach both in the perceived quality of the watermarked image and in robustness against the most common attacks.
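A minimal sketch of SVD-based embedding in a wavelet subband is given below; a standard Haar DWT (PyWavelets) stands in for the key-dependent Fibonacci-Haar Transform, and the additive rule on the singular values is illustrative rather than the paper's exact method.

```python
# Hedged sketch: watermark the singular values of the approximation subband.
import numpy as np
import pywt

def embed_svd(image, watermark, alpha=0.02):
    """image: 2-D float array; watermark: 1-D array at least as long as the subband rank."""
    ll, (lh, hl, hh) = pywt.dwt2(image, "haar")
    u, s, vt = np.linalg.svd(ll, full_matrices=False)
    s_marked = s + alpha * watermark[: len(s)]     # perturb the singular values
    ll_marked = (u * s_marked) @ vt                # rebuild the marked subband
    return pywt.idwt2((ll_marked, (lh, hl, hh)), "haar")
```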
In this paper, a novel authentication system combining biometric cryptosystems with digital watermarking is presented. One of the main vulnerabilities of existing data hiding systems is the public knowledge of the embedding domain. We propose the use of biometric data, namely the fingerprint minutiae set, for generating the encryption key needed to decompose an image with the Tree-Structured Haar transform. The uniqueness of the biometric key, together with other embedded biometric information, guarantees the authentication of the user. Experimental tests show the effectiveness of the proposed system.
In this paper a novel, fast, and robust data hiding technique based on key-dependent basis functions is presented.
The particular domain chosen for the embedding is built on the Tree-Structured Haar basis. A weight
function, based on Human Visual System features, is used as a mask for selecting the coefficients to be marked.
Experimental results show the effectiveness of the proposed method.
During contraction and stretching, muscles change shape and size, producing a deformation of skin tissues and a modification of the body segment shape. In human motion analysis, it is very important to take these phenomena into account. The aim of this work is the evaluation of skin and muscular deformation, and the modeling of body segment elastic behavior, obtained by analysing video sequences that capture a muscle contraction. The soft tissue modeling is accomplished by using triangular meshes that automatically adapt to the body segment during the execution of a static muscle contraction. The adaptive triangular mesh is built on reference points whose motion is estimated by using nonlinear operators. Experimental results, obtained by applying the proposed method to several video sequences containing isometric contractions of the biceps brachii, show the effectiveness of this technique.
In this contribution, we present an objective method for the no-reference assessment of the degradation of video quality induced by the reduction of the temporal resolution. The assessment of the jerkiness perceived by a human observer is performed by feeding a multilayer neural network with the statistical distributions of kinematic data (speed, acceleration, and jerk of objects on the image plane) evaluated on a video shot. To identify the neural network (architecture and parameters) that best fits human behavior, a subjective experiment has been performed. Validation of the model on the test set indicates a good match between the Mean Opinion Score (MOS) and the jerkiness indicator computed by the neural network.
Human movement analysis is generally performed with marker-based systems, which allow reconstructing, with high levels of accuracy, the trajectories of markers placed on specific points of the human body. Marker-based systems, however, show some drawbacks that can be overcome by the use of video systems applying markerless techniques. In this paper, a specifically designed computer vision technique for the detection and tracking of relevant body points is presented. It is based on the Gauss-Laguerre decomposition, and a Principal Component Analysis (PCA) technique is used to circumscribe the region of interest. Results obtained on both synthetic and experimental tests show a significant reduction in computational cost with no significant loss of tracking accuracy.
During contraction and stretching, muscles change shape and size, producing a deformation of skin tissues and a modification of the body segment shape. In human motion analysis, it is indispensable to take this phenomenon into account, and thus approximating body limbs as rigid structures appears restrictive. The present work aims at evaluating skin and muscular deformation, and at modeling body segment elastic behavior, by analysing video sequences that capture a sport gesture. The soft tissue modeling is accomplished by using triangular meshes that automatically adapt to the body segment during the execution of a static muscle contraction. The adaptive triangular mesh is built on reference points whose motion is estimated using a technique based on the Gauss-Laguerre expansion. Promising results have been obtained by applying the proposed method to a video sequence containing an isometric contraction of the upper arm.
Broadband multimedia communications will be a key element in the 3rd generation of wireless services. A primary challenge is the support of interactive services requiring synchronous playout of multimedia content with a maximum acceptable delay in a multi-user scenario. Subjective tests show that subjects prefer a lower-quality reproduction to a “hopping” video: the pause needed to re-buffer severely affects the overall quality. To maximize the number of active users that can be served with a predefined Quality of Service (QoS) level by a packet-oriented radio access, we propose the use of a multimedia proxy. This component may transcode the original video stream to match a predefined quality level based on the quality of each user's channel.
Postural ability can be evaluated through the analysis of body oscillations, by estimating the displacements of selected sets of body segments. The analysis of human movement is generally achieved with stereophotogrammetric systems that rely on the use of markers. Marker-based systems are costly and require patient setups that can be uncomfortable. On the other hand, the use of a force platform has some disadvantages: the acquisition of dynamic data permits estimating only the body oscillations as a whole, without any information about individual body segment movements. Some of these drawbacks can be overcome by the use of video systems applying a marker-free sub-pixel algorithm. In this paper, a novel method to evaluate balance strategies, which utilises commercially available systems and applies feature extraction and image processing algorithms, is presented.
KEYWORDS: Cultural heritage, 3D acquisition, 3D modeling, Visualization, 3D scanning, Holography, 3D image processing, Electronic imaging, Internet
One of the most powerful applications of the World Wide Web (WWW) is the storage and distribution of multimedia, integrating text, images, sound, videos, and hyperlinks. This is of particular interest in cultural heritage, because the best methods to convey complex knowledge in this field, to experts and non-experts alike, are visual representation and visual interaction.
In this work we propose 3D acquisition and digitizing techniques for the virtualized reality of small cultural heritage objects (a virtual gallery). The system used for creating the 3D shape is based on conoscopic holography. This is a non-contact three-dimensional measuring technique that makes it possible to produce holograms, even with incoherent light, with fringe periods that can be measured precisely to determine the exact distance to the measured point. It is suitable for obtaining high-resolution 3D profiles even on surfaces with uneven reflectivity (a common situation on the surface of cultural heritage objects). By conoscopic holography, a high-resolution 3D model can be obtained. However, accurate representation and high-quality display are fundamental requirements to avoid misinterpretation of the data. Therefore, the virtual gallery is obtained through a procedure involving 3D acquisition, 3D modeling, and visualization.
In this paper, a new no-reference metric for video quality assessment is presented. The proposed metric provides a measure of the quality of a video based on a feature that we believe is relevant for human observers: motion. The metric is based on an unconventional use of a data hiding system. The mark is inserted in specific areas of the video using a fragile embedding algorithm. The exact embedding location is determined by the amount of motion between pairs of consecutive frames, so that the mark is embedded exclusively into 'moving areas' of the video. At the receiver, the mark is extracted from the decoded video and a quality measure of the video is estimated by evaluating the degradation of the extracted mark. Simulation results indicate that the proposed quality metric is able to assess the quality of videos degraded by compression.
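As an illustration of how 'moving areas' could be selected from the amount of motion between consecutive frames, the block-selection sketch below uses a simple frame-difference energy test; the threshold, block size, and selection rule are assumptions, not the paper's.

```python
# Hedged sketch: select candidate embedding blocks from frame-difference energy.
import numpy as np

def moving_blocks(prev_frame, curr_frame, block=16, thresh=5.0):
    """Return (row, col) indices of blocks whose mean absolute frame
    difference exceeds a threshold; these are candidate embedding areas."""
    diff = np.abs(curr_frame.astype(np.float64) - prev_frame.astype(np.float64))
    h, w = diff.shape
    selected = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            if diff[r:r + block, c:c + block].mean() > thresh:
                selected.append((r // block, c // block))
    return selected
```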
KEYWORDS: 3D modeling, Holography, 3D acquisition, Crystals, Visualization, Data modeling, Virtual reality, Holograms, Digital holography, Reflectivity
Physical access to historic and artistic artifacts can be limited by many factors. In particular, access to collections of ancient coins is difficult, especially for students. Hence, a digital archive of high-quality three-dimensional models of coins, enabling remote fruition, is of great interest. In this work we propose 3D acquisition and digitizing techniques for the virtualized reality of ancient coins (a virtual gallery). The system used for creating the 3D shape of the coins is based on conoscopic holography. This is a non-contact three-dimensional measuring technique that makes it possible to produce holograms, even with incoherent light, with fringe periods that can be measured precisely to determine the exact distance to the measured point. It is suitable for obtaining high-resolution 3D profiles even on surfaces with uneven reflectivity (a common situation on the surface of ancient coins). By conoscopic holography, a high-resolution 3D model can be obtained. However, accurate representation and high-quality display are fundamental requirements to avoid misinterpretation of the data. Therefore, virtual galleries are obtained through a procedure involving 3D acquisition, 3D modeling, and visualization. In conclusion, we propose an optoelectronic application, integrated with multimedia techniques, aimed at improving the access to collections of ancient coins belonging to museums or private owners.
In this paper the motion estimation between two consecutive color frames is analyzed by means of the Laguerre-Gauss (LG) wavelet transform and mean field theory. This contribution extends some previous work of the authors in this field. By decomposing the frames using the LG functions, a local image representation is obtained; using the self-steerability property of the LG functions, it is possible to design a maximum likelihood estimation procedure to identify both the displacement and the rotation fields. To regularize the motion field while taking possible occlusions into account, the mean field theory has been used.
This paper deals with the problem of accurate position, orientation, and scale estimation (localization) of objects in images, in the context of the Gauss-Laguerre Transform (GLT) theory. The solution proposed here is based on the maximization of the likelihood functional, expressed by means of the Laguerre-Gauss expansion of the object and of the observed image. The original computational complexity of the problem, which would imply maximization over a four-dimensional parameter space, is drastically reduced by taking advantage of the properties of the GLT image representation system.
This paper presents a simple technique for estimating the space location from which a certain image has been taken. The basic assumption is that the scene portrayed in the image is planar. The method is based on the acquisition of a new set of images, closely resembling the given image. The location is recovered from the parameters describing the camera's pose during the acquisition of that among the new images showing the highest degree of correlation with the original image. An example of application of this technique is discussed in the paper.
In this contribution we propose a novel semantic-based architecture to manage multimedia data. We propose an innovative approach, introducing an abstraction level to study the relationships among low-level attributes, such as color and motion, in a systematic way before the estimation of the visual image content. The aim of this analysis is to unify the descriptor information and to gather it into structures that we call over-regions, which represent particular configurations of the objects to be recognized. This step enables effective object-based or event-based image recognition at the higher abstraction level. The case-based reasoning paradigm is used in our approach for the high-level analysis.
Recognition of patterns irrespective of their actual orientation is approached in the context of estimation theory. In particular, it is shown how, by using Circular Harmonic Filter banks, the problem is drastically simplified from a computational viewpoint. A general-purpose pattern detection/estimation scheme is finally illustrated by decomposing the images on an orthogonal basis formed by Laguerre-Gauss harmonic functions, and an application example is provided.
In this paper we address the classical problem of estimating, from a pair of consecutive frames of a video sequence, the motion (or velocity field, or optical flow) produced by planar translations and rotations, in the context of the Gauss-Laguerre Transform (GLT) theory. This contribution extends some previous works of the authors on wavelet-based, optimum scale-orientation independent pattern recognition. In particular, here we make use of an orthogonal system of Laguerre-Gauss wavelets. Each wavelet represents the image by translated, dilated, and rotated versions of a complex waveform, whereas, for a fixed resolution, this expansion provides a local representation of the image around any point. In addition, each waveform is self-steerable, i.e. it rotates by simple multiplication with a complex factor. These properties allow us to derive an iterative joint translation and rotation field Maximum Likelihood (ML) estimation procedure based on a bank of circular harmonic wavelets (CHWs). In this contribution the coarse estimate obtained by the memoryless, point-wise ML estimator is refined by resorting to a compound Markovian model that takes into account the spatial continuity of the motion field associated with a single object, by heavily penalizing abrupt changes in motion intensity and direction that are not located in correspondence with intensity discontinuities (i.e. mostly object boundaries).