The arrangement and overall structure of plant roots, known as root system architecture (RSA), play an important role in acquiring the water and nutrients essential for plant growth and development. Moreover, RSA demonstrates remarkable adaptability to environmental stresses, making it a central factor in plant adaptation. Root traits, including root length, root diameter, root length density (RLD), and the presence of root hairs, are crucial for optimizing resource utilization within the soil and enhancing productivity. Root hairs in particular play a central role in the overall health and functioning of plants. These microscopic, hair-like structures extend from the surface of root cells and greatly increase the root's surface area, accounting for approximately 70% of the total root area. Root hair characteristics, such as length and density, significantly enhance the uptake of soil nutrients and water. Despite this importance, root hairs are difficult to observe in low-resolution imagery. We therefore propose using deep learning-based image super-resolution methods as a pre-processing step that reconstructs finer details and structures within the root hairs, leading to a more accurate representation of their morphology. Such models need to evolve further in order to understand how root hairs respond under different environmental conditions and how this affects nutrient and water uptake.
This paper proposes a novel steganographic method that employs a feedback mechanism to improve the efficiency and stealth of data hiding within the Discrete Cosine Transform (DCT) coefficients of JPEG images. The method enhances the correlation between the hidden message and the cover image while minimizing perceptible changes to the image. The system starts by dividing the cover image into blocks and applying the DCT to each. It then evaluates the correlation between the hidden message and the DCT coefficients to identify potential data embedding points. A trained decision-rules algorithm then chooses the optimal data embedding technique, considering factors such as the size and location of the DCT coefficients within image blocks. Different embedding techniques are employed. The system subsequently generates feedback based on metrics such as image quality and data detectability, refining the decision rules' effectiveness over time. Through this dynamic approach, our system adaptively improves the data-hiding process, enhancing capacity and minimizing detectability. This work opens new doors in the realm of steganography, presenting an intelligent system capable of adaptively embedding data with optimized stealth and efficiency.
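As a minimal illustration of the block-DCT embedding stage only, the following Python sketch hides one bit per 8×8 block in the least significant bit of a quantized mid-frequency coefficient. The coefficient position and quantization step are illustrative assumptions; the paper's trained decision-rules selection and feedback loop are not reproduced.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    # 2-D type-II DCT with orthonormal scaling
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(coeffs):
    return idct(idct(coeffs, axis=0, norm='ortho'), axis=1, norm='ortho')

def embed_bits(cover, bits, pos=(4, 3), q=16.0):
    """Hide one bit per 8x8 block in the LSB of a quantized
    mid-frequency DCT coefficient (position and step are illustrative)."""
    stego = cover.astype(float).copy()
    h, w = stego.shape
    k = 0
    for y in range(0, h - 7, 8):
        for x in range(0, w - 7, 8):
            if k >= len(bits):
                return stego
            c = dct2(stego[y:y+8, x:x+8])
            level = int(np.round(c[pos] / q))
            if (level & 1) != bits[k]:        # force coefficient LSB to match the bit
                level += 1 if level >= 0 else -1
            c[pos] = level * q
            stego[y:y+8, x:x+8] = idct2(c)
            k += 1
    return stego
```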
Image classification tasks leverage Convolutional Neural Networks (CNNs) to yield accurate results that supersede their predecessor human-crafted algorithms. Applicable use cases include autonomous driving, face recognition, medical imaging, and more. Along with the growing use of AI image classification applications, we see emerging research on the robustness of such models to adversarial attacks, which exploit the unique vulnerabilities of Artificial Intelligence (AI) models to skew their classification results. While invisible to the Human Visual System (HVS), these attacks mislead the algorithms and yield wrong classification results. To be incorporated securely in real-world applications, AI-based image classification algorithms require protection that increases their robustness to adversarial attacks. We propose replacing the commonly used Rectified Linear Unit (ReLU) Activation Function (AF), which is piecewise linear, with non-linear AFs to increase robustness to adversarial attacks. This approach has been considered in recent research and is motivated by the observation that non-linear AFs tend to diminish the effect of adversarial perturbations in the DNN layers. To validate the approach, we applied Fast Gradient Sign Method (FGSM) and HopSkipJump (HSJ) attacks to a classification model trained on the MNIST dataset. We then replaced the AF of the model with non-linear AFs (Sigmoid, GELU, ELU, SELU, and Tanh). We conclude that while attacks on the original model have a 100% success rate, the attack success rate drops by an average of 10% when non-linear AFs are used.
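For concreteness, a minimal FGSM sketch; `loss_gradient` is an assumed helper that returns the gradient of the classification loss with respect to the input (any framework's autograd can supply it).

```python
import numpy as np

def fgsm_attack(x, y, loss_gradient, eps=0.1):
    """Fast Gradient Sign Method: perturb the input one step in the
    direction that increases the classification loss.
    `loss_gradient(x, y)` is an assumed helper returning d(loss)/dx."""
    grad = loss_gradient(x, y)
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)   # keep pixels in the valid range
```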
The BGU CubeSat satellite belongs to a class of low-cost, compact satellites. Its dimensions are 10×10×30 cm. It is equipped with a low-resolution 256×320-pixel short-wave infrared (SWIR) camera operating in the 1.55-1.7 µm wavelength band. Images are transmitted in bursts of tens of images at a time, with shifts of a few pixels from the first image to the last. Each image burst is suitable for Multiple Image Super-Resolution (MISR) enhancement. MISR can construct a high-resolution (HR) image from several low-resolution (LR) images, yielding an image that resolves more details, which is crucial for research in remote sensing. In this research, we verify the applicability of state-of-the-art (SOTA) deep learning MISR models that were developed following the publication of the PROBA-V MISR satellite dataset in the visible red and near-IR bands. Our SWIR multiple images differ from PROBA-V in the spectral band and in the method of collecting multiple images of the same location: our imagery is acquired in a burst of temporally very close images, whereas PROBA-V revisits the same location within a period of less than 30 days, assuming the soil dryness remains roughly the same. We compare the results of Single Image Super-Resolution (SISR) and MISR techniques to "off-the-shelf" products. The quality of the super-resolved images is compared using no-reference metrics suitable for remote sensing applications and by experts' visual inspection. Unlike the remarkable achievements of GAN techniques, which produce very appealing results that are not always faithful to the original ground truth, our super-resolved images should preserve the original details as much as possible for further scientific remote sensing analysis.
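As a baseline illustration of the MISR idea (not one of the SOTA deep models evaluated here), a classical shift-and-add sketch that exploits the few-pixel shifts within a burst; the sub-pixel shifts are assumed known or estimated by prior registration.

```python
import numpy as np
from scipy.ndimage import shift, zoom

def shift_and_add(lr_frames, subpixel_shifts, scale=2):
    """Classical shift-and-add MISR baseline: upsample each LR frame,
    undo its estimated sub-pixel shift, and average the stack.
    `subpixel_shifts` are (dy, dx) offsets of each frame w.r.t. the first."""
    acc = None
    for frame, (dy, dx) in zip(lr_frames, subpixel_shifts):
        up = zoom(frame.astype(float), scale, order=3)          # bicubic upsample
        aligned = shift(up, (-dy * scale, -dx * scale), order=3)
        acc = aligned if acc is None else acc + aligned
    return acc / len(lr_frames)
```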
It is anticipated that in some extreme situations, autonomous cars will benefit from the intervention of a "Remote Driver". The vehicle computer may discover a failure and decide to request remote assistance for safe roadside parking. In a more extreme scenario, the vehicle may require a complete remote-driver takeover due to malfunctions or an inability to resolve unknown decision logic. In such cases, the remote driver will need a sufficiently good quality real-time video stream of the vehicle cameras to respond quickly and accurately enough to the situation at hand. Relaying such a video stream to the remote Command and Control (C&C) center is especially challenging when considering the varying wireless channel bandwidths expected in these scenarios. This paper proposes an innovative end-to-end content-sensitive video compression scheme to allow efficient and satisfactory video transmission from autonomous vehicles to the remote C&C center.
Standard video compression algorithms use multiple "Modes", which are various linear combinations of pixels used to predict their neighbors within image Macro-Blocks (MBs). In this research, we use Deep Neural Networks (DNNs) with supervised learning to predict block pixels. Using DNNs and employing intra-block pixel-value calculations that penetrate into the block, we obtain improved predictions that reduce residual block errors by up to a factor of three. However, using intra-block pixels for prediction introduces interesting tradeoffs between prediction errors and quantization errors. We explore and explain these tradeoffs for two different DNN types. We further discovered that it is possible to achieve a larger dynamic range of the quantization parameter (Qp) and thus reach lower bit-rates than standard modes, which already saturate at these Qp levels. We explore this phenomenon and explain its cause.
Image steganography is the art of hiding information in a cover image such that a third party does not notice the hidden information. This paper presents a novel technique for image steganography in the spatial domain. The new method hides and recovers hidden information of substantial length within digital imagery while maintaining the size and quality of the original image. The image gradient is used to generate a saliency image, which represents the energy of each pixel in the image. Pixels with higher energy are more salient, and they are valuable for hiding data since the visual impairment they introduce is low. From the saliency image, a cumulative maximum-energy matrix is created; this matrix is used to generate horizontal seams that pass along the maximum-energy path. By embedding the secret bits of information along the seams, a stego-image containing the hidden message is created. In the stego-image, we ensure that the hidden data is invisible, with very little perceived degradation of image quality. The same algorithms are used to reconstruct the hidden message from the stego-image. Experiments were conducted using two types of images and two types of hidden data to evaluate the proposed technique. The experimental results show that the proposed algorithm has high capacity and good invisibility, with a Peak Signal-to-Noise Ratio (PSNR) of about 70 dB and a Structural SIMilarity index (SSIM) of about 1.
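A compact sketch of the seam-selection and embedding stages under stated assumptions: gradient magnitude as the energy map, a dynamic-programming cumulative maximum-energy matrix, and LSB embedding along the traced horizontal seam.

```python
import numpy as np

def energy_map(img):
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)                      # gradient-magnitude saliency

def max_energy_horizontal_seam(e):
    """Trace the horizontal seam (one pixel per column) that follows
    the path of maximum cumulative energy."""
    h, w = e.shape
    m = e.copy()
    for x in range(1, w):                        # cumulative maximum-energy matrix
        for y in range(h):
            lo, hi = max(0, y - 1), min(h, y + 2)
            m[y, x] += m[lo:hi, x - 1].max()
    seam = np.empty(w, dtype=int)
    seam[-1] = int(np.argmax(m[:, -1]))
    for x in range(w - 2, -1, -1):               # backtrack within +/-1 row
        y = seam[x + 1]
        lo, hi = max(0, y - 1), min(h, y + 2)
        seam[x] = lo + int(np.argmax(m[lo:hi, x]))
    return seam

def embed_along_seam(img, bits):
    stego = img.copy().astype(np.uint8)
    seam = max_energy_horizontal_seam(energy_map(img))
    for x, bit in zip(range(img.shape[1]), bits):
        stego[seam[x], x] = (stego[seam[x], x] & 0xFE) | bit   # LSB embedding
    return stego
```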
One fundamental component of video compression standards is Intra-Prediction. Intra-Prediction takes advantage of redundancy among neighboring pixel values within video frames to predict blocks of pixels from their surrounding pixels, thus allowing the transmission of prediction errors instead of the pixel values themselves. The prediction errors have smaller values than the pixels themselves, which enables compression of the video stream. Prevalent standards take advantage of intra-frame pixel-value dependencies to perform prediction at the encoder end and transfer only residual errors to the decoder. The standards use multiple "Modes", which are various linear combinations of pixels used to predict their neighbors within image Macro-Blocks (MBs). In this research, we used Deep Neural Networks (DNNs) to perform the predictions. Using twelve Fully Connected Networks, we reduced the Mean Square Error (MSE) of the prediction error by up to 3 times compared to standard modes' prediction results. This substantial improvement comes at the expense of more extensive computation. However, these extra computations can be significantly mitigated by the use of dedicated Graphical Processing Units (GPUs).
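A minimal PyTorch sketch of one such fully connected predictor; the block size, context layout, and layer widths are illustrative assumptions rather than the paper's twelve-network configuration.

```python
import torch
import torch.nn as nn

N = 8                  # block size (assumption)
CTX = 2 * N + 1        # context: N pixels above, N to the left, 1 corner (assumed layout)

class IntraPredictor(nn.Module):
    """Fully connected network that predicts an NxN pixel block from a
    vector of reconstructed neighboring pixels."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(CTX, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, N * N),
        )

    def forward(self, context):               # context: (batch, CTX)
        return self.net(context).view(-1, N, N)

# Supervised training minimizes the residual energy sent to the decoder:
#   loss = nn.MSELoss()(model(context), true_block)
```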
The demand for streaming video content is growing exponentially. Network bandwidth is very costly, and therefore there is a constant effort to improve video compression rates and enable the transmission of reduced data volumes while retaining quality of experience (QoE). One basic feature that utilizes the spatial correlation of pixels for video compression is Intra-Prediction, which determines the codec's compression efficiency. Intra-Prediction enables a significant reduction of the Intra-Frame (I-frame) size and therefore contributes to efficient exploitation of bandwidth. In this work, we propose new Intra-Prediction algorithms that improve the AV1 prediction model and provide better compression ratios. Two (2) types of methods are considered: (1) a new scanning-order method that maximizes spatial correlation in order to reduce the prediction error; and (2) new Intra-Prediction modes implemented in AV1. Modern video coding standards, including the AV1 codec, utilize fixed scan orders when processing blocks during intra coding. Fixed scan orders typically result in residual blocks with high prediction error, mainly in blocks with edges. This means that fixed scan orders cannot fully exploit the content-adaptive spatial correlations between adjacent blocks, so the bitrate after compression tends to be large. To reduce the bitrate induced by inaccurate intra prediction, the proposed approach adaptively chooses the scanning order of blocks according to a criterion of first predicting blocks with the maximum number of surrounding, already Inter-Predicted blocks. Using the modified scanning-order method and the new modes reduced the MSE by up to five (5) times compared to the conventional TM mode with raster scan, and by up to two (2) times compared to the conventional CALIC mode with raster scan, depending on the image characteristics (which determine the percentage of blocks predicted with Inter-Prediction, which in turn impacts the efficiency of the new scanning method). For the same cases, the PSNR improved by up to 7.4 dB and up to 4 dB, respectively. The new modes yielded a 5% improvement in BD-Rate over traditionally used modes when run on K-frames, which is expected to yield ~1% overall improvement.
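A hedged sketch of the scanning-order criterion described above: among the blocks not yet intra-predicted, always pick the one with the most already-available neighbors. The greedy search is deliberately unoptimized for readability.

```python
import numpy as np

def adaptive_scan_order(coded):
    """Greedy content-adaptive scan: repeatedly pick the block whose
    count of already-available (8-connected) neighbors is maximal.
    `coded` is a boolean grid marking blocks already reconstructed,
    e.g., by inter prediction."""
    coded = coded.copy()
    h, w = coded.shape
    order = []
    while not coded.all():
        best, best_n = None, -1
        for y in range(h):
            for x in range(w):
                if coded[y, x]:
                    continue
                n = coded[max(0, y-1):y+2, max(0, x-1):x+2].sum()
                if n > best_n:
                    best, best_n = (y, x), n
        order.append(best)
        coded[best] = True            # newly predicted block becomes context
    return order
```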
This paper deals with the implementation of a new edge detection algorithm based on the Phase Stretch Transform (PST) for the purpose of car license plate recognition. In the PST edge detection algorithm, the image is first filtered with a spatial kernel, followed by the application of a nonlinear frequency-dependent phase. The output of the transform is the phase in the spatial domain. The main step is the 2-D phase function, which is typically applied in the frequency domain. The amount of phase applied to the image is frequency dependent, with a higher amount of phase applied to higher-frequency features of the image. Since sharp transitions, such as edges and corners, contain higher frequencies, PST emphasizes the edge information. Features can be further enhanced by applying thresholding and morphological operations.
Here we investigate the influence of noise and blur on the ability to recognize the characters on the license plate by comparing our suggested algorithm with the well-known Canny algorithm.
We use several types of noise distributions, among them Gaussian noise, salt-and-pepper noise, and uniformly distributed noise, with several levels of noise variance. The simulated blur is related to the car velocity, and we applied several filters representing different velocities of the car.
Another interesting degradation that we intend to investigate is the case in which a laser-shield license plate cover is used to distort the image taken by the authorities.
Our comparison results are presented in terms of true-positive, false-positive, and false-negative probabilities.
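As a point of reference, a minimal numpy sketch of a PST-style edge detector. The smoothing kernel, the inverse-tangent phase profile, and all parameter values are simplified illustrative assumptions, not the exact transform used in the paper.

```python
import numpy as np

def pst_edges(img, lpf_sigma=0.2, S=0.5, W=12.0, thresh=0.3):
    """Phase Stretch Transform sketch: apply a low-pass (localization)
    kernel in the frequency domain, then a frequency-dependent nonlinear
    phase; the output is the phase of the inverse transform, which is
    large near sharp transitions (edges, corners)."""
    h, w = img.shape
    u = np.fft.fftfreq(h)[:, None]
    v = np.fft.fftfreq(w)[None, :]
    r = np.hypot(u, v)                            # radial spatial frequency
    F = np.fft.fft2(img.astype(float))
    F *= np.exp(-(r / lpf_sigma) ** 2)            # smoothing kernel
    rw = W * r
    phase = rw * np.arctan(rw) - 0.5 * np.log1p(rw ** 2)
    phase = S * phase / phase.max()               # normalized warped phase profile
    out = np.angle(np.fft.ifft2(F * np.exp(-1j * phase)))
    return out > thresh * out.max()               # threshold the phase image
```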
Prediction of visual saliency in images and video is a highly researched topic. Target applications include quality assessment of multimedia services in a mobile context, video compression techniques, recognition of objects in video streams, etc. In the framework of mobile and egocentric perspectives, visual saliency models cannot be founded only on bottom-up features, as suggested by feature integration theory; nor is the central-bias hypothesis respected. In this case, the top-down component of human visual attention becomes prevalent, and visual saliency can be predicted on the basis of seen data. Deep Convolutional Neural Networks (CNNs) have proven to be a powerful tool for the prediction of salient areas in still images. In our work, we also focus on the sensitivity of the human visual system to residual motion in a video. A deep CNN architecture is designed in which we incorporate, as primary input maps, the color values of pixels and the magnitude of local residual motion. Complementary contrast maps allow for a slight increase in accuracy compared to the use of color and residual motion only. The experiments show that the choice of input features for the deep CNN depends on the visual task: for interest in dynamic content, the 4K model with residual motion is more efficient, whereas for object recognition in egocentric video the purely spatial input is more appropriate.
The demand for high-quality video is permanently on the rise, and with it the need for more effective compression. The compression scope can be further expanded due to the increased spatial correlation of pixels within a high-quality video frame. One basic feature that takes advantage of pixels' spatial correlation for video compression is Intra-Prediction, which determines the codec's compression efficiency. Intra-Prediction enables a significant reduction of the Intra-frame (I-frame) size and, therefore, contributes to more efficient bandwidth exploitation. It has been observed that the intra-frame coding efficiency of VP9 is not as good as that of H.265/MPEG-HEVC. One possible reason is that HEVC's Intra-Prediction algorithm uses as many as 35 prediction directions, while VP9 uses only 9 directions, including the TM prediction mode. Therefore, there is high motivation to improve the Intra-Prediction scheme with new, original, and proprietary algorithms that will enhance the overall performance of Google's future codec and bring its performance closer to that of HEVC. In this work, instead of using different angles for prediction, we introduce four unconventional Intra-Prediction modes for the VP10 codec: Weighted CALIC (WCALIC), Intra-Prediction using a System of Linear Equations (ISLE), Prediction of Discrete Cosine Transform (PrDCT) coefficients, and Reverse Least Power of Three (RLPT). Evaluated on a selection of eleven (11) typical images with a variety of spatial characteristics, using the Mean Square Error (MSE) criterion, our proposed algorithms (modes) were preferred, and thus selected, for around 57% of the blocks, resulting in a 26% reduction of the average prediction error (MSE). We believe that our proposed techniques will achieve higher compression without compromising video quality, thus improving the Rate-Distortion (RD) performance of the compressed video stream.
KEYWORDS: Defense and security, Video, Digital watermarking, Image quality, Multimedia, Steganography, Image compression, Video compression, Image enhancement, Social networks
With the increasing popularity of video streaming services and multimedia sharing via social networks, there is a need to protect the multimedia from malicious use. An attacker may use steganography and watermarking techniques to embed malicious content in order to attack the end user. Most attack algorithms are robust to basic image processing techniques such as filtering, compression, and noise addition. Hence, in this article two novel real-time defense techniques are proposed: smart threshold and anomaly correction. Both techniques operate in the DCT domain and are applicable to JPEG images and H.264 I-frames. The defense performance was evaluated against a highly robust attack, and the perceptual quality degradation was measured by the well-known PSNR and SSIM quality assessment metrics. A set of defense techniques is suggested for improving the defense efficiency. For the most aggressive attack configuration, the combination of all the defense techniques results in 80% protection against cyber-attacks with a PSNR of 25.74 dB.
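The exact smart-threshold and anomaly-correction rules are the paper's contribution and are not reproduced here; the following is only an illustrative DCT-domain sanitization sketch in the same spirit: attenuate low-magnitude mid/high-frequency coefficients, where payloads typically hide, while protecting perceptually important low frequencies. The threshold and protected region are assumptions.

```python
import numpy as np
from scipy.fftpack import dct, idct

def sanitize_block(block, thresh=8.0, keep=4):
    """Illustrative DCT-domain defense (not the paper's exact rules):
    zero small mid/high-frequency coefficients, which carry most hidden
    payloads but little perceptual energy; the low-frequency keep x keep
    corner is left untouched to preserve image quality."""
    c = dct(dct(block.astype(float), axis=0, norm='ortho'), axis=1, norm='ortho')
    mask = np.ones_like(c, dtype=bool)
    mask[:keep, :keep] = False                   # protect DC and low frequencies
    c[mask & (np.abs(c) < thresh)] = 0.0
    return idct(idct(c, axis=0, norm='ortho'), axis=1, norm='ortho')
```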
During the last twenty years, digital imagers have spread into industrial and everyday devices, such as satellites, security cameras, cell phones, laptops, and more. "Hot pixels" are the main defects in remote digital cameras. In this paper we demonstrate an improvement over existing restoration methods that use (solely or as an auxiliary tool) some average of the surrounding single pixels, such as the method of the Chapman-Koren study [1, 2]. The proposed method uses the CALIC algorithm and adapts it to make full use of the surrounding pixels.
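For reference, a sketch of CALIC's gradient-adjusted predictor (GAP), as commonly stated in the literature, which estimates a pixel from its causal neighbors; a defective hot pixel can be replaced by this estimate. The paper's adaptation to the full (non-causal) neighborhood is not reproduced here.

```python
def gap_predict(W, N, NE, NW, WW, NN, NNE):
    """Gradient-adjusted predictor (GAP) at the heart of CALIC: weight
    the horizontal vs. vertical context by local gradient estimates."""
    dh = abs(W - WW) + abs(N - NW) + abs(N - NE)    # horizontal gradient
    dv = abs(W - NW) + abs(N - NN) + abs(NE - NNE)  # vertical gradient
    if dv - dh > 80:
        return W                                    # sharp horizontal edge
    if dh - dv > 80:
        return N                                    # sharp vertical edge
    pred = (W + N) / 2 + (NE - NW) / 4
    if dv - dh > 32:
        pred = (pred + W) / 2
    elif dv - dh > 8:
        pred = (3 * pred + W) / 4
    elif dh - dv > 32:
        pred = (pred + N) / 2
    elif dh - dv > 8:
        pred = (3 * pred + N) / 4
    return pred
```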
We propose novel models for image restoration based on statistical physics. We investigate the affinity between these fields and describe a framework from which interesting denoising algorithms can be derived: Ising-like models and simulated annealing techniques. When combined with known predictors such as Median and LOCO-I, these models become even more effective. To further examine the proposed models, we apply them to two important problems: (i) digital cameras in space damaged by cosmic radiation, and (ii) ultrasonic medical devices affected by speckle noise. The results, as well as benchmarks and comparisons, suggest in most cases a significant gain in PSNR and SSIM compared to other filters.
The popularity of low-delay video applications has increased dramatically over the last years due to a rising demand for real-time video content (such as video conferencing or video surveillance), and also due to the increasing availability of relatively inexpensive heterogeneous devices (such as smartphones and tablets). To this end, this work presents a comparative assessment of the two latest video coding standards, H.265/MPEG-HEVC (High-Efficiency Video Coding) and H.264/MPEG-AVC (Advanced Video Coding), and also of the VP9 proprietary video coding scheme. For evaluating H.264/MPEG-AVC, an open-source x264 encoder was selected, which has a multi-pass encoding mode similar to VP9. According to experimental results obtained by using similar low-delay configurations for all three examined representative encoders, H.265/MPEG-HEVC provides significant average bit-rate savings of 32.5% and 40.8% relative to VP9 and x264, respectively, for 1-pass encoding, and average bit-rate savings of 32.6% and 42.2% for 2-pass encoding. On the other hand, compared to the x264 encoder, typical low-delay encoding times of the VP9 encoder are about 2,000 times higher for 1-pass encoding and about 400 times higher for 2-pass encoding.
KEYWORDS: Forward error correction, Video, Scalable video coding, Receivers, Video compression, Signal to noise ratio, Surface plasmons, Control systems, Quantization, Automatic repeat request
Ideally, video streaming systems should provide the best-quality video a user's device can handle without compromising download speed. In this article, an improved video transmission system is presented that dynamically enhances the video quality based on a user's current network state and repairs errors caused by data lost during video transmission. The system incorporates three main components: Scalable Video Coding (SVC) with three layers, multicast based on Receiver Layered Multicast (RLM), and an UnEqual Forward Error Correction (FEC) algorithm. SVC provides an efficient method for offering different levels of video quality, stored as enhancement layers. In the presented system, a proportional-integral-derivative (PID) controller was implemented to dynamically adjust the video quality, adding or subtracting quality layers as appropriate. In addition, an FEC algorithm was added to compensate for data lost in transmission; a two-dimensional FEC from the Pro-MPEG Code of Practice #3 release 2 was used. Several bit-error scenarios (step function, cosine wave) with different bandwidth sizes and error values were simulated. The suggested scheme, which includes SVC video encoding with 3 layers over IP multicast with an unequal FEC algorithm, was investigated under different channel conditions, variable bandwidths, and different bit error rates. The results indicate an improvement of the video quality in terms of PSNR over previous transmission schemes.
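A hedged sketch of the PID layer-adaptation idea; the gains, the error signal, and the mapping from control effort to the number of enhancement layers are illustrative assumptions to be tuned per deployment, not the paper's exact controller.

```python
class PIDLayerController:
    """Illustrative PID loop that maps the surplus between available
    bandwidth and the bandwidth consumed by the current layers to the
    number of SVC enhancement layers (0..max_layers)."""
    def __init__(self, kp=0.6, ki=0.1, kd=0.05, max_layers=2):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.max_layers = max_layers
        self.integral = 0.0
        self.prev_err = 0.0

    def update(self, available_bw, consumed_bw, dt=1.0):
        err = available_bw - consumed_bw          # surplus (+) or deficit (-)
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        u = self.kp * err + self.ki * self.integral + self.kd * deriv
        # positive control effort -> room for more enhancement layers
        return int(max(0, min(self.max_layers, round(u))))
```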
Infrared (IR) imagery sequences are commonly used for detecting moving targets in the presence of evolving cloud clutter or background noise. This research focuses on slow-moving point targets that are smaller than one pixel in size, such as aircraft at long ranges from a sensor. Target detection performance is measured via the variance estimation ratio score (VERS), which essentially calculates pixel scores over the sequences, where a high score indicates that a target is suspected to traverse the pixel. VERS uses two parameters, long- and short-term windows, which were predetermined individually for each sequence depending on the target velocity and on the intensity and amount of clouds, as opposed to clear sky (noise), in the background. In this work, we examine the correlation between the sequences' spatial and temporal features and these two windows. In addition, we modify the VERS calculation to enhance target detection and to decrease cloud-edge scores and false detections. We conclude this work by evaluating VERS as a detection measure, using its original version and its modified version. The test sequences are both original real IR sequences and their corresponding compressed sequences, produced using our designated temporal DCT quantization method.
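A hedged sketch of a per-pixel variance-ratio score over long and short temporal windows, consistent with the description above; this is an illustrative form, not the exact VERS definition from the cited literature.

```python
import numpy as np

def variance_ratio_scores(seq, long_win=32, short_win=4):
    """Per-pixel score over a (T, H, W) sequence: temporal variance over
    a long window divided by the mean of short-window variances.
    A slowly moving point target raises the long-term variance of the
    pixels it traverses while barely changing the short-term variance,
    so high scores flag suspected target trajectories.
    (long_win is assumed to be a multiple of short_win.)"""
    tail = seq[-long_win:]
    long_var = tail.var(axis=0)
    chunks = tail.reshape(-1, short_win, *seq.shape[1:])
    short_var = chunks.var(axis=1).mean(axis=0)
    return long_var / (short_var + 1e-9)
```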
KEYWORDS: Video coding, Video, Video compression, Telecommunications, Internet, Optical engineering, Scalable video coding, Image processing, Communication engineering, Video processing
Nowadays, we cannot imagine our life without video content and without devices that enable us to acquire and display such content. According to recent research, in 2012 video content transfer accounted for around 60% of overall Internet data transfer, and overall video transfer (including the Internet) could reach 90% during the next four years. TV sets supporting only full high-definition (HD) resolution (i.e., 1080p) are already considered outdated due to the dramatic demand for ultra-HD resolution, which often refers to 3840×2160 (4K) or 7680×4320 (8K) resolutions. So, what are the key factors behind such tremendous progress? If you are reading this special section on video compression technology, we are sure that you know the answer…
During the last decades, statistical models, such as the Ising model, have become very useful in describing solid-state systems. These models excel in their simplicity and versatility, and their results quite often receive accurate experimental confirmation. Leading researchers have used them successfully in recent years to restore images. A simple method based on the Ising model was recently used to restore B/W and grayscale images, achieving preliminary results. In this paper we first outline the analogy between statistical physics and image processing. We then present the results we have achieved using a similar, though more complex, iterative model in order to obtain a better restoration. Moreover, we describe models which enable us to restore color images. Additionally, we present the results of a novel method in which similar algorithms enable us to restore degraded video signals. We compare our outcomes with the results achieved by the simple algorithm and by the median filter for various kinds of noise. Our model reaches PSNR values that are 2-3 dB higher, and SSIM values that are 15%-20% higher, than the results achieved by the median filter for video restoration.
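A minimal simulated-annealing sketch for a binary (+1/-1) image under an Ising-like energy, illustrating the framework; the paper's iterative model and its color/video extensions are more complex, and the coupling constants and cooling schedule below are assumptions.

```python
import numpy as np

def ising_denoise(noisy, beta=2.0, eta=1.5, sweeps=20, T0=4.0):
    """Simulated-annealing denoising of a +1/-1 image under the energy
    E = -beta * sum(x_i x_j over neighbors) - eta * sum(x_i y_i),
    i.e., smoothness coupling plus fidelity to the observed pixels y."""
    x = noisy.astype(int).copy()
    h, w = x.shape
    rng = np.random.default_rng(0)
    for s in range(sweeps):
        T = T0 * 0.9 ** s                          # geometric cooling schedule
        for y in range(h):
            for c in range(w):
                nb = sum(x[yy, cc]
                         for yy, cc in ((y-1, c), (y+1, c), (y, c-1), (y, c+1))
                         if 0 <= yy < h and 0 <= cc < w)
                # energy increase caused by flipping pixel (y, c)
                dE = 2 * x[y, c] * (beta * nb + eta * noisy[y, c])
                if dE < 0 or rng.random() < np.exp(-dE / T):
                    x[y, c] = -x[y, c]             # Metropolis acceptance
    return x
```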
This paper contributes to no-reference video quality assessment of broadcast HD video over IP networks and DVB. In this work we have enhanced our bottom-up spatio-temporal saliency map model by considering the semantics of the visual scene. Thus we propose a new saliency map model based on face detection, which we call a semantic saliency map. A new fusion method is proposed to merge the bottom-up saliency maps with the semantic saliency map. We show that our NR metric WMBER, weighted by the spatio-temporal-semantic saliency map, provides better results than WMBER weighted by the bottom-up spatio-temporal saliency map. Tests are performed on two H.264/AVC video databases for video quality assessment over lossy networks.
The human visual system is very complex and has been studied for many years, specifically for the purpose of efficient encoding of visual content, e.g., video content from digital TV. There is physiological and psychological evidence indicating that viewers do not pay equal attention to all exposed visual information, but only focus on certain areas known as the focus of attention (FOA) or saliency regions. In this work, we propose a novel objective quality assessment metric for assessing the perceptual quality of decoded video sequences affected by transmission errors and packet losses. The proposed method weights the Mean Square Error (MSE) according to the saliency map calculated at each pixel, yielding a Weighted MSE (WMSE). Our method was validated through subjective quality experiments.
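The weighting scheme reduces to a few lines; here is a sketch, assuming the saliency map is non-negative so it can be normalized to sum to one.

```python
import numpy as np

def wmse(ref, dist, saliency):
    """Saliency-weighted MSE: squared errors are weighted by a
    normalized saliency map so that distortions inside the focus of
    attention dominate the score."""
    w = saliency / saliency.sum()
    return float((w * (ref.astype(float) - dist.astype(float)) ** 2).sum())
```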
With the increased use of multimedia technologies, image compression has become increasingly popular. Compression decreases the high demands for storage capacity and transmission bandwidth. However, when compressing an image, some of the information is lost, since compression smooths high frequencies, thereby distorting small details. This issue is crucial, especially in military, surveillance, and medical systems. When planning these kinds of systems, the image compression quality must be considered, as well as how it affects the mission performance of the user. Our goal is to examine the behavior of the human eye during image scanning and to try to quantify the effect of image compression on observer tasks such as target acquisition. For this task, we used the standard JPEG2000 to compress the images at different compression ratios, ranging from 10% (the highest compression) to 100% (the original image). It was found that animation images were more influenced by compression than thermal images. In general, as the compression ratio increased, the ability to acquire the targets decreased.
We analyze the connection between viewer-perceived quality and encoding schemes. The encoding schemes depend on the transmission bit-rate, MPEG compression depth, frame size, and frame rate in a constant-bit-rate (CBR) transmission of an MPEG-2 video sequence. The compressed video sequence is transmitted over a lossy communication network with quality of service (QoS) and a certain Internet Protocol (IP) loss model. On the end-user side, viewer-perceived quality depends on changes in the network conditions, the video compression, and the video content complexity. We demonstrate that, when jointly considering the impact of coding bit rate, packet loss, and video complexity, there is an optimal encoding scheme, which also depends on the video content. We use a set of subjective tests to demonstrate that this optimal encoding scheme maximizes the viewer-perceived quality.
KEYWORDS: Modulation transfer functions, Image restoration, Filtering (signal processing), Sensors, Electronic filtering, Optical transfer functions, Signal to noise ratio, Image sensors, Fourier transforms, Linear filtering
This paper presents a new scheme for compact shape coding, which can reduce the required bandwidth for low bit-rate MPEG-4 applications. Our scheme is based on a coarse representation of the alpha plane with a block-size resolution of 8×8 pixels. This arrangement saves bandwidth and reduces the algorithm's complexity (number of computations) compared to the Content-based Arithmetic Encoding (CAE) algorithm. In our algorithm, we encode the alpha plane of a macroblock with only 4 bits, and the number of encoding bits can be further reduced by using a Huffman code. Only contour macroblocks are encoded: transparent macroblocks are treated as background macroblocks, while opaque macroblocks are treated as object macroblocks. We show that the bandwidth savings in representing the alpha plane can reach a factor of 9.5. Such a scheme is appropriate for mobile applications, where both bandwidth and processing power are scarce. We also expect our scheme to be compatible with the MPEG-4 standard.
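A sketch of the macroblock classification that precedes the coarse coding; the 4-bit codebook itself is the paper's design and is not reproduced here.

```python
import numpy as np

def classify_macroblocks(alpha, mb=16):
    """Classify each macroblock of a binary alpha plane as transparent
    (background), opaque (object), or contour; only contour macroblocks
    then receive the coarse 4-bit shape code."""
    h, w = alpha.shape
    labels = {}
    for y in range(0, h, mb):
        for x in range(0, w, mb):
            blk = alpha[y:y+mb, x:x+mb]
            if not blk.any():
                labels[(y, x)] = 'transparent'    # treated as background
            elif blk.all():
                labels[(y, x)] = 'opaque'         # treated as object
            else:
                labels[(y, x)] = 'contour'        # coarsely coded (4 bits)
    return labels
```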
By applying video smoothing techniques to real-time video transmission, the peak rate and rate variability of compressed video streams can be significantly reduced. Moreover, statistical multiplexing of the smoothed traffic can substantially improve network utilization. In this paper we propose a new smoothing scheme, which exploits statistical multiplexing gain that can be obtained after smoothing of individual video streams. We present a new bandwidth allocation algorithm that allows for responsive interactivity. The local re-smoothing algorithm is carried out using an iterative process.
In this paper we investigate the influence of motion sensor errors on the derivation of the MTF and its implementation in image restoration. We present an analytical approach for estimating the vibration MTF from the measured system MTF, based on the frequency response of the sensor and its noise data. The goal of this research is to describe an automatic system for the restoration of pictures blurred by vibration, and to consider its possible disadvantages. Our method is based on point-spread-function verification using motion sensor characteristics. We build an analytical model of the sensor and compare the MTF after sensor errors caused by system noise and by an incorrect axis direction of the restoration device. Here, we assume that noise and signal are independent and that the system noise is white Gaussian noise. Restorations of degraded images are presented based on improvements of the original Wiener filter. We compare the performance of inverse and Wiener filter operations and consider the dependence of restoration quality on the signal-to-noise ratio and on the angle between the restoration axis and the true vibration direction; the final graphs reveal an interesting and useful relationship between these quantities. As seen from our simulation, this leads to an improvement of the initial method. The key to the restoration is determination of the improved optical transfer function unique to the image vibration and sensor characteristics.
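For reference, a minimal frequency-domain Wiener restoration sketch, assuming the vibration OTF has been estimated and the noise is white, as in the abstract; the noise-to-signal ratio is a tunable assumption.

```python
import numpy as np

def wiener_restore(blurred, otf, nsr=1e-2):
    """Frequency-domain Wiener restoration: W = H* / (|H|^2 + NSR),
    where H is the (vibration) OTF and NSR the noise-to-signal power
    ratio; the regularization keeps noise from blowing up where H ~ 0."""
    G = np.fft.fft2(blurred.astype(float))
    W = np.conj(otf) / (np.abs(otf) ** 2 + nsr)
    return np.real(np.fft.ifft2(W * G))
```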
One result of the recent advances in different components of imaging systems technology is that these systems have become more resolution-limited and less noise-limited. The most useful tool for characterizing resolution-limited systems is the Modulation Transfer Function (MTF). The goal of this work is to use the MTF as an image quality measure for image compression implemented by the JPEG (Joint Photographic Experts Group) algorithm and for an MPEG (Motion Picture Experts Group) compressed video stream transmitted through a lossy packet network. Although we realize that the MTF is not an ideal parameter with which to measure image quality after compression and transmission, because the process is non-linear and not shift-invariant, we examine the conditions under which it can be used as an approximate criterion for image quality. The advantage of using the MTF of the compression algorithm is that it can easily be combined with the overall MTF of the imaging system.
Rapid advances in computer and telecommunication technologies have made integrated-services packet-switched networks possible. It is expected that a significant portion of future networks will carry prerecorded video. Traffic in applications such as Video on Demand (VoD) will include high-fidelity audio, short multimedia clips, and full-length movies. In this paper we study the effect of rate smoothing of prerecorded VBR video sources on network utilization and statistical multiplexing gain, such that the end users receive satisfactory service. The enhanced piecewise constant rate transmission and transport (e-PCRTT) algorithm, which utilizes equal-size intervals, is used as the smoothing algorithm, and its effect on the statistical characteristics of the multiplexed video is explored. We found that synchronization between the smoothing intervals can significantly improve the efficiency of the smoothing process by reducing the number of bandwidth changes and the rate variability of the multiplexed stream. We investigate how synchronized smoothing intervals influence the potential for statistical gain compared to unsynchronized streams, and we present several examples to illustrate the advantage of synchronized streams over unsynchronized ones.
An enhancement of the Piecewise Constant Rate Transmission and Transport (PCRTT) algorithm for reducing the burstiness of a video stream, based on smoothing over constant intervals, is proposed. The two algorithms are compared by testing 12 video streams compressed according to the Motion-JPEG format. The new algorithm, called e-PCRTT, is shown to construct transmission rate plans with smaller buffer sizes than the original PCRTT. Alternatively, for the same buffer size, e-PCRTT reduces the number of bandwidth changes compared to PCRTT. In addition, e-PCRTT produces a rate plan with a smaller initial playback delay, which benefits interactive applications. We also introduce a new scheme for multiplexing several smoothed video traces, synchronized and with fixed-size intervals, into a single constant-bit-rate channel.
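A minimal sketch of the underlying PCRTT idea, transmitting each equal-size interval at its average required rate; e-PCRTT's buffer-size and playback-delay optimizations are not reproduced, and a real plan must also respect buffer underflow/overflow constraints derived from the cumulative consumption curve.

```python
import numpy as np

def pcrtt_rate_plan(frame_bits, interval):
    """Piecewise constant rate plan over equal-size intervals: within
    each interval, transmit at the average rate the interval's frames
    require (bits per frame slot)."""
    frame_bits = np.asarray(frame_bits, dtype=float)
    rates = []
    for start in range(0, len(frame_bits), interval):
        chunk = frame_bits[start:start + interval]
        rates.append(chunk.mean())        # constant rate held for the interval
    return rates
```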
An algorithm to improve the image compression ratio by applying low-pass filtering before the compression process is presented. Pre-filtering images prior to encoding removes high frequencies of the original image and thus improves the overall performance of the coder. The image degradation caused by the filter, combined with the non-linear transformation of a typical compression algorithm, reduces the entropy of the original image, so that higher compression ratios can be achieved. The perceived image at the decoder side is reconstructed, using a priori knowledge of the degrading filter, by applying decompression and inverse filtering. The results of this work show an improvement of the compression ratio compared to the original Joint Photographic Experts Group (JPEG) algorithm, with only a small penalty in Mean Square Error. Our algorithm also succeeds in reducing the blocking effect that exists in the original JPEG algorithm.
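A sketch of the encoder/decoder pair under the assumption of a Gaussian pre-filter; the JPEG round-trip between the two stages is omitted, and the filter width and regularization constant are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def encode_side(img, sigma=1.2):
    """Low-pass prefilter before JPEG encoding: removing high
    frequencies lowers the entropy the coder must spend bits on."""
    return gaussian_filter(img.astype(float), sigma)

def decode_side(decoded, sigma=1.2, eps=1e-2):
    """Inverse filtering with a priori knowledge of the Gaussian
    prefilter (regularized to avoid noise amplification)."""
    h, w = decoded.shape
    u = np.fft.fftfreq(h)[:, None]
    v = np.fft.fftfreq(w)[None, :]
    # frequency response of a Gaussian with spatial std sigma
    H = np.exp(-2 * (np.pi ** 2) * (sigma ** 2) * (u ** 2 + v ** 2))
    G = np.fft.fft2(decoded.astype(float))
    return np.real(np.fft.ifft2(G * np.conj(H) / (np.abs(H) ** 2 + eps)))
```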
Any image acquired by optical, electro-optical, or electronic means is likely to be degraded by the environment. The resolution of the acquired image depends on the total MTF (Modulation Transfer Function) of the system and on the additive noise. Image restoration techniques can improve image resolution significantly; however, as the noise increases, improvements via image processing become more limited, because image restoration increases the noise level of the image. The purpose of this research is to check and characterize the influence of the MTF and the noise level on the target acquisition probability of a human observer, i.e., to determine whether restoration is worthwhile. The quantity measured directly is not the probability of detection, but rather the number of targets of different sizes and degradations recognized in each scene. Conditions under which restoration is advisable are determined. Further research will include real-world target recognition probability.
This paper deals with quantification of a two-dimensional (2- D) sampling process by pixel array. The idea is based on transformation of the Wigner-Seitz cell, which defines the sampling lattice in the spatial domain, into a 'bandwidth cell' in the spatial frequency domain. The area of the bandwidth cell is a quantitative measure of the sampling process. On this basis a description of the oversampling process is developed. We compare different configurations of the sampling pixel array.
The sampling MTF defined by Park, Hock, and de Luca for sampling along the x and y directions can be generalized to image data not aligned with the x and y axes. For a given sampling lattice (such as in a laser printer, a scene projector, or a focal plane array), we construct a two-dimensional sampling MTF based on the distance between nearest samples in each direction. Because the intersample distance depends on direction, the sampling MTF will be best in the directions of highest spatial sampling and poorer in the directions of sparse sampling. We compare hexagonal and rectangular lattices in terms of their equivalent spatial frequency bandwidth. We filter images as a demonstration of the angle-dependent two-dimensional sampling MTF.
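As a reminder of the quantity being generalized, the one-dimensional sampling MTF has the familiar sinc form; the directional version replaces the sampling pitch with the nearest-sample distance d(θ) in direction θ (a hedged reconstruction of the notation, not copied from the paper):

```latex
\mathrm{MTF}_{\mathrm{sampling}}(\xi,\theta)
  = \operatorname{sinc}\bigl(\xi\, d(\theta)\bigr)
  = \frac{\sin\bigl(\pi \xi\, d(\theta)\bigr)}{\pi \xi\, d(\theta)}
```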
The classical method for determining target acquisition probabilities has always focused on the maximum spatial frequency (frmax) discernible in the image. On the other hand, it is known that the atmosphere degrades all spatial frequencies, as determined by the atmospheric modulation transfer function (MTF). The question arises: do the 'other' frequencies below frmax affect the target acquisition probability? We present two experimental approaches to this question. In the first, we consider different atmospheric MTFs with the same value of frmax but with different MTF shapes. In the second, we consider a novel Wiener filter that restores all frequencies to their values prior to the atmospheric blur. Laboratory measurements of observer response time when performing target acquisition are presented for these cases. The results allow us to check the degree to which the entire MTF, rather than frmax only, should enter the target acquisition model.
Low-noise images are contrast-limited, and image restoration techniques can improve resolution significantly. However, as the noise level increases, resolution improvements via image processing become more limited, because image restoration increases noise. This research attempts to construct a reliable quantitative means of characterizing the perceptual difference between target and background. A method is suggested for evaluating the extent to which it is possible to discriminate an object that has merged with its surroundings, in noise-limited and contrast-limited images, i.e., how hard it would be for an observer to recognize the object against various backgrounds as a function of noise level. The suggested model is a first-order model to begin with, using a regular bar chart with additive uncorrelated Gaussian noise degraded by standard atmospheric blurring filters. The second phase will comprise a model dealing with higher-order images. This computational model relates the detectability or distinctness of the object to measurable parameters. It must also characterize human perceptual response, i.e., the model must develop metrics that are highly correlated with the ease or difficulty the human observer experiences in discerning the target from its background. This requirement can be fulfilled only by conducting psychophysical experiments that quantitatively compare the perceptual evaluations of the observers with the results of the mathematical model.
The restoration of images blurred by an optical transfer function (OTF), or by additive Gaussian noise, both of which affect the Fourier transform amplitude and phase of the image, is considered. A method for reconstructing a two-dimensional image from power spectral data is presented. The spatial frequencies at which the Fourier transform F(u,v) of an image equals zero are called real-plane zeros. It has been shown that real-plane zero locations have a significant effect on the Fourier phase, in that they are the end points of phase-function branch cuts, and that real-plane zero locations can be estimated from Fourier transform magnitude data. Thus, real-plane zeros can be utilized in phase retrieval algorithms to help constrain the possible Fourier transform phase function. The purpose of this research is to recover the Fourier transform phase function from knowledge of the power spectrum itself. By locating the points at which the Fourier transform intensity data are zero, we approximate a nonfactorizable function by its point-zero factors to recover an estimate of the object. A simple iterative method then successfully refines this phase estimate. The basic idea of the restoration is to separate the point-zeros of the modulation transfer function (MTF), or of the additive noise, from the point-zeros of the original image. Image restoration results obtained with this phase-retrieval method are also presented for images degraded by additive noise and a linear MTF.
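A minimal iterative refinement sketch in the Gerchberg-Saxton style, used here only to illustrate the refinement step; the point-zero factorization that produces the initial phase estimate is the paper's method and is not reproduced.

```python
import numpy as np

def refine_phase(magnitude, phase0, iters=200):
    """Alternate between enforcing the measured Fourier magnitude and
    spatial-domain non-negativity, starting from the phase estimate
    `phase0` (e.g., derived from real-plane zero locations)."""
    phase = phase0.copy()
    for _ in range(iters):
        f = magnitude * np.exp(1j * phase)       # Fourier-domain constraint
        img = np.real(np.fft.ifft2(f))
        img[img < 0] = 0.0                       # object-domain constraint
        phase = np.angle(np.fft.fft2(img))
    return np.real(np.fft.ifft2(magnitude * np.exp(1j * phase)))
```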
KEYWORDS: Modulation transfer functions, Target acquisition, Target detection, Spatial frequencies, Imaging systems, Systems modeling, Aerosols, Signal to noise ratio, Sensors, Visual system
Many of the standard considerations in modeling target acquisition of stationary targets in a single field of view must be altered when considering target acquisition in the actual battlefield. Sensors are scanning, targets are moving, and the detection process is considerably more complex than the analysis of the highest detectable frequency superimposed on the target size. In this paper, we discuss several of the issues involved. The use of a single detectable frequency to model the target acquisition process is not sufficient. We consider replacing this model with one consisting of an integral over the target spectrum, taking into account the spatial-frequency MTF dependence of the imaging system.
The resolution capability of imaging systems is affected by blur resulting from vibration and motion during the exposure. This blur is often more severe than the electronic and optical resolution limitations inherent in the system. Such image quality degradation must be considered when dealing with the development and analysis of automatic target recognition (ATR) systems. This research analyzes the influence of image vibrations and motion on the probability of acquiring a target with an ATR system. The analysis includes accepted metrics that characterize the relationship existing between the target and its background. A high level of correlation is expected between these factors and the probability of target detection, permitting efficient performance in the prediction and evaluation of any ATR system. Such correlations are considered here in the presence of sensor motion and vibration, and situations are considered in which the probability of recognition is improved by the motion, despite the blur. The results of this research can be implemented in military applications as well as in developing image restoration procedures for image-blur conditions.
When carrying out medical imaging based on the detection of isotopic radiation levels of internal organs such as the lungs or heart, distortion and blur arise as a result of organ motion during breathing and blood supply. Consequently, the image quality declines, despite the use of expensive high-resolution devices, and such devices are not exploited fully. There is thus a need to overcome the problem in alternative ways; one such alternative is image restoration. We have suggested and developed a method for calculating numerically the optical transfer function (OTF) for any type of image motion. The purpose of this research is the restoration of original isotope images (of the lungs) by reconstruction methods that depend on the OTF of the real-time relative motion between the object and the imaging system. This research uses different algorithms for the reconstruction of an image according to the OTF of the lung motion, which occurs in several directions simultaneously. One way of handling the 3D movement is to decompose the image into several portions, restore each portion according to its motion characteristics, and then combine all the image portions back into a single image. An additional complication is that the image was recorded at different angles. The application of this research is in medical systems requiring high-resolution imaging. The main advantage of this approach is its low cost compared to conventional approaches.
Image blurring due to sensor motion can often be the limiting factor in the resolution of the picture and, hence, in our ability to detect targets. In a recent paper, we proposed a method for the restoration of such images, and good results were obtained. In this paper we propose to quantitatively evaluate the effect of such a restoration process on automatic target acquisition. A method is derived to determine the probability of detection for targets that have been blurred by sensor motion. This method is then applied to restored and uncorrected pictures. The degree to which such real-time restoration is worthwhile is discussed.
In this paper, the incorporation of atmospheric aerosol and turbulence blur and motion blur into visible, near infrared, and thermal infrared target acquisition modeling is considered. Here, we show how the target acquisition probabilities and, conversely, the ranges at which objects can be detected are changed by the inclusion of these real-life environmental effects whose blur is often significantly greater than that of imaging system hardware. It is assumed that images are contrast-limited rather than noise-limited, as is indeed the case with most visible, near infrared (IR), and thermal IR sensors. For short focal lengths with low angular magnification, such environmental blur effects on target acquisition are negligible. However, for longer focal lengths with large angular magnification, resolution is limited by them and this has a strong adverse effect on target acquisition probabilities, times, and ranges. The considerable improvement possible with image correction for such environmental blur automatically in a fraction of a second is significant for contrast-limited imaging, and is discussed here too. Knowledge of such environmental MTF is essential to good system design and is also very useful in image restoration for any type of target or object.
This paper deals with the restoration of images blurred as a result of image motion or vibration. The key to restoration algorithm success is accurately deriving the Optical Transfer Function (OTF) representing the image motion degradation in the spatial frequency domain. The basic method of obtaining the OTF from the relative displacement between the camera and the object using a motion sensor was developed recently and is discussed elsewhere. In this paper, the motion function is derived instead from the analysis of a sequence of images. The first step is to obtain the image motion information from the sequence of images according to two well-known algorithms: the Block Matching Algorithm (BMA) and Edge Trace Tracking (ETT). The basis of these two methods is tracking a block or an edge through a sequence of several images. The results of the two methods were fitted to a sinusoidal function and compared, and there was excellent agreement between them. Finally, the image is restored using the OTF obtained from the tracking method.
The effect of linear image motion and high-frequency vibrations on human performance in target acquisition is considered. Two clutter metrics, one local and the other global, are combined into one metric of signal-to-clutter ratio (SCR). The SCR is used as a parameter in the model for actual target acquisition results. Two experiments involving human observers are considered: a static experiment with spatial filters representing image motion, and a dynamic experiment which imitates the operation of a scanning camera with a constant velocity. Both experiments show that image motion increases the time it takes the observer to detect a target. As the complexity of the original image increases, detection time is more affected, increasing more rapidly with blur radius in the first experiment and with velocity in the second.
One of the most apparent aspects of motion in real-time airborne systems is acceleration. In this paper, effects of acceleration are considered for two important concerns: image quality and target acquisition. A comparison between the effects of uniform velocity and accelerated motion is presented. There are two opposing considerations when flying over hostile territory: The pilot must fly as fast as possible so as not to be detected or attacked by the enemy, but sufficiently slow so that the degradation of the image quality will not be too severe. Mathematical tools developed recently permit a quantitative analysis of the effects of acceleration on the image quality and acquisition of the target.
The resolution capability of imaging systems is affected by blur due to the vibration and motion recorded in the image. This disturbance is often more severe than the electronic and optical limitations inherent in the system. This fact must be considered when dealing with the development and analysis of automatic target recognition (ATR) systems for military applications. The aim of this research is to analyze the influence of image vibrations and motion upon the probability of acquiring the target with an ATR system. The analysis includes factors that characterize the relationship existing between the target and its background. A high level of correlation is expected between these factors and the probability of target detection, enabling efficient performance in the prediction and evaluation of any ATR system. The results of this research can be implemented in military applications as well as in developing image restoration procedures for image-blur conditions.
The effect of low frequency mechanical vibrations in the image plane on thermal imaging target acquisition is considered. A model is described that takes as input the mechanical vibration data such as amplitude and frequency and the physical characteristics of the target. The output is the probability of detection as a function of time. Analysis indicates that even low amplitude vibration greatly affects the predicted target detection times and, hence, the utility of the IR system in realistic scenarios.
A method of calculating numerically the optical transfer function appropriate to any type of image motion and vibration, including random ones, has been developed. This method has been verified experimentally, and the close agreement justifies its implementation in image restoration for blurring deriving from any type of image motion. The goal of this research is to recover the original image from its degraded version. There are many methods of image restoration based on the point spread function; one of the common methods is the Wiener filter. Here, some image restorations of synthetically degraded and physically degraded images are presented, based on a constrained least-squares improvement of the original Wiener filter. The key to restoration is determination of the optical transfer function unique to each particular image motion and vibration.
A method of calculating numerically the optical transfer function appropriate to any type of image motion and vibration, including random ones, has been developed. We compare the numerical calculation method to the experimental measurement; the close agreement justifies implementation in image restoration for blurring from any type of image motion. In addition, statistics regarding the limitation of resolution as a function of relative exposure time for low-frequency vibrations involving random blur are described. An analytical approximation to the probability density function for random blur has been obtained. This can be used for the determination of target acquisition probability. A comparison of image quality is presented for three different types of motion: linear, acceleration, and high-frequency vibration for the same blur radius. The parameter considered is the power spectrum of the picture.
One of the most apparent types of motion in real-time systems is acceleration. In this paper, the effects of this kind of motion are considered for two important areas: image quality and target acquisition. A comparison between the effects of linear and accelerated motion is presented in the first section. The problem of accelerated motion is very critical for airborne photography over hostile territory, where there are two opposing considerations: on the one hand, the pilot must fly fast so as not to be revealed or attacked by the enemy; on the other hand, he must fly slowly enough that the degradation of the image quality will not be too severe. Mathematical tools developed recently permit a quantitative analysis of the effects of acceleration on image quality and on acquisition of the target.
A method of calculating numerically the optical transfer function appropriate to any type of image motion and vibration, including random ones, has been developed. Here, the numerical calculation method is compared to experimental measurement, and the close agreement justifies implementation in image restoration for blurring deriving from any type of image motion. In addition, statistics regarding limiting resolution as a function of relative exposure time for low-frequency vibrations involving random blur are described. An analytical approximation to the probability function has been obtained, which can be used in determining target acquisition probability. A comparison of image quality is presented for three different kinds of motion, linear, acceleration, and high-frequency vibration, for the same blur radius. The parameter considered is the power spectrum of the picture.
A method of calculating numerically the optical transfer function appropriate to any type of image motion and vibration, including random ones, has been developed. Here the numerical calculation method is compared to experimental measurement, and the close agreement justifies implementation in image restoration for blurring deriving from any type of image motion. In addition, statistics regarding limiting resolution as a function of relative exposure time for low-frequency vibrations involving random blur are described. These statistics can be used in determining target acquisition probabilities.
Low-frequency mechanical vibrations are a significant problem in robotics, machine vision, and practical reconnaissance where primary image vibrations involve random process blur radii. They cannot be described by an analytical MTF. A method of numerical calculation of MTF, relevant in principle to any type of image motion, is presented. It is demonstrated here for linear, high, and low vibration frequencies. The method yields the expected closed form solutions for linear and high-frequency motion. The low-vibration-frequency situation involves random process blur radii and MTFs that can only be handled statistically since no closed form solution is possible. This is illustrated here. Comparisons are made to a closed form approximate MTF solution suggested previously for low-frequency motion. Agreement between that analytical approximation and exact MTF calculated numerically is generally good, especially for relatively large and linear motion blur radius situations. For nonlinear short exposure motion, MTF levels off at relatively high nonzero values and never approaches zero. Such situations yield a two-fold benefit: (1) larger spatial frequency bandwidth and (2) higher MTF values at all spatial frequencies since MTF does not approach zero.
A new method of numerical calculation of MTF is presented here for image motion in one dimension. The method is applicable in principle to any type of motion and can be expanded to two-dimensional motion. It is applied here to uniform velocity motion and to sinusoidal vibrations. Comparison with known analytical methods is made where possible, and agreement is excellent. This supports its implementation for any kind of random motion, particularly where no unique analytical MTF is possible.
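A minimal numerical sketch of the method's core: accumulate the one-dimensional motion path into a point spread function (the time spent at each displacement) and take the magnitude of its Fourier transform. The sampling density and bin count are assumptions.

```python
import numpy as np

def motion_mtf(x_of_t, n_bins=256):
    """Numerical 1-D motion MTF: histogram the displacement samples
    x(t) over the exposure to get the motion PSF, then take the
    magnitude of its Fourier transform, normalized so MTF(0) = 1.
    Works for any recorded or simulated motion, including random
    vibration, where no closed-form MTF exists."""
    psf, _ = np.histogram(x_of_t, bins=n_bins)
    psf = psf / psf.sum()
    mtf = np.abs(np.fft.rfft(psf))
    return mtf / mtf[0]

# Example: sinusoidal vibration sampled densely over one exposure
t = np.linspace(0.0, 1.0, 10000)
mtf = motion_mtf(5.0 * np.sin(2 * np.pi * 3.0 * t))
```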
In many high-resolution photographic and photoelectronic imaging systems, resolution is limited by image motion and vibration and, as a result, the high-resolution capability of the sensor may be wasted. In normal reconnaissance and robotics the sensor moves during the exposure. Some of the resulting image motion can be removed by mechanical compensation, but not all of it. The residual motion blurs the image, and usually this blur becomes the limiting factor for many high-quality imaging systems. The ever-increasing altitudes and coverage requirements of modern imaging have put a premium on high resolution. An application of this paper is the recovery of the original image by inverse filtering that depends on the modulation transfer function (MTF) of the real-time relative motion between the object and the imaging system. An original method developed here for numerically calculating MTF for any type of image motion is the basis of the paper.
A theoretical model, developed by Wulich and Kopeika, that gives the MTF for low vibration frequency sinusoidal image motion applicable to reconnaissance, robotics, and computer vision, is evaluated experimentally to determine (1) the accuracy of the MTF model and the validity of the assumptions upon which it is based, (2) the accuracy of "lucky shot" theoretical analysis in determining the number of independent images required to obtain at least one good quality image, and (3) the accuracy of prediction of the average blur radius. In most cases agreement between theory and experiment is quite good. Discrepancies are not too great and are attributed to problems with underlying theoretical assumptions where uniform linear motion cannot be assumed. The theory and experiment here are confined to low-frequency sinusoidal vibration, where blur radius and spatial frequency content are random processes.
Numerical calculations of modulation transfer functions (MTFs) for low-frequency mechanical vibrations are presented. The problem is significant in practical reconnaissance where primary vibrations are at frequencies too low to be described by the usual closed form Bessel function MTF. The low vibration frequency situation involves random process blur radii and MTFs which can only be handled statistically since no closed form solution is possible. This is illustrated here. Comparisons are made to a closed form approximate MTF solution suggested previously. Agreement is generally good, especially for relatively large and linear blur radius situations.