In minimally invasive surgery, smoke generated by instruments such as electrocautery and laser ablation severely degrades image quality. This creates an uncomfortable view for the surgeon, which may increase surgical risk and degrade the performance of computer-assisted surgery algorithms such as segmentation, reconstruction, and tracking. Therefore, real-time smoke removal is required to maintain a clear field of view. In this paper, we propose a real-time smoke removal approach based on a Convolutional Neural Network (CNN). We propose an encoder-decoder architecture with a Laplacian image pyramid decomposition input strategy. This is an end-to-end network that takes the smoke image and its Laplacian image pyramid decomposition as inputs and outputs a smoke-free image directly, without relying on any physical model or the estimation of intermediate parameters. This design can easily be embedded into deep-learning-based downstream image-guided surgery tasks such as segmentation and tracking. A dataset of synthetic smoke images generated with Blender and Adobe Photoshop is employed to train the network. The result is evaluated quantitatively on synthetic images and qualitatively on a laparoscopic dataset degraded by real smoke. Our proposed method eliminates smoke effectively while preserving the original colors, and reaches 26 fps for 512 × 512 video on our training machine. The obtained results demonstrate not only the efficiency and effectiveness of the proposed CNN structure, but also the viability of training the network on a synthetic dataset.
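The Laplacian pyramid input mentioned above can be sketched as follows. The abstract does not specify the filters used, so this illustration substitutes 2×2 average pooling for Gaussian blur-and-decimate and nearest-neighbour upsampling for the expand step; both choices are assumptions, chosen so the decomposition inverts exactly.

```python
import numpy as np

def laplacian_pyramid(img, levels=3):
    """Build a Laplacian pyramid: band-pass residuals plus a coarse base.

    Uses 2x2 average pooling as a stand-in for Gaussian filtering
    (an assumption; the paper's exact filters are not specified).
    """
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        h, w = current.shape
        down = current.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        # nearest-neighbour upsampling back to the current resolution
        up = down.repeat(2, axis=0).repeat(2, axis=1)
        pyramid.append(current - up)   # band-pass (Laplacian) level
        current = down
    pyramid.append(current)            # low-frequency residual
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition exactly: upsample and add each residual."""
    current = pyramid[-1]
    for lap in reversed(pyramid[:-1]):
        current = current.repeat(2, axis=0).repeat(2, axis=1) + lap
    return current
```

Because the upsampling mirrors the downsampling, `reconstruct(laplacian_pyramid(img))` recovers the input exactly, which is the property that makes the pyramid a lossless multi-scale input representation for the network.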
There is growing interest in video-based solutions for people monitoring and counting in business and security applications. Compared to classic sensor-based solutions, video-based ones allow for more versatile functionality and improved performance at lower cost. In this paper, we propose a real-time system for people counting based on a single low-end, non-calibrated video camera. The two main challenges addressed in this paper are robust estimation of the scene background and of the number of real persons in merge-split scenarios. The latter is likely to occur whenever multiple persons move close together, e.g. in shopping centers. Several persons may be treated as a single person by automatic segmentation algorithms, due to occlusions or shadows, leading to under-counting. Therefore, to account for noise, illumination changes, and changes in static objects, background subtraction is performed using an adaptive background model (updated over time based on motion information) and automatic thresholding. Furthermore, post-processing of the segmentation results is performed in the HSV color space to remove shadows. Moving objects are tracked using an adaptive Kalman filter, allowing robust estimation of the objects' future positions even under heavy occlusion. The system is implemented in Matlab and gives encouraging results even at high frame rates. Experimental results obtained on the PETS2006 dataset are presented at the end of the paper.
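The adaptive background model and automatic thresholding described above can be sketched roughly as follows. The learning rate, the motion-freeze rule, and the mean-plus-two-sigma threshold are all assumptions for illustration; the paper's actual update and threshold-selection rules are not given in the abstract, and the HSV shadow-removal post-processing is omitted here.

```python
import numpy as np

ALPHA = 0.05  # background learning rate (assumed value)

def segment_foreground(background, frame):
    """Automatic thresholding of the frame-background difference.

    Threshold = mean + 2*std of the absolute difference (an assumed
    rule; the paper's exact threshold selection is unspecified).
    """
    diff = np.abs(frame - background)
    threshold = diff.mean() + 2.0 * diff.std()
    return diff > threshold

def update_background(background, frame, foreground_mask, alpha=ALPHA):
    """Running-average update, frozen where motion was detected so that
    moving people are not absorbed into the background model."""
    blended = (1.0 - alpha) * background + alpha * frame
    return np.where(foreground_mask, background, blended)
```

A per-frame loop would call `segment_foreground` first, then feed the resulting mask into `update_background` so only confirmed-static pixels adapt.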
In this paper we propose a generic framework for efficient retrieval of audiovisual media based on its audio content. The framework is implemented in a client-server architecture, where the client application is developed in Java to be platform independent and the server application is implemented for the PC platform. The client application adapts to the characteristics of the mobile device on which it runs, such as screen size and available commands. The entire framework is designed to take advantage of high-level segmentation and classification of audio content to improve the speed and accuracy of audio-based media retrieval. The primary objective of this framework is therefore to provide an adaptive basis for performing efficient video retrieval operations based on audio content and type (i.e. speech, music, fuzzy, and silence). Experimental results confirm that such an audio-based video retrieval scheme can be used from mobile devices to search for and retrieve video clips efficiently over wireless networks.
In this paper we present a novel approach to shape similarity estimation based on ordinal correlation. The proposed method operates in three steps: object alignment, contour-to-multilevel-image transformation, and similarity evaluation. This approach is suitable for use in content-based image retrieval (CBIR), shape classification, and performance evaluation of segmentation algorithms. The proposed technique produced encouraging results when applied to the MPEG-7 test data.
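The ordinal correlation step can be illustrated with a Kendall-style rank-agreement measure between the two multilevel images: it scores how often pairs of pixel values are ordered the same way in both images. This is only an illustrative stand-in; the abstract does not specify the exact correlation measure the paper uses.

```python
from itertools import combinations

def ordinal_correlation(a, b):
    """Kendall-tau-style ordinal agreement between two equal-length
    sequences of pixel values (flattened multilevel images).

    Returns +1 for identical orderings, -1 for fully reversed ones.
    This is an assumed, illustrative metric, not the paper's exact one.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        da = (a[i] > a[j]) - (a[i] < a[j])  # sign of a[i] - a[j]
        db = (b[i] > b[j]) - (b[i] < b[j])
        if da * db > 0:
            concordant += 1
        elif da * db < 0:
            discordant += 1
    total = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / total
```

Because the score depends only on value orderings, it is insensitive to monotonic intensity changes between the two multilevel images, which is the usual motivation for ordinal measures.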
In this paper, we present a novel approach for describing and estimating the similarity of shapes. The target application is content-based indexing and retrieval over large image databases. The shape feature vector is based on efficient indexing of high curvature points (HCPs), which are detected at different levels of resolution of the wavelet transform modulus maxima decomposition. The scale information, together with other topological information about these high curvature points, is employed in a sophisticated similarity algorithm. Experimental results and comparisons show that the technique efficiently isolates similar shapes from a large database and adequately reflects human similarity perception. The proposed algorithm also proved efficient in matching heavily occluded contours with their originals and with other database contours containing similar portions.
In this paper we present a technique for shape similarity estimation for content-based indexing and retrieval over large image databases. High curvature points are detected using wavelet decomposition, and the feature set is extracted within a polygonal approximation framework, using simple features computed at the high curvature points. Experimental results and comparisons demonstrate the performance of the proposed technique, which can also be extended to the retrieval of 3D objects.
Until recently, collections of digital images were stored in classical databases and indexed by keywords entered by a human operator. This is no longer practical, due to the growing size of these collections. Moreover, the keywords associated with an image are either selected from a fixed set of words, and thus cannot cover the content of all images, or they are the operator's personal description of each image and are therefore subjective. This is why systems for content-based image indexing are needed. In this context, we propose in this paper a new system, MUVIS*, for content-based indexing and retrieval in image database management systems. MUVIS* indexes images by keywords, and also allows indexing of objects and images based on color, texture, shape, and the layout of objects inside them. Due to the use of large feature vectors, pyramid trees are adopted for the index structure. The block diagram of the system is presented and the functionality of each block is explained. The features used are presented as well.
Rational filters are extended to multichannel signal processing and applied to the image interpolation problem. The proposed nonlinear interpolator exhibits desirable properties, such as edge and detail preservation. In this approach, the pixels of the color image are treated as 3-component vectors in the color space, so the inherent correlation between the different color components is not ignored, leading to better image quality than that obtained by component-wise processing. Simulations show that edges obtained using vector rational filters (VRF) are free from the blockiness and jaggedness usually present in images interpolated using linear and even some nonlinear techniques, e.g. vector median hybrid filters (VFMH).
There is a finite number of distinct weighted order statistic (WOS) filters of a fixed length N. However, even for relatively small values of N, one cannot immediately see whether two given WOS filters are the same simply by looking at their weights and thresholds. This problem is addressed in this paper. We define two WOS filters to be equivalent (the same) if they produce the same output for arbitrary inputs. We show that an exact solution requires integer linear programming, and we then develop a hierarchical heuristic procedure that may provide a much quicker answer. The hierarchy starts with simple checks and proceeds to more and more complicated tests; the procedure exits as soon as a definite conclusion is reached.
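The equivalence problem above can be made concrete with a small sketch. One common definition of a WOS filter's output is the T-th largest value of the multiset in which each sample is replicated according to its weight; and since WOS filters are stack filters, two of them agree on all real-valued inputs exactly when they agree on all binary inputs. The brute-force check below is exponential in N — it is the baseline that the paper's hierarchical procedure is designed to beat, not the paper's method itself.

```python
from itertools import product

def wos(x, weights, threshold):
    """Weighted order statistic: the threshold-th largest value of the
    multiset where sample x[i] appears weights[i] times."""
    expanded = sorted(
        (v for v, w in zip(x, weights) for _ in range(w)), reverse=True)
    return expanded[threshold - 1]

def equivalent(w1, t1, w2, t2):
    """Brute-force equivalence test (exponential in N): by the stacking
    property, two WOS filters agree on all inputs iff they agree on
    every binary input vector."""
    n = len(w1)
    return all(wos(x, w1, t1) == wos(x, w2, t2)
               for x in product((0, 1), repeat=n))
```

For example, weights (1, 1, 1) with threshold 2 define the standard median of three samples, and doubling all weights while doubling the threshold yields the same filter, which the check confirms.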