Standard video compression algorithms use multiple “Modes”: various linear combinations of pixels used to predict their neighbors within image Macro-Blocks (MBs). In this research, we use Deep Neural Networks (DNNs) with supervised learning to predict block pixels. By employing DNNs and incorporating intra-block pixel values that reach deep into the block, we obtain improved predictions that reduce residual block errors by up to a factor of two. However, using intra-block pixels for prediction gives rise to interesting tradeoffs between prediction errors and quantization errors. We explore and explain these tradeoffs for two different DNN types. We further discovered that it is possible to exploit a larger dynamic range of the quantization parameter (Qp), and thus to reach lower bit-rates than the standard modes, which already saturate at these Qp levels. We explore this phenomenon and explain its cause.
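To make the notion of prediction “Modes” concrete, the sketch below implements simplified DC, vertical, and horizontal intra-prediction modes as linear combinations of the block's causal neighbors, and measures the residual error they leave behind. The mode definitions here are illustrative simplifications, not the exact formulas of any particular codec standard.

```python
import numpy as np

def intra_predict(block_size, top, left, mode):
    """Predict a square block from its top-row and left-column neighbors.
    Simplified stand-ins for standard intra modes (illustrative only)."""
    if mode == "DC":
        # Flat prediction: average of all neighbor pixels.
        pred = np.full((block_size, block_size),
                       (top.mean() + left.mean()) / 2.0)
    elif mode == "vertical":
        # Copy the top row down through the block.
        pred = np.tile(top, (block_size, 1))
    elif mode == "horizontal":
        # Copy the left column across the block.
        pred = np.tile(left[:, None], (1, block_size))
    else:
        raise ValueError(f"unknown mode: {mode}")
    return pred

def residual_mse(block, pred):
    """Mean squared residual the encoder would have to transmit."""
    return float(np.mean((block - pred) ** 2))

# A block with a smooth top-to-bottom gradient: the horizontal mode
# (copying the left column) predicts it exactly, so its residual is zero.
N = 4
block = np.tile(np.arange(N, dtype=float)[:, None] * 10, (1, N))
top = block[0].copy()       # pretend the neighbors equal the edge pixels
left = block[:, 0].copy()
print(residual_mse(block, intra_predict(N, top, left, "horizontal")))  # 0.0
print(residual_mse(block, intra_predict(N, top, left, "vertical")))
```

The encoder would evaluate each mode this way and signal the one with the smallest residual; the DNN approach described above replaces these fixed linear combinations with a learned, nonlinear predictor.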
One fundamental component of video compression standards is Intra-Prediction, which exploits the redundancy among neighboring pixel values within a video frame to predict blocks of pixels from their surrounding pixels, allowing the encoder to transmit the prediction errors instead of the pixel values themselves. Because the prediction errors are smaller in magnitude than the pixels, this achieves compression of the video stream. Prevalent standards exploit these intra-frame pixel-value dependencies to perform prediction at the encoder and transfer only the residual errors to the decoder. The standards use multiple “Modes”: various linear combinations of pixels used to predict their neighbors within image Macro-Blocks (MBs). In this research, we have used Deep Neural Networks (DNNs) to perform the predictions. Using twelve Fully Connected networks, we reduced the Mean Square Error (MSE) of the prediction by a factor of up to 3 compared with standard-mode prediction results. This substantial improvement comes at the expense of more extensive computation, which can, however, be significantly mitigated by the use of dedicated Graphics Processing Units (GPUs).
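A minimal sketch of the fully-connected approach: a small one-hidden-layer network maps a block's causal neighbors (top row plus left column) to the block's pixels, trained by plain gradient descent to minimize the residual MSE. All sizes here (4×4 blocks, 8 neighbor inputs, 32 hidden units, synthetic ramp data) are assumptions for illustration, not the paper's actual architecture or training setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data: each 4x4 block is a smooth linear ramp; the
# network sees the 8 causal neighbors and must predict all 16 block pixels.
def make_sample():
    a, b = rng.uniform(-1, 1, 2)
    grid = a * np.arange(4)[:, None] + b * np.arange(4)[None, :]
    top, left = grid[0], grid[:, 0]
    return np.concatenate([top, left]), grid.ravel()

X, Y = map(np.array, zip(*(make_sample() for _ in range(512))))

# One-hidden-layer fully connected network, small random init.
W1 = rng.normal(0, 0.1, (8, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.1, (32, 16)); b2 = np.zeros(16)

def forward(x):
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden layer
    return h, h @ W2 + b2

losses, lr = [], 0.05
for _ in range(300):
    h, pred = forward(X)
    err = pred - Y
    losses.append(float(np.mean(err ** 2)))
    # Manual backprop through the two layers.
    g2 = 2 * err / len(X)
    gW2 = h.T @ g2; gb2 = g2.sum(0)
    gh = (g2 @ W2.T) * (h > 0)
    gW1 = X.T @ gh; gb1 = gh.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print(losses[0], losses[-1])   # residual MSE should drop substantially
```

In a real codec, one such network would be trained per mode or block configuration (the paper uses twelve), and the learned predictor replaces the fixed linear mode formulas; the heavier arithmetic is exactly the kind of dense matrix work that GPUs accelerate well.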