Abstract

Joint Adaptive Pre-processing Resilience and Post-processing Concealment Schemes for 3D Video Transmission

3D Multi-View Video (MVV) consists of multiple video streams captured simultaneously by several cameras around one scene. In Multi-view Video Coding (MVC), the spatio-temporal and inter-view correlations between frames and views are often used for error concealment. 3D video transmission over error-prone networks remains a substantial issue because of restricted resources and the presence of severe channel errors. Efficiently compressing 3D video at a low transmission rate while maintaining high quality of the received 3D video is extremely challenging. Because real-time applications and limited resources make it infeasible to re-transmit all the corrupted Macro-Blocks (MBs), the lost MBs must be recovered at the decoder side using suitable post-processing schemes such as Error Concealment (EC). EC algorithms have the advantage of enhancing the received 3D video quality without any modification to the transmission rate or to the encoder hardware or software. In this presentation, I will explore several Adaptive Multi-Mode EC (AMMEC) algorithms at the decoder, based on various adaptive pre-processing techniques at the encoder, i.e. Flexible Macro-block Ordering Error Resilience (FMO-ER), to efficiently conceal and recover the erroneous MBs of intra- and inter-coded frames of the transmitted 3D video. I will also present extensive experimental simulation results showing that the proposed schemes can significantly improve the objective and subjective 3D video quality.
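To make the interplay between FMO at the encoder and EC at the decoder more concrete, the following is a minimal Python sketch, not the AMMEC algorithm itself. It assumes a two-group checkerboard (dispersed) macro-block map and a single, purely spatial concealment mode in which each lost MB is replaced by the mean of its received neighbours. Each MB is reduced to one number (for example its mean luminance), and the function names `checkerboard_fmo` and `conceal_lost_group` are hypothetical.

```python
def checkerboard_fmo(rows, cols):
    """Assign each macro-block (MB) to slice group 0 or 1 in a checkerboard pattern."""
    return [[(r + c) % 2 for c in range(cols)] for r in range(rows)]


def conceal_lost_group(frame, fmo_map, lost_group):
    """Replace every MB of the lost slice group by the mean of its received 4-neighbours.

    `frame` holds one value per MB (an illustrative simplification).
    """
    rows, cols = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for r in range(rows):
        for c in range(cols):
            if fmo_map[r][c] != lost_group:
                continue  # this MB was received correctly
            neighbours = [
                frame[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
                and fmo_map[nr][nc] != lost_group
            ]
            if neighbours:  # empty only for a degenerate 1x1 grid
                out[r][c] = sum(neighbours) / len(neighbours)
    return out


frame = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
fmo_map = checkerboard_fmo(3, 3)
concealed = conceal_lost_group(frame, fmo_map, lost_group=1)
```

With the checkerboard map, every horizontal and vertical neighbour of a lost MB belongs to the other slice group, so losing one group still leaves received data around each missing MB. A multi-mode scheme would presumably choose among spatial, temporal, and inter-view candidates; only the spatial case is sketched here.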

In this paper, the secure, timely, fast, and reliable transmission of Wireless Capsule Endoscopy (WCE) images containing abnormalities to physicians is considered. The proposed algorithm applies image preprocessing followed by edge detection using the Fisher Transform (FT) and morphological operations to extract features. A binary classifier, the Linear Support Vector Machine (LSVM), is implemented to classify the WCE images, after which specific frames are transmitted to the physician according to the channel condition gain. It is therefore necessary to recover the lost MBs at the decoder side using suitable post-processing schemes such as error concealment (EC). In this paper, we propose an adaptive multi-mode EC (AMMEC) algorithm at the decoder, based on the pre-processing flexible macro-block ordering error resilience (FMO-ER) technique at the encoder, to efficiently conceal the erroneous MBs of intra- and inter-coded frames of 3D video. Experimental simulation results show that the proposed FMO-ER/AMMEC schemes can significantly improve the objective and subjective 3D video quality.

Text superimposed on video frames provides supplemental but important information for video indexing and retrieval. The detection and recognition of text in video is thus a crucial issue in automated content-based indexing of visual information in video archives. Text of interest is not limited to static text; it may scroll in a linear motion, so that only part of the text is available in any given frame of the video. The problem is further complicated if the video is corrupted with noise. An algorithm is proposed to detect, classify, and segment both static and simple linearly moving text against a complex noisy background. The extracted text is further processed using averaging to achieve a quality suitable for text recognition by commercial optical character recognition (OCR) software.
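The averaging step mentioned above can be illustrated with a small Python sketch. The source does not specify the exact averaging scheme, so this assumes the simplest case: the same text region has already been segmented and aligned across several frames, and the frames are averaged pixel by pixel so that zero-mean noise tends to cancel. The function name `average_frames` and the toy data are hypothetical.

```python
def average_frames(frames):
    """Pixel-wise mean of equally sized, already aligned grayscale text regions."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[r][c] for f in frames) / n for c in range(cols)]
        for r in range(rows)
    ]


# Toy example: one clean 2x2 text patch observed in four frames,
# each shifted by a different noise offset (the offsets sum to zero).
clean = [[0, 255],
         [255, 0]]
offsets = [10, -10, 5, -5]
noisy_frames = [[[p + d for p in row] for row in clean] for d in offsets]

restored = average_frames(noisy_frames)
```

For scrolling text, the regions would first have to be shifted back along the estimated linear motion so that corresponding pixels line up; that registration step is outside this sketch.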

We have developed a system with multiple pan-tilt cameras for capturing high-resolution videos of a moving person. This system controls the cameras so that each camera captures the best view of the person (i.e. one of the body parts such as the head, torso, and limbs) based on criteria for camera-work optimization. To achieve this optimization in real time, time-consuming pre-processes, which provide useful clues for the optimization, are performed in a training stage. Specifically, a target performance (e.g. a dance) is captured to acquire the configuration of the body parts at each frame. In the real capture stage, the system compares an online-reconstructed shape with those in the training data for fast retrieval of the configuration of the body parts. The retrieved configuration is used by an efficient scheme for optimizing the camera work. Experimental results show the camera work optimized in accordance with the given criteria. High-resolution 3D videos produced by the proposed system are also shown as a typical use of high-resolution videos.

 


Author(s): Walid El-Shafai
