Research Area

2.5  Video coding

  We are researching video coding techniques to transmit full-featured 8K SHV and to realize SHV terrestrial broadcasting.


Performance improvement of 8K/120-Hz HEVC encoder

  We developed an encoder that compresses 8K video with a frame frequency of 119.88-Hz (hereafter simplified to 120-Hz) using the High Efficiency Video Coding (HEVC) scheme and exhibited it at the NHK STRL Open House 2018. In FY 2018, we improved its video encoding control method to achieve better image quality and added a TS over IP interface (SMPTE ST 2022-2) to support higher bit rates than before. The addition of the TS over IP interface raised the maximum bit rate to 480 Mbps and also enabled the transmission of high-quality 8K/120-Hz video through diverse channels (Figure 2-12).
  For the encoder, we employed a quasi 2-pass encoding technology, which encodes a down-converted 4K/59.94-Hz (hereafter simplified to 60-Hz) video ahead of the corresponding 8K/120-Hz video and controls the 8K encoding using the compression results of the 4K video. This technology uses an encoding control method that allocates bits appropriately for the 8K/120-Hz video by detecting areas of a video frame, such as high-complexity parts and slice boundaries, where coding degradation is conspicuous. We evaluated the improvement achieved by this method using the peak signal-to-noise ratio (PSNR). The results demonstrated an improvement in image quality at a bit rate of around 100 Mbps, including a gain of about 2.5 dB for one test image(1). This research was conducted in cooperation with FUJITSU LABORATORIES LTD.
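  For reference, the PSNR figure used in this evaluation can be computed as in the following minimal sketch. It assumes 8-bit samples (peak value 255) held in flat Python lists; the function names and the sample format are illustrative assumptions, not details of the encoder itself.

```python
import math


def psnr(reference, decoded, max_value=255):
    """Return the PSNR in dB between two equally sized sample sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)
```

  Under this definition, a PSNR gain of 2.5 dB corresponds to reducing the mean squared error by a factor of about 1.78.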


Evaluation of backward-converted 120-Hz video

  ARIB Standard STD-B32, which includes the specifications of the video coding scheme for digital broadcasting, employs a stream structure that supports both 120-Hz and 60-Hz video. When a 120-Hz broadcast is provided, a receiver supporting only 60-Hz decodes and displays video obtained by frame sub-sampling the 120-Hz video (backward compatibility). We conducted subjective evaluation experiments with non-specialist viewers to assess the image quality degradation due to the stroboscopic effect caused by frame sub-sampling. The results showed no significant degradation in image quality and even suggested that the subjective quality of sub-sampled fast-moving images may be better than that of images produced by a 60-Hz system(2).
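  The frame sub-sampling seen by a 60-Hz receiver can be illustrated with the short sketch below. It assumes that the 60-Hz video consists of every other frame of the 120-Hz sequence; the object speed of 8 pixels per 120-Hz frame is a hypothetical value chosen only for illustration, and the sketch does not model the actual stream structure of the standard.

```python
def subsample_to_60hz(frames_120hz):
    """Keep every other frame of a 120-Hz sequence (frames 0, 2, 4, ...)."""
    return list(frames_120hz[::2])


# Hypothetical example: horizontal position (in pixels) of an object that
# moves 8 pixels per 120-Hz frame.
positions_120hz = [8 * i for i in range(8)]

displayed = subsample_to_60hz(positions_120hz)

# Displacement between consecutively displayed frames on the 60-Hz receiver.
steps = [b - a for a, b in zip(displayed, displayed[1:])]
```

  The displacement between displayed frames doubles after sub-sampling, while each frame keeps the shorter capture interval of the 120-Hz source. This is the condition under which a stroboscopic effect may become visible.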



Figure 2-12. 8K/120-Hz HEVC encoder

Development and standardization of 8K file format

  We are researching 8K file-based recording technology to enable file-based exchange of 8K content within broadcast stations. In FY 2018, we studied the coding parameters of a file format used for play-out, program exchange and archiving and developed a prototype HEVC decoder for verification (Figure 2-13).
  To study the coding parameters, we conducted coding experiments in which encoding and decoding were repeated three times in sequence, measured the PSNR for various combinations of bit rate and GOP (Group of Pictures) length, and verified the subjective image quality for each combination.
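  The structure of such a multi-generation experiment can be sketched as follows. The codec is passed in as a callable, and `toy_codec_factory` is a placeholder quantizer that only stands in for a real HEVC encode/decode cycle. The bit rates and GOP lengths are hypothetical values, not the parameters examined in the study.

```python
import math
from itertools import product


def psnr(reference, decoded, max_value=255):
    """PSNR in dB between two equally sized sample sequences."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)


def multi_generation_psnr(source, codec, generations=3):
    """Encode and decode repeatedly; return the PSNR against the source after each pass."""
    current = list(source)
    results = []
    for _ in range(generations):
        current = codec(current)  # one encode/decode cycle
        results.append(psnr(source, current))
    return results


def sweep(source, make_codec, bit_rates, gop_lengths, generations=3):
    """Evaluate every (bit rate, GOP length) combination."""
    return {
        (rate, gop): multi_generation_psnr(source, make_codec(rate, gop), generations)
        for rate, gop in product(bit_rates, gop_lengths)
    }


def toy_codec_factory(bit_rate_mbps, gop_length):
    """Placeholder only: coarser quantization at lower bit rates; GOP length is ignored."""
    step = max(1, 400 // bit_rate_mbps)
    return lambda samples: [min(255, round(s / step) * step) for s in samples]


source = list(range(0, 256, 5))
results = sweep(source, toy_codec_factory, bit_rates=[100, 200], gop_lengths=[1, 15])
```

  In a real experiment, the placeholder would be replaced by calls to an actual encoder and decoder, and the per-generation PSNR values would be reviewed together with the subjective image quality.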
  For the standardization of the file format, we participated in the newly launched ARIB JTG (Joint Task Group) on 4K/8K file format and formulated the requirements.


Development and standardization of next-generation video coding technologies

  We are developing high-efficiency video coding technologies for next-generation terrestrial TV broadcasting. As coding tools for intra prediction, we developed a method for improving intra prediction by controlling the coding order, a method for increasing the prediction accuracy by changing the filter applied to prediction signals according to the distance from the reference signals when generating prediction samples, and a method for controlling the transform adaptively according to the prediction mode of chroma samples(3)(4). As coding tools for inter prediction, we developed a method for smooth interpolation using the neighboring motion vectors and a method for extrapolation from a certain direction. We also developed a deblocking filter that changes the filter strength according to the luminance level and a method for changing the transform when generating prediction images using both intra prediction and inter prediction(5). We proposed these technologies as coding tools for Versatile Video Coding (VVC), a next-generation video coding scheme for which standardization efforts began at the JVET (Joint Video Experts Team), an international standardization working group formed jointly by ITU-T and ISO/IEC. The proposed method for improving the deblocking filter was adopted in a working draft (JVET-L0414). We also contributed to the development of common test conditions for technical evaluation in the standardization efforts and formulated the conditions for HDR video coding(6).
  Additionally, we helped the JCT-VC international standardization working group prepare guidelines for the combinations of practical video formats and interfaces, which are industrially required for HEVC codec development(7).



Figure 2-13. Appearance of prototype decoder

Development of coding techniques using machine learning and super-resolution reconstruction, and of an image quality assessment method

  We studied the use of machine learning to speed up the intra prediction mode decision in video coding. As an alternative to the conventional method, which decides the prediction mode using rate-distortion optimization with high computational complexity, we investigated a method that uses a convolutional neural network taking as inputs the pixels around the coding unit and the intra prediction modes applied to the neighboring blocks. Also, to reduce the computational load of the neural networks, we used multiple types of neural networks with a small number of parameters, switching among them in multiple stages according to the frequency of the intra prediction mode. We confirmed that this method can reduce the amount of computation while suppressing the deterioration in coding efficiency. This research was conducted in cooperation with Meiji University.
  We developed an inter prediction technology that uses super-resolution and blurring (Figure 2-14). Conventional video coding methods perform inter prediction by comparing locally decoded images of past frames with the present frame. However, in images captured while swivelling the camera (camera panning), the resolution tends to be higher when the image is still and lower when it contains large motion, owing to the charge storage effect of the camera sensor. These resolution variations between frames cause a resolution difference between the reference signals and the signals to be coded in inter prediction, reducing prediction efficiency. Focusing on this phenomenon, we applied a registration super-resolution process between wavelet multi-scale components and a blurring process using wavelet decomposition. By using image signals processed with super-resolution reconstruction and blurring as prediction reference candidates in addition to the conventional prediction reference signals, we demonstrated that higher-quality coded images can be achieved for moving images containing camera panning and local motion of objects in the frame(8).
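  The idea of adding processed pictures to the set of reference candidates can be illustrated with the simplified sketch below. It works on one-dimensional signals, uses a 3-tap low-pass filter as a stand-in for the wavelet-based blurring, and uses unsharp masking as a stand-in for the registration super-resolution. These substitutions, and the selection by sum of absolute differences (SAD), are assumptions made for illustration and are not the method of reference (8).

```python
def blur(signal):
    """3-tap [1, 2, 1]/4 low-pass filter with edge replication (stand-in for blurring)."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [
        (padded[i] + 2 * padded[i + 1] + padded[i + 2]) / 4
        for i in range(len(signal))
    ]


def sharpen(signal, amount=1.0):
    """Unsharp masking (stand-in only, not the wavelet-based super-resolution)."""
    low = blur(signal)
    return [s + amount * (s - l) for s, l in zip(signal, low)]


def sad(a, b):
    """Sum of absolute differences between two signals."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_reference(target, decoded_reference):
    """Pick the reference candidate that predicts the target with the lowest SAD."""
    candidates = {
        "conventional": list(decoded_reference),
        "blurred": blur(decoded_reference),
        "sharpened": sharpen(decoded_reference),
    }
    name = min(candidates, key=lambda k: sad(target, candidates[k]))
    return name, candidates[name]


# A sharp reference and a target that lost resolution (e.g. during panning).
reference = [0, 0, 0, 100, 100, 100, 0, 0]
target = blur(reference)
choice, _ = best_reference(target, reference)
```

  When the target is softer than the decoded reference, the blurred candidate gives the smaller prediction error. When the target is sharper, a resolution-enhanced candidate would be preferred.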
  We studied objective quality metrics suited for HDR image coding using the Hybrid Log-Gamma (HLG) method. Previous studies of objective quality metrics for HDR image coding targeted methods that use a gamma curve based on the human visual system, such as the perceptual quantizer (PQ) method. The HLG gamma curve has properties that differ considerably from those of PQ and other methods. Using compressed HLG images, we therefore investigated correlations between the results of subjective evaluation experiments and the values derived from various objective quality metrics. The results showed that objective quality metrics using the HLG curve performed best and that some objective metrics considered to perform well in previous studies are not suited for HLG image coding(9). This research was conducted in cooperation with Universitat Pompeu Fabra.
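  As an illustration of a metric that uses the HLG curve, the sketch below applies the HLG opto-electronic transfer function (OETF) of Recommendation ITU-R BT.2100 to normalized linear-light values and then computes the PSNR on the resulting signal. Computing PSNR in the HLG signal domain is only one plausible form of such a metric and is an assumption here. The metrics actually compared are described in reference (9).

```python
import math

# HLG OETF constants from Recommendation ITU-R BT.2100.
A = 0.17883277
B = 1 - 4 * A                  # 0.28466892
C = 0.5 - A * math.log(4 * A)  # 0.55991073


def hlg_oetf(e):
    """Map normalized scene linear light e in [0, 1] to the HLG signal in [0, 1]."""
    if not 0.0 <= e <= 1.0:
        raise ValueError("linear light must be normalized to [0, 1]")
    if e <= 1 / 12:
        return math.sqrt(3 * e)          # square-root segment for dark values
    return A * math.log(12 * e - B) + C  # logarithmic segment for bright values


def psnr_hlg(linear_reference, linear_decoded):
    """PSNR in dB computed on HLG-coded values (peak signal value 1.0)."""
    ref = [hlg_oetf(v) for v in linear_reference]
    dec = [hlg_oetf(v) for v in linear_decoded]
    mse = sum((r - d) ** 2 for r, d in zip(ref, dec)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(1.0 / mse)
```

  The two segments of the curve meet at a signal value of 0.5 for a linear input of 1/12, so errors in dark and bright regions are weighted differently from a metric computed on linear light or on PQ-coded values.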



Figure 2-14. Process block diagram of video coding

Pre-coding processor

  We developed a technology for automatically controlling the parameters of a video processor(10) that performs noise reduction and low-pass filtering as a pre-coding process, in order to suppress image breakdown when a moving image prone to coding degradation is input. We implemented the technology in our video processor to improve its performance. Exploiting the fact that the high-frequency-band components obtained by wavelet-packet decomposition of the input video consist of noise and strong edge components, this technology controls the amount of noise reduction and low-pass filtering according to the level of those components. We used the developed equipment in transmission experiments on advanced terrestrial TV broadcasting technology in Tokyo and Nagoya and demonstrated that it was effective in suppressing image breakdown and improving overall broadcast quality. This research was conducted as a government-commissioned project from the Ministry of Internal Affairs and Communications titled "R&D on Advanced Technologies for Terrestrial Television Broadcasting."
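  The control principle can be sketched as follows. For brevity, the sketch uses a single-level Haar split of a one-dimensional signal instead of a full wavelet-packet decomposition, and maps the level of the high-frequency band linearly to a filter strength between 0 and 1. The thresholds `low` and `high`, the linear mapping, and the choice that a higher level leads to stronger filtering are all hypothetical assumptions, not parameters of the actual processor.

```python
import math


def haar_detail(signal):
    """High-frequency (detail) band of a single-level Haar split."""
    if len(signal) < 2 or len(signal) % 2:
        raise ValueError("signal length must be even and at least 2")
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal), 2)
    ]


def high_band_level(signal):
    """Mean absolute value of the high-frequency band."""
    detail = haar_detail(signal)
    return sum(abs(d) for d in detail) / len(detail)


def filter_strength(signal, low=1.0, high=5.0):
    """Map the high-band level to a strength in [0, 1] (hypothetical thresholds)."""
    level = high_band_level(signal)
    return min(1.0, max(0.0, (level - low) / (high - low)))


flat = [50] * 8              # no high-frequency content
busy = [0, 10] * 4           # strong sample-to-sample variation
strength_flat = filter_strength(flat)
strength_busy = filter_strength(busy)
```

  A flat input leaves the filters off, whereas an input with a high level of high-frequency components drives the strength to its maximum.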


 

[References]
(1) K. Chida, X. Lei, S. Iwasaki, Y. Sugito, H. Miyoshi, Y. Uehara, K. Iguchi, K. Kanda: "Development of 8K120Hz Real-time Video Codec," ITE Annual Convention, 22C-1 (2018) (in Japanese)
(2) K. Miura, K. Chida, A. Ichigaya, K. Kanda, Y. Takiguchi and Y. Nishida: "A Study of Backward Compatibility in High Frame Rate Broadcasting," IEICE General Conference, D-11-1 (2019) (in Japanese)
(3) S. Iwamura, S. Nemoto, K. Iguchi, A. Ichigaya, et al.: "Description of SDR and HDR video coding technology proposal by NHK and Sharp," JVET-J0027 (2018)
(4) S. Iwamura, S. Nemoto, A. Ichigaya: "CE6-related: Implicit transform selection for Multi directional LM," JVET-M0480 (2019)
(5) S. Iwamura, S. Nemoto, A. Ichigaya: "CE6-related: Implicit transform selection for Multi-hypothesis inter-intra mode," JVET-M0482 (2019)
(6) A. Segall, E. François, S. Iwamura, D. Rusanovskyy: "JVET common test conditions and evaluation procedures for HDR/WCG video," JVET-L1011 (2018)
(7) Y. Syed, A. Ichigaya, C. Seeger: "Baseband Signalling Specifications Carriage of Usage of Video Signal Coding Types," JCTVC-AH0022 (2019)
(8) Y. Matsuo, A. Ichigaya, K. Kanda: "Super-Resolved and Blurred Decoded Pictures for Improving Coding Efficiency in Inter-Frame Prediction," Proceedings of IEEE ICFSP, pp.120-124 (2018)
(9) Y. Sugito and M. Bertalmío: "Performance Evaluation of Objective Quality Metrics on HLG-Based Image Coding," 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Anaheim, US, pp.96-100 (2018)
(10) Y. Matsuo, K. Iguchi and K. Kanda: "Pre-Processing Equipment of 8K Video Codec with Low-Pass Filtering and Noise Reduction Functions," Proceedings of IEEE ISSPIT, Smart and Sustainable Communities (2018)