NHK Laboratories Note No. 473

Analysis of Overlapped Block Motion Compensation Based on a Statistical Motion Distribution Model

by
Wentao ZHENG, Masahide NAEMURA
(Advanced Audio and Video Coding Division)
ABSTRACT
    Overlapped block motion compensation (OBMC) has been shown to reduce both prediction errors and blocking artifacts compared with conventional non-overlapped block motion compensation (NOBMC). However, there has been no satisfactory theoretical basis that clearly explains why OBMC reduces prediction errors. In this paper, we present a theoretical analysis of OBMC based on a novel statistical motion distribution model. Our analysis proves theoretically that prediction errors increase towards block boundaries and that OBMC has error reduction and error equalization properties, with the errors being reduced more at block boundaries than at block centers. The analytical results are supported by empirical experiments with typical image sequences.

1. Introduction
    Motion compensation (MC) is widely used for removing temporal redundancies in the coding of image sequences. Overlapped block motion compensation (OBMC) has been shown to reduce both prediction errors and blocking artifacts compared with conventional non-overlapped block motion compensation (NOBMC) [1-4], and has been incorporated into the ITU-T H.263 and ISO/IEC MPEG-4 standards [5-6]. Previous work on OBMC techniques includes performance comparisons of various window shapes [2-4] and improvements to motion estimation methods [4,7,8]. Specifically, in [4], an approach that combines an optimized window design technique with optimized motion estimation was presented. However, there is no theoretical basis that clearly explains the mechanism by which OBMC reduces prediction errors.
    Recently, Lee and Kim presented a theoretical approach to the analysis of OBMC based on a 1-D signal model characterized by an AR(1) process and a first-order polynomial motion assumption [9]. Unfortunately, their analysis is by nature restricted to 1-D cases, and the first-order polynomial motion assumption is too restrictive for real image sequences. Further, image noise and motion estimation errors are not taken into account in their analysis. As a consequence, the resulting prediction error model cannot give a satisfactory explanation of empirical results.
    In this paper, we present an alternative approach to the theoretical analysis of OBMC. It is based on a statistical motion distribution model that we previously proposed to explain the space-dependent characteristics of motion-compensated frame differences [10]. We show that prediction errors increase towards block boundaries and that OBMC has error reduction and error equalization properties, with the errors at block boundaries being reduced more than those at block centers. Although the same conclusions were reached in [4,9], our analysis is formulated strictly on a mathematical basis. Compared to [9], our analysis applies to 2-D cases directly, relaxes the first-order polynomial motion restriction, and leads to a more accurate prediction error model.

2. A statistical motion distribution model
    We briefly explain the statistical motion distribution model introduced in [10], which is the basis of our analysis.
    Let denote a pixel in an M × N block and its motion vector be , with Similarly, let (p,q) be another pixel in the block. We showed that can be appropriately modeled with the following probability distribution,


where cH and cV are constants that represent the amount of fluctuation of motion in the horizontal and vertical directions, respectively, and s is the aspect ratio of a pixel.
    This model assumes that the difference in motion at the pixel from that at (p,q) has a zero-mean Gaussian distribution with a variance proportional to the squared distance between these two pixels. In other words, the expected value of the motion at is the same as that at (p,q) but it is expected to fluctuate more as the pixel is farther from (p,q).
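    As a minimal illustration of this assumption, the following Python sketch draws motion values for a 1-D pixel position. The function names, the scalar coordinate x, and the single proportionality constant c (standing in for cH or cV) are assumptions made for this example only; the sketch reflects the verbal description of the model, not the exact form of (1).

```python
import numpy as np

def motion_difference_variance(x, p, c):
    """Variance of v(x) - v(p) under the assumed 1-D form: c * (x - p)**2.

    c is a hypothetical proportionality constant playing the role of cH or cV.
    """
    return c * (x - p) ** 2

def sample_motion(x, p, v_p, c, rng, size=1):
    """Draw motion values at x given the motion v_p at the reference pixel p.

    The difference v(x) - v_p is zero-mean Gaussian, so the expected motion
    at x equals v_p, while the spread grows with the distance |x - p|.
    """
    std = np.sqrt(motion_difference_variance(x, p, c))
    return v_p + std * rng.standard_normal(size)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for x in (0.0, 2.0, 4.0, 8.0):
        s = sample_motion(x, p=0.0, v_p=1.5, c=0.01, rng=rng, size=100000)
        print(f"x={x:4.1f}  mean={s.mean():.3f}  var={s.var():.4f}")
```

    Under these assumptions the sample mean stays near the reference motion while the sample variance grows with the square of the distance from the reference pixel.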
    Based on the model (1), the space-dependent characteristics of motion-compensated frame differences can be explained naturally [10]. When only one motion vector is used for a whole block, block-matching-based motion estimation methods tend to obtain the motion vector of the block center. Letting p=(M-1)/2, q=(N-1)/2, we can derive that the mean and the variance of a prediction error d at are
where denoting the variances of image intensity derivatives. is the variance of noise added to the image intensities. The noise is assumed to be statistically independent of the image intensities. is the variance of the motion vector error which is due to finite precision motion estimation.
    The first term of (3) is proportional to and is caused by motion fluctuation. If the motion is constant over the block (i.e., cH = cV = 0), this term is zero; the more the motion changes, the larger this term will be. It is clear from (3) that prediction errors at block boundaries are larger than those at block centers.
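    The growth of the motion-fluctuation term towards the block boundary can be sketched numerically. The Python example below only assumes that the term scales with the squared distance from the block center (p, q) = ((M-1)/2, (N-1)/2); the weights a_h and a_v are hypothetical stand-ins that lump together cH, cV, s, and the intensity-derivative variances, whose exact combination in (3) is not assumed here.

```python
import numpy as np

def fluctuation_profile(M, N, a_h=1.0, a_v=1.0):
    """Relative size of the motion-fluctuation term over an M x N block.

    Entry [x, y] is a_h * (x - p)**2 + a_v * (y - q)**2 with the reference
    pixel at the block center. a_h and a_v are hypothetical lumped weights.
    """
    p, q = (M - 1) / 2.0, (N - 1) / 2.0
    x = np.arange(M)[:, None]   # horizontal pixel index within the block
    y = np.arange(N)[None, :]   # vertical pixel index within the block
    return a_h * (x - p) ** 2 + a_v * (y - q) ** 2

if __name__ == "__main__":
    g = fluctuation_profile(16, 16)
    print("center pixel:", g[7, 7])
    print("corner pixel:", g[0, 0])
```

    With unit weights and a 16 × 16 block, the term is smallest at the four central pixels and largest at the block corners, and it vanishes everywhere when both weights are zero (constant motion).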

3. Analysis of OBMC based on the statistical motion distribution model
3.1. Modeling OBMC
    For explanatory simplicity, we consider the 1-D case. However, we emphasize that our analysis can be extended to 2-D cases straightforwardly, while the analysis in [9] is by nature restricted to the 1-D case.
    As illustrated in Figure 1, B1 and B2 are two adjacent blocks of size M, and C1 and C2 are their respective centers. w1 and w2 are the shifted versions of a window function , centered at C1 and C2, respectively. is defined on , and for it satisfies the following constraints.



Figure 1. Overlapped block motion compensation.
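
    To make the windowing concrete, the following Python sketch builds one possible raised-cosine window of support 2M and the pair of weights seen by the M pixels lying between two adjacent block centers. The sampling at half-integer offsets and the function names are assumptions for illustration; the text does not specify the exact discrete form of the window.

```python
import numpy as np

def raised_cosine_window(M):
    """One possible discrete raised-cosine window with support 2*M samples.

    Assumed form: w[n] = 0.5 * (1 - cos(pi * (n + 0.5) / M)), n = 0 .. 2M-1.
    It is symmetric about its center and non-negative.
    """
    n = np.arange(2 * M)
    return 0.5 * (1.0 - np.cos(np.pi * (n + 0.5) / M))

def overlap_weights(M):
    """Weights (w1, w2) for the M pixels between block centers C1 and C2.

    w1 is the falling half of the window centered at C1, and w2 is the
    rising half of the same window shifted by one block to C2.
    """
    w = raised_cosine_window(M)
    return w[M:], w[:M]

if __name__ == "__main__":
    w1, w2 = overlap_weights(16)
    print("w1 + w2 equals 1 everywhere:", np.allclose(w1 + w2, 1.0))
    print("w1 near C1 and near C2:", w1[0], w1[-1])
```

    With this choice the two shifted windows sum to one at every pixel, so the overlapped prediction is a convex combination of the two block predictions.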

    We consider the motion compensation of a pixel . Denoting the prediction error at by d1() when using the motion vector of B1 and by d2() when using the motion vector of B2, the OBMC prediction error dobmc() is

where f() is the image intensity and n() is the noise added to f. u1 and u2 represent errors in the motion vectors, introduced to account for the fact that motion vectors are usually estimated to a precision of one pixel or half a pixel. It is reasonable to assume that u1 and u2 are uniformly distributed random variables. u1 and u2 are also assumed to be statistically independent of each other and of the other terms in (6).
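    The weighted-average structure of the OBMC prediction error can be sketched in general terms as follows. The coordinate symbol x and the correlation coefficient \rho_d between the two single-vector errors are notation assumed for this illustration, and the normalization w_1(x) + w_2(x) = 1 is assumed to be one of the window constraints. The second line is the standard variance identity for a weighted sum of two random variables and makes no use of the specific error model.

```latex
% Assumed notation: x is the 1-D pixel coordinate between C1 and C2;
% \rho_d(x) is the correlation coefficient between d_1(x) and d_2(x).
\begin{align}
  d_{\mathrm{obmc}}(x) &= w_1(x)\, d_1(x) + w_2(x)\, d_2(x),
      \qquad w_1(x) + w_2(x) = 1, \\
  \operatorname{Var}\!\left[d_{\mathrm{obmc}}(x)\right]
      &= w_1^2(x) \operatorname{Var}\!\left[d_1(x)\right]
       + w_2^2(x) \operatorname{Var}\!\left[d_2(x)\right] \nonumber \\
      &\quad + 2\, w_1(x)\, w_2(x)\, \rho_d(x)
        \sqrt{\operatorname{Var}\!\left[d_1(x)\right]
              \operatorname{Var}\!\left[d_2(x)\right]} .
\end{align}
```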
    As the special 1-D case of equations (2) and (3), the means and the variances of d1() and d2() can be easily derived.




where is a constant. is the correlation coefficient between , and is the correlation coefficient between


    On the other hand, for pixels , OBMC is performed using the motion vector of B1 and the motion vector of its left neighboring block. Therefore, the following result can be readily obtained,


3.2. Comparison with NOBMC
    Letting w1()=1 and w2()=0, equation (12) reduces to the NOBMC case. If we denote the NOBMC prediction error by dnobmc(), the variance of dnobmc() is


In the following, we compare OBMC and NOBMC based on (12)-(15).


Figure 2. Comparison of the terms due to motion fluctuation.

    First, we compare and , which are due to motion fluctuation. Figure 2 illustrates gobmc() for = -1, -0.5, 0, 0.5, 1 together with gnobmc(). The block size is M=16, and the "Raised Cosine" window function is assumed. The dashed line indicates the NOBMC case. It is obvious that the first term of (12) depends on the correlation functions and does not necessarily increase monotonically with distance from the block center. We also note that the sum of this term over the whole block is smaller than that of NOBMC roughly below the line of =0.5.
    In OBMC, the pixels to be predicted lie between the adjacent block centers. Considering the spatial continuity of motion, it is reasonable to assume that are usually of opposite signs, that is, 0. On the other hand, since are intensities of two points that are spatially very close to each other, their derivatives can usually be assumed to be positively correlated ( > 0). As a consequence, we can conclude that > 0 holds for general image sequences and that the first term of (12) is usually smaller than that of (14).
    Next, we note that the second term , which is due to noise added to image intensities, remains the same for OBMC and NOBMC.
    Finally, we compare the third terms, which are due to finite precision motion vector estimation. Since , it is clear that the term is more reduced in OBMC than in NOBMC.
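    One inequality consistent with this step can be worked out directly from the window normalization. The derivation below is a sketch under stated assumptions: the weights satisfy w_1 + w_2 = 1 with 0 <= w_1, w_2 <= 1, the two motion vector errors are independent, and each contributes a term of the same variance, written here with the hypothetical symbol \sigma_e^2.

```latex
% Assumptions: w_1 + w_2 = 1, 0 <= w_1, w_2 <= 1, independent motion vector
% errors, each contributing variance \sigma_e^2 (hypothetical symbol).
\begin{align}
  w_1^2 + w_2^2 &= (w_1 + w_2)^2 - 2 w_1 w_2 = 1 - 2 w_1 w_2 \le 1, \\
  \underbrace{\left(w_1^2 + w_2^2\right) \sigma_e^2}_{\text{OBMC}}
      &\le \underbrace{\sigma_e^2}_{\text{NOBMC}} .
\end{align}
% Equality holds only where one weight vanishes (at a block center); the
% factor reaches its minimum of 1/2 where w_1 = w_2 = 1/2, which for a
% symmetric window is the block boundary.
```

    Under these assumptions the reduction is largest at the block boundary and disappears at the block center, which matches the equalization behavior described in the text.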
    From the above comparison of (12) and (14), we can conclude that OBMC has the following properties. First, by using a smooth, overlapped window function, the intensity discontinuities at block boundaries are suppressed. Second, the overall motion-compensated frame differences are reduced. Third, the increase in motion-compensated frame differences towards block boundaries is suppressed, which is known as the error equalization property of OBMC [4,9]. The first property reduces blocking artifacts in coded images and thus improves subjective image quality. The second property improves coding efficiency by reducing the bits needed for the coding of motion-compensated frame differences. Since the image coding algorithms adopted in the ITU-T H.263 and ISO/IEC MPEG-4 standards are designed on the assumption of a stationary signal source, the third property matches the input signal to the coder and therefore contributes to coding efficiency.

4. Experimental results
    To validate our analysis, we conducted experiments with four ITU-R standard sequences: flower garden, table tennis, bicycles, and popple. For each sequence, 450 frames at a frame rate of 30 Hz were used. The image size is 352 × 240 pixels (SIF format). The first frames of the four sequences are shown in Figure 3.


Figure 3. The first frames of the test sequences.

    The block size for motion estimation and compensation is 16 × 16 pixels. For motion estimation, half-pixel accuracy was chosen, and the motion vector search range was restricted to [-16,15.5]. Half-pixel values were determined using bilinear interpolation. The sum of squared differences (SSD) was used as the search criterion. For OBMC, the "Raised Cosine" window function was used.
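    A motion search of this kind can be sketched in Python as below. This is an illustrative full search under stated assumptions, not the implementation used in the experiments: the function names, the sign convention of the returned vector (displacement into the reference frame), and the rule of skipping candidates that fall outside the reference frame are all choices made for this example.

```python
import numpy as np

def upsample2_bilinear(frame):
    """Return a (2H-1, 2W-1) grid; in-between samples are bilinear half-pixels."""
    f = np.asarray(frame, dtype=np.float64)
    H, W = f.shape
    up = np.empty((2 * H - 1, 2 * W - 1))
    up[::2, ::2] = f                                   # integer positions
    up[::2, 1::2] = 0.5 * (f[:, :-1] + f[:, 1:])       # horizontal half-pixels
    up[1::2, ::2] = 0.5 * (f[:-1, :] + f[1:, :])       # vertical half-pixels
    up[1::2, 1::2] = 0.25 * (f[:-1, :-1] + f[:-1, 1:] + f[1:, :-1] + f[1:, 1:])
    return up

def block_match_ssd(cur, ref, top, left, block=16, search=(-16.0, 15.5)):
    """Full search in half-pixel steps; returns ((dy, dx), ssd) of the best match.

    (dy, dx) is the displacement, in pixels, from the block position in `cur`
    to the matching position in `ref`. Candidates outside `ref` are skipped.
    """
    up = upsample2_bilinear(ref)
    target = np.asarray(cur, dtype=np.float64)[top:top + block, left:left + block]
    lo, hi = int(round(2 * search[0])), int(round(2 * search[1]))
    best_ssd, best_mv = np.inf, (0.0, 0.0)
    for dy2 in range(lo, hi + 1):              # displacements in half-pixel units
        for dx2 in range(lo, hi + 1):
            y0, x0 = 2 * top + dy2, 2 * left + dx2
            y1, x1 = y0 + 2 * (block - 1), x0 + 2 * (block - 1)
            if y0 < 0 or x0 < 0 or y1 >= up.shape[0] or x1 >= up.shape[1]:
                continue
            cand = up[y0:y1 + 1:2, x0:x1 + 1:2]
            ssd = float(np.sum((target - cand) ** 2))
            if ssd < best_ssd:
                best_ssd, best_mv = ssd, (dy2 / 2.0, dx2 / 2.0)
    return best_mv, best_ssd

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ref = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
    cur = np.roll(ref, shift=(-3, 2), axis=(0, 1))   # cur[y, x] = ref[y+3, x-2]
    print(block_match_ssd(cur, ref, top=24, left=24))
```

    Upsampling the reference frame once keeps the inner loop to a strided slice and a sum, at the cost of roughly four times the memory of the reference frame.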
    For each sequence, all but the boundary blocks were used for calculating the prediction error variance. Boundary blocks were excluded because they are unlikely to be compensated for correctly and are likely to be coded in intra-frame mode. The prediction error variance at ( , ) is defined as the average of the squared prediction errors of the corresponding pixels in all non-boundary blocks of each sequence.
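    The position-wise statistic described above can be computed as in the following Python sketch. It assumes that the prediction-error (residual) frames are already available as 2-D arrays and that "boundary blocks" means the blocks touching the frame border; the function name is hypothetical.

```python
import numpy as np

def positionwise_error_variance(residual_frames, block=16):
    """Average squared prediction error at each in-block pixel position.

    residual_frames: iterable of 2-D arrays (H, W) of prediction errors.
    Blocks touching the frame border are excluded from the average.
    Returns an array of shape (block, block).
    """
    acc = np.zeros((block, block))
    count = 0
    for r in residual_frames:
        r = np.asarray(r, dtype=np.float64)
        nby, nbx = r.shape[0] // block, r.shape[1] // block
        # blocks[by, i, bx, j] == r[by * block + i, bx * block + j]
        blocks = r[:nby * block, :nbx * block].reshape(nby, block, nbx, block)
        inner = blocks[1:-1, :, 1:-1, :]          # drop the border blocks
        acc += np.sum(inner ** 2, axis=(0, 2))    # sum over all inner blocks
        count += inner.shape[0] * inner.shape[2]
    return acc / count

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    frames = [rng.normal(0.0, 2.0, size=(240, 352)) for _ in range(3)]
    v = positionwise_error_variance(frames)
    print(v.shape, float(v.mean()))
```

    For a 352 × 240 frame with 16 × 16 blocks this averages over the 20 × 13 interior blocks of each frame.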
    Figure 4 shows the experimental results for NOBMC. The phenomenon that prediction errors increase toward block boundaries is clearly demonstrated. In [10], we have shown that (3) closely fits the empirical data.
    Figure 5 shows the experimental results for OBMC. A detailed comparison of the prediction error variances of OBMC and NOBMC is given in Table 1, where the center region is defined as the four pixels at the block center and the boundary region refers to a band two pixels wide along the block boundary. It is clear that for all the sequences, the overall prediction errors are reduced in OBMC, with prediction errors at block boundaries being suppressed more than those at block centers. These empirical results are consistent with our theoretical analysis in Section 3.
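    The two regions used in the comparison can be expressed as index masks over a per-position variance map. The Python sketch below assumes an even block size and a map indexed by in-block pixel position; the function name and the band parameter are hypothetical.

```python
import numpy as np

def region_means(var_map, band=2):
    """Mean variance over the center region and over the boundary region.

    Center region: the four pixels at the block center (even block size assumed).
    Boundary region: a band `band` pixels wide along the block boundary.
    Returns (center_mean, boundary_mean).
    """
    v = np.asarray(var_map, dtype=np.float64)
    M, N = v.shape
    center = v[M // 2 - 1:M // 2 + 1, N // 2 - 1:N // 2 + 1]
    boundary = np.ones((M, N), dtype=bool)
    boundary[band:M - band, band:N - band] = False   # clear the interior
    return float(center.mean()), float(v[boundary].mean())

if __name__ == "__main__":
    # Synthetic map that grows with squared distance from the block center.
    idx = np.arange(16) - 7.5
    demo = idx[:, None] ** 2 + idx[None, :] ** 2
    print(region_means(demo))
```

    For a 16 × 16 block the boundary region defined this way contains 112 pixels and the center region contains 4.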


Figure 4. Variances of prediction errors of NOBMC.


Figure 5. Variances of prediction errors of OBMC.

Table 1. Comparison of averaged variances of prediction errors.

5. Conclusions
    In this paper, we presented a theoretical approach to the analysis of OBMC based on a statistical motion distribution model. We theoretically proved that prediction errors increase towards block boundaries and that OBMC has error reduction and error equalization properties, with the errors at block boundaries being reduced more than those at block centers. Our analysis is formulated strictly on a mathematical basis and applies to 2-D cases directly. The analysis gives new insight into the characteristics of OBMC and provides an accurate prediction error model that will benefit the design of an optimal coding algorithm.



References

  1. H. Watanabe and S. Singhal, "Windowed motion compensation," in SPIE Visual Comm. Image Processing, pp. 582-589, Nov. 1991.
  2. C. Auyeung, J. Kosmach, M. Orchard, and T. Kalafatis, "Overlapped block motion compensation," in SPIE Visual Comm. Image Processing, pp. 561-571, Nov. 1992.
  3. J. Katto, J. Ohki, S. Nogaki, and M. Ohta, "A wavelet codec with overlapped motion compensation for very low bit-rate environment," IEEE Trans. Circuits and Systems for Video Tech., vol. 4, no. 3, pp. 328-338, June 1994.
  4. M. T. Orchard and G. J. Sullivan, "Overlapped block motion compensation: An estimation-theoretic approach," IEEE Trans. Image Processing, vol. 3, no. 5, pp. 693-699, Sept. 1994.
  5. ITU-T Recommendation H.263, "Video coding for low bitrate communication," Feb. 1998.
  6. Moving Picture Experts Group (ISO/IEC JTC 1/SC 29/WG 11), "Information technology - Coding of audio-visual objects - Part 2: Visual (ISO/IEC 14496-2)," 1999.
  7. T. Kuo and C. J. Kuo, "Complexity reduction for overlapped block motion compensation," in SPIE Visual Comm. Image Processing, pp. 303-314, Feb. 1997.
  8. R. Rajagopalan, E. Feig, and M. T. Orchard, "Motion optimization of ordered blocks for overlapped block motion compensation," IEEE Trans. Circuits and Systems for Video Tech., vol. 8, no. 2, pp. 119-123, Apr. 1998.
  9. S. Lee and J. Kim, "Analysis on prediction efficiency of overlapped block motion compensation," IEICE Trans. Commun., vol. E82-B, no. 7, pp. 1069-1072, July 1999.
  10. W. Zheng, Y. Kanatsugu, S. Itoh, and Y. Tanaka, "Analysis of space-dependent characteristics of motion-compensated frame differences," in ICIP 2000 Proceedings, vol. III, pp. 158-161, Sept. 2000.
Wentao Zheng received his B.S. and M.S. degrees in electronics engineering from Tsinghua University, Beijing, China, in 1989 and 1991, respectively, and his Ph.D. degree in electronics engineering from the University of Tokyo in 1994. He then worked as a research fellow with the Telecommunications Advancement Organization of Japan (TAO). From 1995 to 1997, he held a National Institute Post Doctoral Fellowship from the Research Development Corporation of Japan (JDRC), working with NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories on stereoscopic image coding. He joined NHK Science and Technical Research Laboratories in 1997. His research interests include image processing, stereoscopic image and video coding with emphasis on motion and disparity compensation techniques, object-based representation of images and video, and multimedia broadcasting systems. He is a member of IEICE Japan and ITE Japan.
Masahide Naemura received his M.S. degree in electrical engineering from Kyoto University and joined NHK (Japan Broadcasting Corporation) in 1984. Since 1989, he has been working at NHK Science and Technical Research Laboratories. He was involved in MUSE signal processing and is currently engaged in research on image databases and video-object linked data broadcasting. He is a member of ITE Japan and IEICE Japan.


Copyright 2001 NHK (Japan Broadcasting Corporation) All rights reserved. Unauthorized copy of the pages is prohibited.
