Colin Manning: Digital Video Compression

Multidimensional Search Spaces

Block based motion compensation as described in previous chapters does not compensate for changes due to illumination variation, object rotation or motion in directions other than that parallel to the camera plane. Multi-dimensional motion compensation schemes attempt to compensate for these additional factors.

Illumination Variation

One of the assumptions made by block based motion compensation is illumination invariance. The block matching algorithms discussed thus far do not cope well with significant variations in lighting. This is because the luminance values of each pixel in the target block and in the candidate blocks are used as a measure of the match. If for example an object moved from a brightly lit area of the scene into the shadows then it would be difficult for the BMA to match it.

A candidate block (a) and two virtual candidate blocks generated by adjusting the illumination. One block (b) is darker than the real candidate and the other lighter (c)

Illumination variation can be modelled reasonably well however by adding some (positive or negative) value to the luminance component of each pixel. In such a compression scheme the receiver would be informed not only of the position of the matching block in the past frame, but also given a value to add to the luminance component in order to compensate for the illumination variation.

Employing such a scheme has the effect of adding an extra dimension to the search space. This makes the block matching process more complex because in addition to examining blocks around the target block, the block matching algorithm must also evaluate the distortion function for hypothetical candidate blocks with different illuminations. If for example the search space consisted of a maximum displacement of ±9 pixels in both the x and y directions, and an illumination variation of ±9 then the search area would contain ((2*9)+1)^3 candidate blocks.

Scaling blocks

Another failing of block based motion compensation is its inability to account for objects moving in a direction perpendicular to the camera plane (i.e. towards or away from the camera). If an object is approaching the camera head-on then it will appear to grow larger between frames. The techniques discussed in previous chapters do not compensate for this kind of motion.

The motion can be compensated for by comparing the target block with candidate blocks of different sizes. At the receiving end the matching block can be extracted from the past frame and re-sized in order to fit into the standard sized area it replaces in the current frame. For example, a hypothetical candidate block of 8x8 pixels might be generated by taking an area in the past frame of size 12x12 pixels and shrinking it. As with illumination variation, allowing the size of blocks to grow or shrink adds another dimension to the search space.

Adding yet another dimension would allow blocks to shrink and grow non-uniformly. Block behaviour of this sort could be the result of object rotation in the direction perpendicular to the camera plane.

Rotating blocks

Even if object motion is restricted to the plane parallel to the camera plane block based motion compensation, as described in the section on interframe compression, cannot account for object rotation. Such motion can be quite common.

Allowing block matching algorithms to consider rotated blocks in the past frame as candidates for a match can solve this problem. The position of the block and the extent of its rotation would have to be transmitted to the receiver so that it could extract it from the past frame and use it as a substitute for a block from the current frame. Allowing rotations of this nature adds another dimension to the search space, although limits could of course be placed on the number of possible rotations that may be considered.
Frost & Theaker investigated matching object rotation [Fros90] as did Gerken [Gerk89], who found that allowing for rotation improved the quality of the matches.

Skewing Blocks

Scaling and rotation preserve the rectangular shape of the block, but is possible to have blocks of any quadrilateral shape. In such a scheme the transmitter would be informed of the positions in the past frame of the four corners of the block. The transmitter would then be required to reshape the block so that it can be used as a substitute for a standard size block from the current frame.

Kim & Kim called this elastic block matching and their methods considered blocks of any quadrilateral shape. They found that blocks of this kind resulted in higher quality images than conventional block matching methods [Kim92].

Universal Block Matching

Fuh & Maragos experimented with techniques that accounted for block rotation, scaling, and intensity variation [Fuh91]. They restricted their experiments to square blocks however. Papadopoulos & Clarkson constructed a hardware coder based on similar principles which was able to compensate for zooming, rotation and uneven stretching [Papa92].

It is possible to account for block motion, illumination, and scaling in a single coding scheme. With each additional variable, however, the number of candidate blocks is multiplied and this can result in very complex algorithms. There is reason to suspect however that assumptions such as the principle of locality of quadrant monotonic data model could be applied equally well to these dimensions. The quadrant monotonic assumption is appropriate for the illumination variation dimension, for example.

Sub-optimal universal block matching is an interesting topic for further research. As well as adaptations of existing sub-optimal matching algorithms, the order in which the variables are minimised must be addressed. Fuh & Maragos suggested minimising the scaling and intensity variables first and then solving the others [Fuh91]. It is reasonable that one should compensate for the most common kinds of motion first and then minimise distortion in the other dimensions.

[return to Advanced BMAs]