Key considerations when selecting a video compression algorithm: Part 1

With the emergence of H.264 and other types of compression, and the growing abundance of sales and marketing claims, it is increasingly difficult to determine what type of compression is right for your application. To make some sense of all the competing claims, you first need to understand the basics of how compression works.

There are two basic types of compression: frame-by-frame compression and temporal compression. Frame-by-frame compression takes a full picture for each frame, compresses it, and sends the pictures one after another in a stream. The most popular frame-by-frame compression is Motion JPEG (MJPEG). It is widely used because it compresses every frame as a complete picture and so delivers consistently high image quality, is very simple to decode, and is non-proprietary. MJPEG's Achilles' heel, however, is that achieving high image quality on every video frame produces relatively large files, which means it requires more bandwidth to transmit and more storage to record.

Temporal compression algorithms were developed to stream video using less bandwidth. In many typical surveillance applications, they achieve this by trading away some image quality, along with the consistency of bandwidth and storage requirements. In simple terms, temporal compression works like this: whereas MJPEG will take 50 "full" pictures to make 5 seconds of 10 frame-per-second (fps) video (5 seconds x 10 fps = 50), a temporal compression algorithm will take 10 or fewer "full" pictures, usually referred to as "key frames", to reproduce the same 5 seconds of 10 fps video. It does this by using mathematical calculations to predict what may change between the key frames, and then sending only what is changing instead of sending the "full" picture. If there is little or no motion between frames, the stream consists of little more than the key frames themselves, which in the example above means 10 full pictures instead of 50, or 80 percent fewer full pictures than MJPEG.

The efficiency of temporal compression comes with a caveat: bandwidth and storage requirements can be highly variable depending on the scene recorded. To illustrate this, think about a camera that is panning, tilting, or zooming. The entire scene changes with each frame, forcing the temporal compression to work overtime to interpret what is happening and then calculate what to send. The result is a large increase in the bandwidth and storage required, with only marginal video quality to show for it.
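The arithmetic and the caveat above can be sketched in a few lines of Python. This is a toy illustration only, not how H.264 or any real codec works: real temporal codecs use motion-compensated prediction, whereas this sketch simply sends the pixels that differ from the previous frame. The function names, the four-pixel "frames", and the two sample scenes are all hypothetical and chosen only to make the numbers easy to follow.

```python
def full_picture_counts(seconds, fps, key_frames):
    """Return (full pictures sent frame-by-frame, key frames sent, fraction saved)."""
    total = seconds * fps
    saved = (total - key_frames) / total
    return total, key_frames, saved


def encode_deltas(frames):
    """Toy temporal encoder: send the first frame whole, then only changed pixels."""
    encoded = [("key", list(frames[0]))]
    previous = frames[0]
    for frame in frames[1:]:
        # Keep (position, new value) for every pixel that differs from the last frame.
        changes = [(i, new) for i, (old, new) in enumerate(zip(previous, frame)) if old != new]
        encoded.append(("delta", changes))
        previous = frame
    return encoded


def values_sent(encoded):
    """Count pixel values transmitted, as a rough stand-in for bandwidth."""
    return sum(len(payload) for _, payload in encoded)


# Hypothetical scenes: five frames of four pixels each.
static_scene = [[1, 2, 3, 4]] * 5                              # nothing moves
panning_scene = [[i + n for i in range(4)] for n in range(5)]  # every pixel changes

total, keys, saved = full_picture_counts(seconds=5, fps=10, key_frames=10)
print(f"{total} full pictures frame-by-frame, {keys} key frames, {saved:.0%} fewer")
print("static scene, values sent:", values_sent(encode_deltas(static_scene)))
print("panning scene, values sent:", values_sent(encode_deltas(panning_scene)))
```

In this toy model the static scene sends only the 4 values of its key frame, while the panning scene sends all 20 values, the same as sending five full frames. That mirrors the point above: the savings depend heavily on how much of the scene changes between frames.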
