Channel: Cadence Blogs
Are These Codecs Any Good? Netflix Tests Them

A codec compresses data for transmission. The first codec I had any close encounter with was the full-rate speech coder used in GSM, which compressed voice down to 13 kb/s. For comparison, wired phones carried uncompressed voice at 56 kb/s (US) or 64 kb/s (most of the rest of the world).

Codecs

By modern standards, the full-rate codec is poor, both in the quality of the speech (music-on-hold was dreadful) and in its compression. However, in the early 1990s, the silicon of the era (around 0.35um, or 350nm) could deliver only about 1 MMAC, one million multiply-accumulates per second. The GSM standard also specified a half-rate codec, which compressed the audio down to 5.6 kb/s. This roughly doubled the number of calls that could fit into a given bandwidth, at the cost of further deterioration in audio quality. When there was silence, the codecs transmitted nothing, which didn't save any bandwidth, since the unused TDMA slots were not available to anyone else, but did save power. Because users find pure silence disconcerting on a call, the receiver would generate "comfort noise" when nothing was being transmitted, so that the listener didn't think the call had dropped.

Even if that's all you know about codecs, you can already see some of the tradeoffs. Using a lot of compute power at the transmitter and the receiver can result in greater compression and thus more efficient use of the transmission channel. But you can't demand more compute power than can be delivered in the era you are designing for. If you go too far, it will be impossible to build a codec that works at all; even if it is feasible, it is likely to take up too much area on the chip and consume too much power, draining the battery. The growth of video on the internet, and the increasing proportion of internet access that is over mobile networks, has made optimizing bandwidth ever more important.
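To put those speech bitrates in perspective, here is a back-of-the-envelope sketch (my own arithmetic, not from the article) of the compression ratios relative to uncompressed 64 kb/s wired telephony:

```python
# Compression ratios for the GSM speech codec rates quoted above,
# relative to uncompressed 64 kb/s PCM wired telephony.
PCM_RATE = 64_000    # uncompressed wired telephony, bits/s
FULL_RATE = 13_000   # GSM full-rate speech codec, bits/s
HALF_RATE = 5_600    # GSM half-rate speech codec, bits/s

for name, rate in [("full-rate", FULL_RATE), ("half-rate", HALF_RATE)]:
    print(f"{name}: {PCM_RATE / rate:.1f}x smaller than 64 kb/s PCM")
# full-rate: 4.9x smaller than 64 kb/s PCM
# half-rate: 11.4x smaller than 64 kb/s PCM
```

Even the "poor" full-rate codec is nearly a 5x reduction; the half-rate codec squeezes the same call into less than a tenth of the wired bitrate, which is why it roughly doubled the calls per carrier.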
Of course, the application processors in smartphones have gotten faster, so over time the optimal tradeoff between compute power and bandwidth shifts. The workhorse video codec since around 2003 has been H.264 (pronounced aitch-dot-two-six-four). Just to be confusing, it is also called MPEG-4 Part 10, or AVC (advanced video coding). The H.264 standard was followed by H.265, also known as MPEG-H Part 2 and HEVC (high-efficiency video coding). H.265 requires only about half the bandwidth of H.264 to encode the same video stream. Google/YouTube also developed a new compression format called VP9, which likewise uses about half the bandwidth of legacy standards. The leading open-source encoder for H.264 is called x264. That codebase was used to create x265 for H.265. Google open-sourced its VP9 encoder, libvpx.

Netflix

One company that compresses and transmits a lot of video is Netflix. I've read that at peak periods in the evening, as much as one third of total internet bandwidth is consumed by people watching Netflix. So Netflix has a strong interest in efficient video compression, since it would like people to watch twice as much (or to have twice as many subscribers) without bringing the internet to its knees. Since more and more of that video is being watched on mobile devices, good use of bandwidth is even more important. Published compression comparisons have tended to be based on the theoretical compression possible with each standard, using a limited set of test videos. Netflix is also different from something like a video call on Skype. Netflix encodes once, transmits the data to us all, and we decode it millions of times. Almost any amount of computation on the encoding side that reduces bandwidth is worthwhile. That is not the case for a video call that is encoded once and decoded once. Netflix decided to run some tests to see whether these codecs worked well in the real world, on its huge catalog of movies and TV shows.
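The encode-once/decode-many argument can be sketched with a few lines of arithmetic. This is my own illustrative model, not anything from Netflix; the costs are arbitrary compute units:

```python
# Hedged sketch of the encode-once / decode-many tradeoff described above.
# Costs are in arbitrary compute units; all numbers are illustrative.

def total_compute(encode_cost: float, decode_cost: float, views: int) -> float:
    """Lifetime compute for one video: encoded once, decoded once per view."""
    return encode_cost + decode_cost * views

FAST_ENCODE, SLOW_ENCODE = 1.0, 10.0  # e.g. a fast vs. a very slow encoder preset
DECODE = 1.0

# Two-party video call: one encode, one decode. A 10x slower encoder
# dominates the total compute -- a bad trade for a small bandwidth saving.
call_fast = total_compute(FAST_ENCODE, DECODE, views=1)        # 2.0
call_slow = total_compute(SLOW_ENCODE, DECODE, views=1)        # 11.0

# Streaming: millions of decodes. The extra encode effort is noise,
# so any bandwidth it saves is nearly free.
stream_slow = total_compute(SLOW_ENCODE, DECODE, views=5_000_000)  # 5000010.0

print(call_fast, call_slow, stream_slow)
```

For the call, the slow encoder multiplies total compute by more than 5x; for the stream, it adds about two parts per million, which is why Netflix can afford to throw almost unlimited compute at each encode.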
It takes a lot of compute power to do this comprehensively. As Netflix put it: "We sampled 5000 12-second clips from our catalog, covering a wide range of genres and signal characteristics. With three codecs, two configurations, three resolutions (480p, 720p, and 1080p) and eight quality levels per configuration-resolution pair, we generated more than 200 million encoded frames."

You may or may not know that Netflix runs entirely on Amazon's AWS, where Netflix has reserved servers. When these were not needed to serve customers watching video, Netflix dynamically reassigned them to run the tests, completing them in just a few weeks. As an aside, this is a good example of co-opetition: Netflix is totally dependent on AWS for running its business, but also competes with Amazon in the streaming video market.

Conclusions

Netflix's conclusions, in a snapshot:

- x265 and libvpx demonstrate superior compression performance compared to x264, with bitrate savings reaching up to 50%, especially at the higher resolutions.
- x265 outperforms libvpx for almost all resolutions and quality metrics, but the performance gap narrows (or even reverses) at 1080p.

The received wisdom about virtual reality (VR) is that it will require two streams of 4K video to look "real" enough. This will make efficient video compression even more important. Netflix is on it, and is extending these results to 4K video. I am sure that new video compression techniques will evolve for VR too: since the two video streams are different views of the same scene, there is a lot of redundancy, not just between successive frames (which current codecs take advantage of) but between the two frames presented at the same moment to the two eyes.

If you want to know more, watch Jan's presentation at the SPIE Digital Image Processing Conference. I'd tell you Jan's full name, but our blogging platform censors it; you can watch the video to find out why.
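As a postscript, the "more than 200 million encoded frames" figure from the Netflix quote above checks out with simple arithmetic. The frame rate is my assumption (the article doesn't state one; 24 fps is typical for film content):

```python
# Sanity check on the scale Netflix quotes. My own arithmetic;
# the 24 fps frame rate is an assumption, not stated in the article.
clips = 5000
codecs = 3
configurations = 2
resolutions = 3        # 480p, 720p, 1080p
quality_levels = 8     # per configuration-resolution pair
clip_seconds = 12
fps = 24               # assumed typical frame rate

encodes = clips * codecs * configurations * resolutions * quality_levels
frames = encodes * clip_seconds * fps
print(encodes)  # 720000 separate encodes
print(frames)   # 207360000 -- "more than 200 million encoded frames"
```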