I did not attend the 30-minute talk on "Customizable Processor Basics" given at the Design Automation Conference by Chris Rowen, Tensilica founder and now Cadence fellow. That was not for lack of interest; the Cadence Theater was so packed that I couldn't get close to it.
This talk was, in short, an excellent introduction to the technology behind Tensilica, which was purchased by Cadence in April 2013. Fortunately for those of us who missed it, an audio recording of the talk with slides is available at the Cadence DAC web site. A brief summary follows.
Quick background: Tensilica pioneered the development of highly specialized, programmable dataplane IP that works synergistically with industry-standard embedded CPU architectures. These customizable IP cores, called Xtensa Dataplane Processing Units (DPUs), support a wide range of embedded data and signal processing applications. Tensilica also developed pre-customized, application-specific DPU subsystems for high-volume applications such as mobile wireless, network infrastructure, auto infotainment, and home applications.
A New Way of Thinking About SoCs
At the beginning of the talk, Rowen challenged the audience to "change the way we think" about processing in a system on chip (SoC), and to realize that there are a number of high-performance and low-energy computations that can be handled in a new and better way. To provide a better solution, it's important to understand the distinction between what's going on in the control plane and what's going on in the dataplane.
While the control plane is more visible, Rowen said, the bulk of the computing may be taking place in the dataplane. That's where you find mission-critical, high-throughput, data-intensive functions such as image processing, audio sampling, network packet processing, and baseband operations. Historically these functions have been handled in "hardwired" ways. What is occurring now is "an increasing need to think of the dataplane as being programmable."
The dataplane is not just for simple signal-processing functions, Rowen cautioned; it may also include major pieces of deeply embedded control logic. He noted that the software load on a DPU is "significant," perhaps millions of lines of code. This requires mature software-development tools that can handle profiling and analysis.
Because they are customizable, Rowen said, DPUs and their associated software tools can be specifically optimized to the task at hand. If they so desire, chip architects can specify the custom instructions that are best for their application. "What you get is a category of processor which, depending on the application, can be 10 to 100 times more efficient than the traditional, general-purpose, one-size-fits-all DSP, and can complement the CPU by being 10 to 100 times more energy-efficient for the tasks that are happening in the dataplane," Rowen said.
In brief, he said, dataplane processors allow the design team to "offload the CPU and evolve the DSP into something significantly more efficient, while remaining in the comfort zone of software development."
Extending with Xtensa
Rowen identified Xtensa as the "name of our baseline instruction-set architecture as well as the tool environment that allows us to generate all these different flavors of processors." He said about 5,000 variations of the Tensilica architecture are in production and that processors range from 10,000 gates to a couple million gates. He discussed several IP product lines built on top of the Xtensa foundation, including DPU solutions for hi-fi audio, baseband, image and video processing, and more.
The greatest flexibility, however, happens when customers develop their own instruction-set architecture extensions. A chip architect or application expert can describe instruction-set features, "hit the button, and within a matter of minutes take delivery of an optimized processor hardware description and a complete software environment," Rowen said. The deliverables include what you would expect for any modern commercial processor - libraries, operating systems, compilers, debuggers, simulators, and RTL.
That RTL, by the way, is formally proven and comes with an "absolute guarantee" that it will exactly match its specification. "It's really possible to create datapath elements that are strikingly similar to hardwired RTL elements," Rowen said. And, he noted, you can try many alternative architectures and get real-time feedback on power and performance characteristics.
Want to learn more? The audio recording is available at the Cadence DAC web site (scroll down to the Tuesday 2:30 pm presentation on Customizable Processor Basics). And you don't have to stand in line at a crowded theater.
Richard Goering
Related Blog Posts
Why Cadence Agreed to Buy Tensilica - And How it Can Change SoC Design
Q&A: Tensilica Founder Chris Rowen - Perspectives from an IP/SoC Pioneer
We Need to Move "Past EDA": Tensilica Founder Rowen