
Optimizing Power with Palladium

At TSMC's OIP Ecosystem Symposium, Cadence's Frank Schirrmeister presented on Software-Driven Optimization for Performance, Power, and Thermal Tradeoffs. In this case, "software" doesn't mean Cadence EDA tools but rather the target software that will eventually run on the SoC once it is fabricated. Not only does this allow realistic loads to be run, it also exercises some of the major power reduction techniques, such as power and clock gating, that are typically invoked under software control. These approaches do not just affect dynamic power, they affect static power, too: a block that is powered down does not dissipate leakage power either.

The challenge is that this may require hundreds of thousands to billions of instructions, which makes emulation the only feasible approach. This is not hyperbole: running a 100MHz chip for 10 seconds requires a billion cycles. The Palladium Z1 (or earlier models) runs real-world scenarios of this scale and identifies power issues early. (If you don't know much about the Palladium Z1, then see my earlier post Palladium Z1: an Enterprise Server Farm in a Rack.) Simulation is still useful, since it gives greater visibility than emulation, but you need to already know which window you want to examine.

Power comes in a number of different flavors:

- Peak power, which can occur over as little as a single vector. This affects the power delivery network and can cause noise and other problems.
- Long-term average power, which is one component of battery life for things like mobile and IoT.
- Short-term average power, which can have thermal effects in a way that a very short sequence will not. A single high-power vector will not heat up the chip.
- Leakage power, which is the major component of power during standby modes. This is very important in IoT, where devices spend a lot of their time doing nothing.

Different modes, as blocks turn on and off, affect all these values, too.

The biggest challenge is usually doing dynamic power analysis (DPA) under various realistic scenarios. These typically drive the architectural decisions and have the biggest impact on the final chip power and thermal effects. William Hale Thompson, Chicago's mayor (trivia fact: he was the last Republican mayor of the city), supposedly said, "vote early and often." Well, power analysis is like that, too. Do it early and often, so you don't get surprised when it is too late to do anything about it.

Deep Cycles

SoC power analysis requires deep cycles, often measured in millions or even billions, over and above the vectors that are required for functional verification. One problem with SoCs in general is that they require the operating system to be booted, but booting the operating system is not representative of power dissipation in normal use. You have to get through it, though, to reach the "normal use" scenarios.

Power analysis gets more accurate the later you are in the design cycle. But, as with most things, the biggest impact comes from making changes early. Changes at the architectural level are orders of magnitude bigger than anything that, for example, a power-aware synthesis tool can do. But early in the design cycle the design is incomplete, so estimates need to be made with things like native toggle count, or weighted toggle count, which takes account of the cells being toggled. This coarse-grained power simulation isn't going to give you high accuracy, but it can identify areas for further investigation and it can flag early architectural issues.
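To make the native-versus-weighted distinction concrete, here is a minimal sketch in Python. Everything in it is illustrative: the net names, toggle counts, and energy weights are invented for the example, and this is not the format any Cadence tool actually emits.

```python
# Coarse-grained power estimation from emulation activity data.
# Illustrative only: net names, counts, and weights are invented.

# Toggle counts per net over the analysis window (as might be dumped
# from an emulation run), and the window length in cycles.
toggle_counts = {
    "cpu/alu/result[0]": 420_000,
    "cpu/alu/result[1]": 310_000,
    "ddr/phy/dq[0]":      95_000,
}
window_cycles = 1_000_000

# Native toggle count: every toggle counts the same.
native = sum(toggle_counts.values())

# Weighted toggle count: scale each net by the switching energy of the
# cells it drives (invented weights, in arbitrary energy units).
weights = {
    "cpu/alu/result[0]": 1.0,   # small standard cell
    "cpu/alu/result[1]": 1.0,
    "ddr/phy/dq[0]":     25.0,  # a big I/O driver costs far more per toggle
}
weighted = sum(toggle_counts[net] * weights[net] for net in toggle_counts)

print(f"native toggles per cycle:    {native / window_cycles:.3f}")
print(f"weighted activity per cycle: {weighted / window_cycles:.3f}")
```

Even in this toy, the point of the weighting is visible: the relatively quiet I/O net dominates the weighted estimate once per-cell switching energy is taken into account, which a raw toggle count would completely miss.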
The big advantage is that this can be done early enough in the design cycle, when decisions have not all been finalized in a way that would be very expensive to change. Note that the emulation and the dynamic power analysis can be done separately: first the design is emulated to capture the data, and then the offline dynamic power analysis engine can be used to generate the toggle histograms. The next level up in accuracy is to generate detailed power reports using RTL and gate-level simulation/emulation and the Joules power estimation tool (or other power estimation tools). Again, the emulation and the dynamic power analysis can be done separately.

Flow

The details of the flow depend on exactly what stage the design has reached and what type of analysis is being done, but the diagram above gives a flavor of a flow doing gate-level analysis with Joules based on TCF (toggle count format). In the first phase, the emulation is run to generate the raw data. Then the TCF data is generated offline (offline in the sense that it is not tying up the emulator). Finally, the power analysis is done using Joules.

Using these flows, Texas Instruments faced the challenge of delivering an application processor with optimal performance while keeping power under 2W. They detected some unexpected power peaks and were able to fix the design to lower its power consumption. Power estimates matched actual silicon measurements with 96% accuracy.
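To connect this back to the power flavors listed earlier, here is a minimal sketch of how peak, short-term average, and long-term average power differ on the same activity. It is illustrative only: the per-cycle trace is synthetic, and the window size is an arbitrary choice for the example, not something a particular tool prescribes.

```python
# Distinguishing the power "flavors" on a per-cycle power trace.
# Illustrative only: the trace is synthetic; in a real flow it would
# come from dynamic power analysis over emulation data.
import random

random.seed(0)
CYCLES = 100_000
WINDOW = 1_000  # cycles per short-term averaging window (arbitrary)

# Synthetic per-cycle power (watts): mostly idle, occasional bursts.
trace = [0.2 + (1.8 if random.random() < 0.01 else 0.0)
         for _ in range(CYCLES)]

peak = max(trace)                    # stresses the power delivery network
long_term_avg = sum(trace) / CYCLES  # drives battery life

# Short-term averages over successive windows drive thermal behavior:
# a single high-power cycle won't heat the chip, but a hot window will.
window_avgs = [sum(trace[i:i + WINDOW]) / WINDOW
               for i in range(0, CYCLES, WINDOW)]
hottest_window = max(window_avgs)

print(f"peak power:           {peak:.2f} W")
print(f"long-term average:    {long_term_avg:.3f} W")
print(f"hottest window (avg): {hottest_window:.3f} W")
```

The same trace yields three very different numbers, which is exactly why a single "power" figure is never enough: the peak matters to the power delivery network, the hottest window matters to thermal design, and the long-term average matters to the battery.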

Previous: What Is Automotive Tool Confidence Level 1?
