Scott Barric of Microsemi is one of the people who have been using the pre-release version of Pegasus, the new physical verification solution that Cadence announced at CDNLive Silicon Valley. See Pegasus Flies to the Clouds for more details. Microsemi has been using the previous Cadence physical verification solution, PVS (which stands for...yes...physical verification solution), for signoff since 2012, in technologies down to 16nm. The explosion in the number and complexity of design rules doesn't affect them by making the rule decks harder and harder to write, since they don't develop their own rules—they use rule decks certified by the foundries. But it does make physical verification slower and slower with each process node. Of course, their chips are not getting any smaller, and their schedules are not getting any more relaxed. The result is that existing DRC solutions are not scalable:

- Poor scalability of current tools means jobs need to be scheduled in advance
- Last-minute ECOs, which require a full-chip DRC, are very costly
- Good scalability is needed to run DRC more frequently during the design cycle, without raising costs

Some non-obvious issues with the architecture of current DRC tools such as PVS are:

- Multi-CPU scalability falls off at around 64 cores, meaning that, for 28nm, overnight tapeout DRC is impossible
- A huge master processor with four to eight cores is required to distribute the rest of the jobs, and having done that, this expensive resource sits largely idle throughout most of the run
- A job cannot start until all the CPU resources it requires are available, meaning that often N-1 cores sit idle waiting for the Nth to finally free up

Scott used two test chips, which he called A and B. Both are 28nm; A is 100mm² and B is 600mm² (remember, B is the Big one). The graphs below show the baseline (PVS, not Pegasus) runs. The bottom line is that for SoC A, everything ran overnight with 96 cores. SoC B couldn't run overnight no matter how much resource was used.

With Pegasus, the numbers were much better. The details are in the graphs below, which repeat the above graphs and show the improvement. SoC A was down to "go to lunch, chip-level run is done," and SoC B reached the goal of running overnight. Pegasus was as much as 10X faster than PVS, and was still scaling all the way up to 300 cores.

That was good, but Scott decided to get even more "real world." As tapeout approaches, their farm gets busy and it is hard to find a bunch of 32-core slots. So he forced Pegasus to use the smaller servers with only 4 or 8 CPUs. There was some difference between the high-end CPUs and the mainstream ones, but it was small, only 12%.

Another experiment was to see how Pegasus ramped. He launched a run asking for 500 cores (which he wouldn't get). But as CPUs became available, Pegasus distributed out more work, peaking at around 200 CPUs and averaging 126 for the whole job. Pegasus begins as soon as the first worker process secures a CPU, and then ramps as more become available (and I mean as LSF secures more CPUs; this is all LSF-friendly).

Putting all these factors together means that Pegasus gets much faster results while putting a lot less demand on the compute farm. It is not getting better results by doing a better job of vacuuming up more of the farm; it is using fewer resources but still delivering results in designers' favorite run times (over lunch, overnight).
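A quick aside on why "falls off at around 64 cores" is such a damning number. Any serial bottleneck, like that big master processor doing all the job distribution, caps the achievable speedup per Amdahl's law. Here is a generic back-of-the-envelope sketch; the 98% parallel fraction is a number I made up for illustration, not a measured PVS or Pegasus characteristic:

```python
# Generic Amdahl's-law sketch (not Pegasus's or PVS's actual scaling model):
# if a fraction p of the run parallelizes perfectly and the rest is serial,
# the speedup on n cores is 1 / ((1 - p) + p / n).
def speedup(n, p=0.98):  # p = 0.98 is an assumed, made-up parallel fraction
    return 1.0 / ((1.0 - p) + p / n)

for n in (8, 64, 128, 300):
    print(f"{n:3d} cores -> {speedup(n):5.1f}x")

# With 98% parallel work, going from 64 cores (~28x) to 300 cores (~43x)
# means 4.7x the hardware buys only about 1.5x more speed. That is the
# fall-off in a nutshell; scaling past 64 cores requires shrinking the
# serial fraction, not just adding machines.
```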
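The ramp-up experiment is also easy to picture with a toy model. The sketch below compares the old start-only-when-all-cores-are-free model with a Pegasus-style elastic start; the availability curve and job size are invented for illustration and are not Microsemi's numbers:

```python
# Toy comparison of the two scheduling models: "static" waits until all
# requested cores are simultaneously free; "elastic" (the Pegasus-style
# behavior described above) starts on whatever is free and ramps up.
# All numbers here are made up for illustration.

def free_cores(t):
    """Hypothetical farm availability: cores free at time t (in hours)."""
    return min(50 + 25 * t, 200)  # 50 cores now, 25 more per hour, cap 200

WORK = 1000.0  # total job size in core-hours (assumed)

# Elastic: consume whatever is free in each small time step.
t, done, dt = 0.0, 0.0, 0.01
while done < WORK:
    done += free_cores(t) * dt
    t += dt
print(f"elastic: done in {t:.1f} h, averaging {WORK / t:.0f} cores")

# Static: queue until all 200 cores are free at once, then run flat out.
wait = (200 - 50) / 25  # hours before 200 cores are simultaneously free
print(f"static:  done in {wait + WORK / 200:.1f} h ({wait:.0f} h of it queued)")
```

In this toy example the elastic job finishes in about 7 hours averaging under 140 cores, while the static job spends 6 hours just queuing. The shape, if not the numbers, matches what Scott saw: ask for a lot, start immediately, and let the average take care of itself.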
So far, these runs were all on a clean, tapeout-ready database. What about a dirty database? He had just the chip for the job, one with a lot of errors: four million violations, due to bad placement leading to overlapping cells. PVS ran for eight days and didn't even finish; it had to be killed. Pegasus finished in eight hours. Another point that may be obvious but is worth emphasizing: the results were 100% accurate. Pegasus is not running fast by cutting corners and getting the wrong answer.

The next thing to look at was how hard it was to migrate from PVS to Pegasus. Scott had to confess that getting a Cadence AE onsite with a desk and a laptop took two days. Getting Pegasus up and running took just one. It uses the same rule decks and reads the same input.

One of the other features of Pegasus is that it can use clouds, not just internal server farms. I won't reiterate the reasons this is attractive, nor rehash the security worries that tend to scare management. Scott was still in the middle of doing these experiments, so he didn't have graphs yet, but so far it has been showing scalability on "hundreds of CPUs." As with many cloud offerings, this opens up the possibility of what I might call PVaaS, physical verification as a service. In effect, Scott said, instead of the fixed cost of server farms, a DRC run would cost a certain amount per turn, meaning that design managers would have to decide how many runs they were willing to pay for. Pegasus can also run in a hybrid mode with, say, 100 internal CPUs and 200 external ones, but he hasn't tried that yet.

Someone in the audience asked about the elephant in the room: PVS/Pegasus decks are not the first ones released by the foundries. Will that change? Cadence answered the question, saying that they will now work with the foundries since they can prove they can deliver. "We didn't have a compelling enough story before."

Other questions were on the future roadmap. Now that DRC is so great, what about LVS, coloring, and other verification tools? Watch this space, coming later this year.

Another question was on IP security going to the cloud. Scott said that ITAR data doesn't leave and is all handled internally on their server farms. A Cadence engineering manager explained that when using the cloud, the GDSII and the rule deck never leave directly, even in encrypted form. A machine reads them inside the firewall and sends only an internal memory representation to the cloud (the sketch at the end of this post shows the general pattern). Scott also pointed out that their internal cloud is outsourced, so it probably runs on the same computers anyway (but don't tell his pointy-haired bosses).

The final question was a commercial one. When you go out to the cloud and have hundreds of CPUs running Pegasus, how is licensing handled? Cadence answered that one, too. Above 64 cores, you can get a gigascale license that is a flat fee for any number of cores. "We don't count any more. There is no penalty on the business side."
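As promised above, here is roughly what that cloud data flow looks like. This is my own minimal sketch of the pattern, with a hypothetical parse_gdsii() and a plain socket standing in for whatever Cadence actually uses; the only point it makes is that the derived in-memory model, not the GDSII file or the rule deck, is what crosses the firewall:

```python
# Minimal sketch of the "only the in-memory representation leaves" pattern,
# as described in the Q&A above. Not Cadence's actual implementation;
# parse_gdsii() and the wire format are assumptions for illustration.
import pickle
import socket

def parse_gdsii(path):
    """Hypothetical reader: GDSII file -> internal geometry model.
    Runs on a machine inside the firewall; the file itself never leaves."""
    raise NotImplementedError  # stand-in for the real parser

def ship_model(model, host, port):
    """Send only the derived in-memory representation to a cloud worker."""
    payload = pickle.dumps(model)  # stand-in for a real wire format
    with socket.create_connection((host, port)) as sock:
        sock.sendall(len(payload).to_bytes(8, "big"))  # length prefix
        sock.sendall(payload)
```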