Quantcast
Channel: Cadence Blogs
Viewing all articles
Browse latest Browse all 6721

CCIX Update: TSMC, Xilinx, Cadence, Arm...and Jasper

$
0
0
CCIX (pronounced see-six) is the Cache Coherent Interconnect for Accelerators. I wrote about it in my post CCIX is Pronounced C6 and also when Cadence announced its collaboration with TSMC, Arm and Xilinx in Xilinx/Arm/Cadence/TSMC Announce World's First 7nm CCIX Silicon Demonstrator . There have recently been a couple of updates. First, at Arm TechCon on the foundry afternoon, TSMC, Cadence, Xilinx and Arm all gave some background on their collaboration. Then, at the Jasper User Group (JUG) held recently, Cadence's IP group talked about using formal approaches to verify their implementation of CCIX IP that is inside the demonstrator chip. TSMC TSMC's Tom Quan showed the above master roadmap, that lists all the TSMC processes and to which end markets they are applicable. The mainline process is the blue column up the middle for logic. Red means that the process is available, and green means that it is in development. The title of the session was focused on 7nm, and Tom told us that they have 30 customer engagements at 7nm, with 15 tapeouts expected this year. Risk production started in Q2 2017. But unlike more recent processes, there are not just a couple of early adopter companies, they have many engagements on 7nm, which is their latest node. The design requirements are what you would expect: full coloring, cut metal, reliability-aware simulation, variation, thermal aware EM. For HPC there are also H-tree and clock mesh clock-trees, interconnect with via pillars and non-default routing, and pillars pushed up higher in the design flow since synthesis needs to be aware of them. For TSMC, 7nm is finished and going into volume production. But there is a next generation 7nm+ coming. This is an EUV-enabled shrink, with an improved transistor, and higher standard cell density. As you can see from the above image, it will be available about a year after 7nm, which would mean Q2 2018 risk production. The CCIX Demonstrator Tom, as well as Millind Mittal of Xilinx, Ravi Mahatme of Arm, and Nick Heator of Cadence, all talked about the CCIX demonstrator, which required everyone working together to create. Arm and Xilinx talked about the architecture of CCIX. Today, Intel owns datacenter processors and they use products from the programmable logic division (fka Altera) to create offload processing capability for things like Microsoft's Bing search. Microsoft has said that they ship an Intel FPGA on every new server. It is no secret that Arm is trying to get into the datacenter business too, and Xilinx is their FPGA partner. But there are non-FPGA offload processors too. CCIX is an independent standard for connectivity between server processors, memories, and various offload processors. Microsoft have also said that they plan a big percentage of their future datacenters to be Arm-based, so CCIX could play a role there. Watch this space. Cadence's Nick went last, and talked about how CCIX is being integrated into the PCIe gen4 IP core (the basic underlying protocol that CCIX uses for signal transmission—no need to reinvent the wheel). His last slide summed up the collaboration between all four companies on this 7nm demonstrator chip: Validating CCIX IP At JUG, Anish Mathew of Cadence's IP group (and so presenting at JUG from a customer point of view, despite his Cadence badge) presented Coverage-Driven Formal Verification Signoff on a CCIX Design . He admitted that IPG hadn't drunk the formal Kool Aid and considered it just a complement to simulation, good for small control blocks but not for signoff or bug-hunting. They wanted to use formal to validate a real project and picked CCIX since it was starting around the same time and cache-coherent protocols are known to be very tricky to verify. Indeed, Vigyan Singhal said, the day after Anish presented, that "nobody is getting cache-coherent interconnect out without formal." Simulation is just not enough. The goal of the project was to achieve the same vPlan verification plan, with 100% coverage, as simulation. They knew one big challenge would be deep cover items. For this they would require what has become known as semi-formal coverage (sounds like wearing a tuxedo jacket with jeans). Formal can go a certain depth into a design under autopilot. But two other approaches are important in many designs: state-swarm and guidepointing. State-swarm is somewhat akin to constrained random in simulation, and guidepointing is more manual, but both approaches use multiple short/narrow formal searches to build a long path. State-swarm contributes to coverage but there is not much control. Guidepointing is creating a specific sequence of covers to hit a specific coverage target with full control (so more like directed test in simulation). They used formal in three different ways. Classic formal was the most efficient for finding bugs. Coverage was used to get to planned scenarios that were required for signoff, and made extensive use of state-swarm to improve coverage, and guidepointing to get to 100%. Third, soak was used to search for deep, unplanned scenarios, running formal with no time limit (run until manually stopped) but with coverconstraints to avoid unproductive search paths. You can see from the above graph how they got coverage up. On the left, classic formal did all the easy stuff, getting coverage up to the mid 90%. Then, the state swarm phase was used to get it up to something around 99%, and finally guidepointing was used to get the remaining items and get coverage up to 100%. The results can be summarized in two numbers: there were 21 exclusive bugs that only one of formal or simulation found. On the one hand, there were 15 bugs that formal found that simulation missed. But there were 6 bugs that simulation found that formal could not (mostly due to better testing of very long scenarios such as long packets or multiple link up/down sequences, 1000+ cycle timeouts). Some of the bugs that formal missed it now finds with better assertions. In a bit more detail: formal does a better job on short scenarios, sim on long ones. State-swarm is key to improve coverage and confidence on long scenarios formal's ability to hit more corner cases has side effects. Some aspects of the design end up being more robust than necessary but sometimes unrealistic scenarios are hit and it is hard to constrain them away difficult to write end-to-end checks in formal, especially when considering error/abort scenarios effort to close coverage is comparable formal is competitive with simulation even on a complex block like CCIX semi-formal should be enhanced further (to work better on areas where today sim does a better job) More Information More information about the CCIX controller IP . More information about JasperGold . I'm sure you can find Xilinx, TSMC, and Arm's websites without me, although you won't find much about CCIX there yet. Sign up for Sunday Brunch, the weekly Breakfast Bytes email.

Viewing all articles
Browse latest Browse all 6721