Microsoft CDNLive Keynote: Cloudy with a Chance of Chips

Traditionally at CDNLive Silicon Valley, the first keynote is given by Lip-Bu Tan, Cadence's CEO. Then an external presenter speaks, usually a customer. And then we announce something new. This year, the something new was Pegasus, the new highly parallel design rule checker (DRC). Pegasus is especially appropriate as a name since it ends in "us" (the digital tools are all about "us") and it is cloud-enabled; you can't get to the clouds unless you fly, so a winged horse fits the bill. I wrote about Pegasus that morning in Pegasus Flies to the Clouds. And watch out for a post coming soon on the CDNLive presentation by Scott Barric of Microsemi on their experience with it on real designs.

Talking of clouds, that middle keynote was by Kushagra Vaid, GM of Microsoft's Azure Infrastructure (Azure being the name of their cloud). That alone would be reason enough to have him give the keynote, but he was also a chip designer at Intel, where he spent 10 years and was a principal engineer responsible for the technology direction of the Xeon microprocessors. "I've used plenty of Cadence tools," he acknowledged.

His talk was titled The Future of Intelligent Computing. He split the future into three:

- Cloud scale computing
- Silicon and system architecture
- Machine learning / AI

Cloud Scale Computing

The scale of Microsoft Azure is impressive. There are over 100 datacenters, available in 140 countries, representing billions of dollars in investment. The datacenters vary from 5,000 to 100,000 servers depending on the location (and I'm assuming these are multi-core, so the number of cores is several times these numbers). The goal is to make it so that every person on the planet has access to essentially unlimited computing when they want it. They are still growing, so they are not there yet, although since computing needs grow every year it is an endless task. There are over 1M production servers and a dark fiber network ("dark fiber" can mean two things: fiber that is laid but waiting to be used, or a geographically wide network that is private). Of course, there is 24x7x365 support. The image above shows the numbers. I looked on the Internet to see if I could find a better version of the picture, and I could. The trouble is the numbers change so fast: the cleanest image had Bing searches at 1B, a factor of 7 less. I suspect it will already be up to 8B by the time this post appears!

Cloud is all about scale, and Microsoft is clearly at scale with these numbers. Kushagra didn't talk about market share, but Microsoft is #2; the leader, by a long way, is Amazon. There is a huge diversity of workloads, from bare metal (running on the hardware with Microsoft providing nothing else), to IaaS (infrastructure as a service), PaaS (platform as a service), and SaaS (software as a service). He talked to a reporter recently who said it was like Noah's ark, with all the species in a common boat, the species being the different customers.

A new development is serverless computing, which is especially good for IoT. Instead of keeping an application running all the time to listen for events that arrive only rarely, code runs only when an event actually happens. It is a new usage model, different from traditional computing or even traditional (can it be a tradition when it is so new?) cloud.
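To make the contrast concrete, here is a minimal sketch in Python. It is illustrative only, not Azure's actual API; handle_event and the event payload are invented for the example.

```python
# Illustrative only: contrasting an always-on service with a serverless handler.

def take_action(event):
    print(f"device {event['id']} reported {event['value']}")

# Traditional model: a process runs 24x7 (occupying a server) just to wait.
def polling_service(queue):
    while True:                  # always on, even when events are rare
        event = queue.get()      # blocks until something finally arrives
        take_action(event)

# Serverless model: the platform invokes this function only when an event
# occurs; nothing runs (or is paid for) in between.
def handle_event(event):         # invented handler signature
    take_action(event)

# Simulate the platform delivering a rare IoT event.
handle_event({"id": "thermostat-7", "value": 21.5})
```

The billing and capacity consequence is the point: the polling version occupies a server all day, while the event-driven version consumes resources only for the instants the handler actually runs.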
Silicon and System Scaling Challenges

There are plenty of debates about whether Moore's Law is slowing down, but from Kushagra's point of view, CPU releases are slowing down and it takes longer and longer to go from process node to process node. In the cloud, another challenge is that workloads are divergent and don't always run effectively on a general-purpose CPU. He said that there is a "Cambrian explosion" of silicon tailored to workloads, purpose-built accelerators like the Google TPU (watch for a post on that in the near future, too).

Microsoft's main strategy is to add FPGAs to their servers. Every server has an FPGA and, as a result, they have built the world's largest FPGA-based distributed computing fabric. They have been doing this for a couple of years, so that probably does mean every server. I don't know what the current speedup is, and Microsoft probably isn't telling, but six years ago using an Altera FPGA more than doubled the speed of Bing searches. Even if that speedup seems relatively small, nothing in the cloud is small. I'm just making numbers up, but if Bing search is 10% of Azure's capacity, that is roughly 100,000 of the 1M production servers, and doubling the speed saves about 50,000 servers and the associated building, power, and cooling infrastructure.

A big change as cloud has grown is that innovations are now available first in the cloud, as opposed to being developed on in-house server farms and then migrated out to the cloud. All the innovation around accelerators and silicon occurs first in the cloud, not just at Microsoft but at their cloud competitors, too. Only later might the innovations trickle down to enterprise servers.

Machine Learning and Artificial Intelligence

Artificial intelligence (AI) is used every day without you really knowing it, for vision, speech recognition, language translation, and knowledge-based search. Image recognition has just passed human skills in identification. A human and a trained CNN model looked at cats and dogs. The humans were good at telling the dogs from the cats, which is pretty easy even for a toddler, but they were not good at telling the breed of dog.

Machine learning is starting to be heavily used. Credit companies use it for fraud detection. Connected cars can learn to drive from your driving patterns. Speech and translation (50 languages into 50 languages) are getting to Star Trek levels. Skype (which is owned by Microsoft, remember) can translate conversations in real time, allowing a conversation between someone speaking English and someone speaking Chinese.

If machine learning is about how you learn from patterns, deep learning is about understanding how our brain is wired and trying to replicate it. That can lead up to higher-level semantics, like learning to appreciate art. A recent analysis of how today's computers compare to human intelligence concluded that we've reached the level of a mouse brain, so there is some way to go. Don't worry about the singularity or our mouse overlords taking over, at least for now.

For years, there has been discussion of using machine learning for EDA. A lot of the early part of design is iteration to explore the design space, and a lot of the late part of design is iteration to fix the final few problems. Lip-Bu hinted, and Anirudh hinted even more strongly during his presentation later, that this is coming. There are a couple of big gains to be had. One is that there is a lot to be gained by learning from one run of the tool to the next, whereas today we throw all that away and start from scratch every time.
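As a toy illustration of that first gain (nothing here is a real Cadence flow; the file name and state contents are invented), a tool could persist what it learned and warm-start the next run from it instead of starting cold:

```python
import json
import os

STATE_FILE = "previous_run_state.json"   # invented name for a run-to-run cache

def load_previous_run():
    """Start from what the last run learned instead of from scratch."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)          # e.g., a seed placement, known hotspots
    return {"seed_placement": None, "known_hotspots": []}

def save_run(state):
    """Persist this run's findings for the next run to warm-start from."""
    with open(STATE_FILE, "w") as f:
        json.dump(state, f)

state = load_previous_run()
# A real tool would warm-start placement/routing from state["seed_placement"]
# and check state["known_hotspots"] first, converging in fewer iterations.
state["known_hotspots"].append("clk_tree_skew_region")   # invented finding
save_run(state)
```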
Another is that there is not a lot of deep knowledge in many of these iterations. For instance: run physical verification, fix the errors, run static timing and DRC again, and repeat. Today that runs overnight and the engineer shows up in the morning. With machine learning, that loop could just iterate automatically several times per night (a toy sketch of such a loop appears at the end of this post). Announcements are starting to appear of EDA products and flows that incorporate some machine learning, if only for a single tool. The moonshot is to put much of the design process together: at some level, such as RTL or maybe SystemC, the design intent has to be expressed, and then a machine-learning EDA suite could implement it, taking the PPA (power, performance, area) constraints as input.

Putting it All Together

When you put these three things together (cloud scale, silicon innovation, machine learning), you get intelligent computing. There will be new business models, and new industries, because we will be able to solve problems that we could not before. For chip design, it could usher in a new golden age by lowering the cost barrier to doing a new design.

A hint of how it could be is already there in the software startup ecosystem. It used to be that a startup had an idea and needed to raise money to build a datacenter to scale out their product and bring it to market. Now you just need a few laptops and a contract with a cloud provider who can scale as much as necessary when needed. No infrastructure is needed to try out new ideas: show a demo, get funding, build a business. This doesn't yet work in the silicon space, where you still need to build infrastructure, buy tools, and so on. But the potential is there, along with the explosion that it could create.
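To close, here is the toy sketch of the overnight iteration loop promised above. Everything in it is invented: the violation names, the fix model, and the tool stand-ins; a real flow would call actual signoff engines and a trained model rather than these stubs.

```python
import random

random.seed(0)   # deterministic toy run

def run_drc_and_sta(design):
    """Stand-in for real signoff tools: returns the current violations."""
    return list(design["open_issues"])

def propose_fixes(violations):
    """Stand-in for the ML part: map each violation to a candidate fix.
    A real flow might rank fixes learned from thousands of earlier runs."""
    return {v: f"patch::{v}" for v in violations}

def apply_fix(design, violation, fix):
    """Apply a fix; in this toy, fixes usually work, occasionally don't."""
    if random.random() < 0.8:
        design["open_issues"].remove(violation)

def overnight_loop(design, max_iterations=5):
    for i in range(1, max_iterations + 1):
        violations = run_drc_and_sta(design)
        if not violations:
            print(f"clean after {i - 1} iteration(s)")
            return
        for violation, fix in propose_fixes(violations).items():
            apply_fix(design, violation, fix)
    print(f"still dirty after {max_iterations} iterations; needs an engineer")

design = {"open_issues": ["drc_spacing_M2", "setup_viol_clk_a", "drc_density_M5"]}
overnight_loop(design)
```

The point of the sketch is the control flow, not the stubs: the check-fix-recheck cycle that today costs one engineer-night per lap could run several laps per night unattended, escalating to a human only when it stalls.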
