A couple of weeks ago, the Silicon Valley chapter of SEMI held a breakfast meeting on Artificial Intelligence at the Edge. There were five presentations:

Kevin Krewell of TIRIAS Research: There Is No One-Size-Fits-All in ML at the Edge
Dave Pellerin of AWS: Connected Devices: Where Cloud Meets Edge
Nick Ni of Xilinx: Inference at the Edge: Hardware Adaptability Is the Key
Gary Brown of the Movidius group of Intel: A New Era of AI Devices at the Edge: Silicon Innovation in IoT
Steve Roddy of Arm: AI/ML at the Edge

TIRIAS

Kevin started off with the really big picture of how technologies tend to start with a very fragmented industry, with rapid innovation and changing standards. Eventually the landscape stabilizes, with common standards, mature technologies, and typically a lot of company-level consolidation. You can look at industries as varied as automobiles, computers, or search engines to see this in action, and it is reasonable to expect we will see the same in machine learning. Currently, we are clearly at the fragmented wild-west stage.

However, machine learning is not so much a market of its own as an enabling technology for other markets such as computer vision, sensor fusion, natural language processing, customized experiences, and more. Indeed, each of those is arguably just an enabling technology for real markets, such as autonomous driving or voice-activated homes.

One big trend, the main topic of the day's breakfast, is that ML is moving to the edge. This is driven by three main things: response time (it takes too long to do everything in the cloud), bandwidth physics (there isn't enough spectrum to send everything to the cloud and back), and security/privacy (people are uneasy about everything they say or do being uploaded into the ether). There are exceptions, but in general algorithms and training are developed in the cloud, then reduced algorithms and models are used for inference at the edge.

Since performance requirements vary by application (avoiding pedestrians in a car vs a thermostat learning usage patterns), there is a huge range of appropriate hardware. In the cloud (primarily for training) we have CPUs, GPUs, TPUs, and custom chips. At the edge (primarily for inference) we have a similar portfolio, with more emphasis on custom accelerators (programmable, but optimized for neural networks). The market is immature, but eventually standard semiconductor products might appear in both the cloud and the edge.

One challenge is that benchmarks are not really representative of anything close to real applications. Standard networks like AlexNet, along with standard test data like ImageNet, allow some level of comparison, but it is unclear whether that comparison is meaningful for something very different, such as understanding spoken English.

The current situation is that there is a lot of ("unprecedented") silicon investment going on in government labs, universities, established semiconductor vendors, semiconductor startups, and IP vendors (for example, every recent new Tensilica processor has some aspect of neural network support built in). There are over 100 startups, many of which will fail and some of which will be acquired. Kevin thinks that there is too much of a shift towards "selling the company" rather than building technology for the long term, which "does not instill confidence."

The above table is just a subset. For an even more extensive atlas of startups, see the Cognite Startup Poster.

Next up, four of the companies in that table: Amazon, Xilinx, Intel, and Arm.
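Before getting to the company presentations, Kevin's benchmarking point is worth making concrete. The sketch below is my own minimal illustration (nothing shown at the breakfast), assuming PyTorch and torchvision are installed: it times inference of a standard AlexNet on an ImageNet-sized input, which is roughly the kind of apples-to-apples number these benchmarks produce, and exactly the kind of number that may or may not transfer to something like speech recognition.

```python
# Minimal latency-benchmark sketch (my illustration, not from the presentations).
# Assumes PyTorch and torchvision; AlexNet is used only because it is the
# standard point of comparison Kevin mentioned.
import time
import torch
import torchvision.models as models

model = models.alexnet()              # random weights are fine for a latency check
model.eval()

x = torch.randn(1, 3, 224, 224)       # one ImageNet-sized image, batch of 1

with torch.no_grad():
    model(x)                          # warm-up run so setup cost isn't measured
    start = time.perf_counter()
    for _ in range(100):
        model(x)
    elapsed = time.perf_counter() - start

print(f"mean CPU inference latency: {1000 * elapsed / 100:.2f} ms per image")
```

A real comparison would also measure accuracy on the ImageNet validation set and repeat the exercise on the target edge hardware, which is where the apples-to-apples comparison starts to break down.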
These presentations covered a lot of the same ground as Kevin's introduction, so I will avoid repeating that and focus on what they said that went beyond the basic technology vector. I think we can all agree that machine learning is a big deal.

Amazon

Digital transformation requires the cloud. As Dave put it: you can't optimize what you can't measure...and you can't measure what you can't connect.

Dave described Amazon: Amazon is a global logistics company. It has software, too (Prime Video, for example), but mostly logistics. Prime Air is the backbone of delivery. Amazon is also a fabless semiconductor company that develops its own chips, and as a result is very vertically integrated. For example, Echo (Alexa) contains custom silicon.

One focus of AWS that cuts across all its solutions is the potential to optimize the semiconductor industry, a complex industry with thousands of suppliers. Further, products like Cadence Cloud offer the potential to accelerate the semiconductor design process.

Amazon uses a lot of machine learning on top of what they call the "data lake", secure storage using S3 with all the associated services to get data in, out, and cataloged. He had a list of areas where Amazon uses this internally:

Demand forecasting
Risk analytics
Supply-chain optimization
Search
Recommendations
Data security
Computer vision
Robotics
Q&A systems
Advertising
Video content analysis
Natural language processing
Alexa, Echo, Dot, Spot
AI services: Rekognition, Lex, Polly, SageMaker

One specific area where AWS can help is bringing modern data architecture to industrial use cases, especially in EDA. Until recently, EDA was "special" in the size of the data it was manipulating. The data is still enormous, of course, but many other applications have huge datasets too, especially in machine learning, where training requires huge data by definition. Services created for handling these large datasets can be leveraged by EDA.

Xilinx

Nick pointed out the big challenge of monetizing AI: huge computational demand, but very restrictive power, performance, and cost envelopes. Naturally, Nick's view is that only hardware/software configurable devices can handle the fast-changing environment...such as the products from Xilinx. We used to call these FPGAs, but increasingly companies like Xilinx insist that their products are more programmable systems. This is certainly true, but no catchy name has come along to replace "FPGA", so for now Xilinx is stuck with it.

One area where Xilinx can claim a lot of expertise is neural network optimization. Just switching from 32-bit floating point to 8-bit fixed point is topping out as a strategy, and various forms of pruning and making networks sparse are the next step (see the sketch after the Intel remarks below). Xilinx acquired DeePhi earlier in the year, a specialist in this area. For background on this, see my post Hot Chips Tutorial: On-Device Inference. Nick had lots of graphs and tables showing just how effective this is. Nick predicted strong growth to $30B for inference, with almost all the future growth coming in inference at the edge (as opposed to in the cloud).

Intel

Next up was Gary of the Movidius group of Intel. By now it was getting hard to say anything that hadn't been said already. He was less bearish on data-center inference than Nick (hey, Intel sells a lot of chips in datacenters) but, as he said: Nick said $30B. I'm not going to tell you our numbers. And these don't even include smartphones and PCs.
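An aside on the quantization and pruning that Nick mentioned: the sketch below is my own illustration in plain numpy (it is not DeePhi's actual tooling or Xilinx's flow). It shows the two basic ideas: mapping float32 weights to 8-bit integers with a single scale factor, and zeroing out the smallest-magnitude weights to make the network sparse.

```python
# Rough sketch of the two compression ideas Nick described (my illustration,
# not DeePhi's or Xilinx's actual flow): 8-bit quantization and magnitude pruning.
import numpy as np

def quantize_int8(w):
    """Map float32 weights onto int8 using one per-tensor scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale          # inference multiplies q by scale (or folds it in)

def prune_by_magnitude(w, sparsity=0.7):
    """Zero out the smallest-magnitude weights so the tensor becomes sparse."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)

w = np.random.randn(256, 256).astype(np.float32)   # stand-in weight matrix
q, scale = quantize_int8(w)
w_sparse = prune_by_magnitude(w, sparsity=0.7)

print("storage: int8", q.nbytes, "bytes vs float32", w.nbytes, "bytes")  # 4x smaller
print("zero weights after pruning:", float(np.mean(w_sparse == 0.0)))    # ~0.7
```

In a real deployment the accuracy loss from both steps has to be measured and usually recovered with fine-tuning, which is the sort of thing DeePhi specialized in.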
Returning to Intel, Gary even had a lot of ideas about just where all this inference would be used, shown in the chart above.

Arm

Last up was Steve. He opened by emphasizing that everyone is talking about AI and ML, and he didn't just mean it figuratively: the chart above shows mentions of "AI" compared to "big data" in earnings calls. I'm not sure whether mentions in earnings calls really constitute a true trend, or whether it is just that every Wall Street analyst expects every company to have an AI strategy.

Like everyone else, Steve acknowledged that training is likely to stay in the cloud. But the cloud and the whole internet can't scale fast enough to make "AI everywhere" workable, so inference has to move to the edge. He had some hypothetical numbers: voice recognition in the cloud at scale would require 17 datacenters of 100K nodes each, so just about feasible, but all the security cameras in a nation would require 42 (per nation), which is not going to happen.

He had three laws for why ML must move to the edge:

Laws of physics: you can't move all the bytes through the internet with the required low latencies.
Laws of economics: consumer platforms are where the economies of scale are (8B smartphones times 8 cores gives 64B high-performance CPUs).
Law of the land: privacy laws are getting stronger year by year, and keeping user data on the user's device eases compliance.

Of course, Arm has a new Neural Processor IP. For more details than Steve gave, see my post Some HOT Deep Learning Processors. He neatly emphasized just how fast all the algorithms are changing: state-of-the-art algorithms will change while your device is in the fab, and for years afterwards!

Summary

I've said it in many previous posts, and this breakfast session emphasized it again in the commonality between all the presenters:

Inference is moving to the edge.
Compression, sparsity, zero optimization, reduced precision: these are all essential to get the power and price down.
Algorithms are changing all the time, so programmability is essential. Balancing efficiency against programmability is key.

Sign up for Sunday Brunch, the weekly Breakfast Bytes email.