I loved the first day of the Embedded Vision Summit, held in Santa Clara today. Focusing specifically on the vision side of Lip-Bu's "machinelearningdeeplearning", the conference offered five tracks, each covering a different aspect of this very new industry. Most conventions and trade shows I have attended have been in well-established industries, which, for me, always feel like an old-boys' club: everyone already knows everyone else, and there isn't much innovation from year to year. And when there is innovation, you need an advanced degree in the discipline to fully grok it. But because this is a new field (as a friend once said to me, we're just scratching the paint on the surface of this technology), everyone is new, and we're only beginning to learn all we can do with this rapidly evolving technology. People aren't afraid at this point to admit that they don't understand how something works, and a few of the brightest and most forward-looking bulbs (does a bulb look forward?) in the bunch are applying what we've learned so far both to develop the field and to monetize the technology (I'm looking at you, Chris Rowen).

Marc Pollefeys, Director of Science at Microsoft and Professor at ETH Zurich, opened the program with his presentation on "3D Computer Vision and Mixed Reality". Let's face it: he was talking about the HoloLens. (It's true, I'm dying to try it.) Despite the misnomer of referencing "holograms" (nothing in this technology is related to true holography), Microsoft's HoloLens really is putting the "vision" in visionary; Pollefeys calls it the "next generation of personal computing." I do remain skeptical (really, will they become as ubiquitous as the mobile phone or the laptop?), but after his presentation, I found myself marveling at the possibilities. From A^2, drones, games, and social media to designing and learning, from medicine to elevator repair, he posits that there won't be a sector untouched by this device, or, one assumes, by a device like it.

(One thought struck me as he talked about gesture recognition: what about applications that interpret sign language? What about conducting? What about demonstrating fine-art skills? I charge you, entrepreneurs out there, get ON that already! We all know that the arts and the ADA are where the money is!)

(Please forgive the quality of the images; they're the best I could get from where I was sitting!)

Seriously, though, will we all be wearing HoloLenses (or something like them) in the future? Will our car windshields, and the sliding glass doors to every patio, be augmented? Will we all wear glasses that let us read the newspaper and, oh by the way, also see Pokémon characters hopping around the edge of our peripheral vision as we make presentations to compatriots halfway around the world? I remain skeptical, but then again, in 1995 I swore I would never be one of those people with a mobile phone, and see how fast I changed on that one. (My son was born in 1999, and I bought my first cell phone a few weeks after that!) Maybe Marc Pollefeys is right. Something will happen to change our world. The question is who will discover this world-changing technology first and leverage it to truly change the world. And chances are, it will involve machine learning with sensors to navigate its way.
The Fundamentals Track

After the keynote, I attended Shehrzad Qureshi's presentation in the "Fundamentals" track, "Demystifying Deep Neural Networks", followed by Sammy Sidhu's "A Shallow Dive into Training Deep Neural Networks." Because I learned what a neural network is only about two months ago, I thought it would be prudent to attend these sessions to make sure that what I have taught myself since then is accurate and, considering that this tech is changing faster than a teenager texting in class, that it is still accurate. And I was pleasantly surprised: the resources I found on my own while writing a white paper on the subject (stay tuned, it's not published yet, but I'll let you in on a secret: it's about the Cadence Tensilica C5 DSP) were mostly the same ones referenced by Shehrzad and Sammy. So yay for me!

(Let me know in the comments if you're interested in reading about the kind of overview that was covered in this track; I don't want to bore you with something you already know. And again, stay tuned for the Tensilica C5 DSP white paper, where I have written a pretty comprehensive overview.)

The Technical Insights Track

But I got a little overconfident, I must admit. After lunch, I attended Alexey Rybakov's talk, "Deep Learning Beyond Cats and Cars: Developing a Real-Life DNN-Based Embedded Vision Product." I tried to follow, but an engineer I am not. I will say that he talked about the paradigm shift from "traditional" software R&D to "AI-driven" software R&D, and about developing a data strategy for acquiring, preparing, and processing all kinds of data.

The Business Insights Track

Having been sufficiently intimidated by the people building these networks, I decided to try out the Business Insights track, starting with Boris Babenko of Orbital Insight, talking about "Using Satellites to Extract Insights on the Ground". The upshot of this talk was that, using satellite imagery, the people at Orbital Insight can make some pretty incredible deductions by combining machine learning algorithms with vision. From tracking literal traffic (cars on the ground) at retail establishments and using that data to make predictions about stock prices, to tracking the crude-oil inventory of countries that aren't terribly forthcoming about how much oil they might or might not be stockpiling, this data is obviously of great interest not only to market analysts but also … oh, I dunno … the military-industrial complex? (This is my personal supposition, but I defy anyone to tell me I'm wrong!) I have no idea how this data is being leveraged around the world (and if I did, I couldn't say so, obviously), but I would bet that this company is well funded by a wide array of horizontal and vertical markets and customer bases. (If you're curious what the car-counting part of this might look like, see the toy sketch at the end of this post.)

Of course the company faces challenges: clouds are a bother (literal clouds, because satellites can't see through them!), cloud computing is really not infinite (no matter how much it may seem so after you've uploaded your 4,000th photo of your daughter's birthday party), and engineers with expertise in the business are hard to find (more on that in the next session). That said, Boris suggested that they are well positioned to address these challenges.

Stay Tuned…

This is becoming too long a post, so I'll wrap it up for now.
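(One quick technical aside before I go, since I promised a toy sketch above: here is a minimal Python illustration of the kind of object-counting step that sits at the heart of the car-tracking idea. To be clear, this is my own sketch, not Orbital Insight's actual pipeline; it assumes an upstream detector has already marked "vehicle" pixels in a binary mask, and the function name and noise threshold are hypothetical.)

    import numpy as np
    from scipy import ndimage

    def count_vehicles(mask, min_pixels=4):
        """Count connected blobs in a binary 'vehicle' mask.

        mask is a 2D boolean array where True marks pixels that an
        upstream detector classified as vehicle; blobs smaller than
        min_pixels are discarded as noise. (Hypothetical helper,
        for illustration only.)
        """
        labeled, num_blobs = ndimage.label(mask)
        if num_blobs == 0:
            return 0
        # Pixel count per blob; label 0 is the background, so skip it.
        sizes = np.bincount(labeled.ravel())[1:]
        return int((sizes >= min_pixels).sum())

    # Toy example: two 2x2 "cars" and one single-pixel speck of noise.
    frame = np.zeros((8, 8), dtype=bool)
    frame[1:3, 1:3] = True   # car 1
    frame[5:7, 4:6] = True   # car 2
    frame[0, 7] = True       # noise
    print(count_vehicles(frame))  # -> 2

(Repeat a count like that over the same parking lot week after week and you get the kind of time series an analyst might correlate with retail performance. Again, that's my supposition of the approach, not Babenko's description of it.)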
I still have a few more sessions from the first day of this Summit to tell you about, and an entire day tomorrow, so expect a lot more about "machinelearningdeeplearning", especially with regard to vision applications, in the coming days… --Meera