The potential for visual analytics is enormous. Gathering image and video data through IP-connected cameras, and processing that data intelligently either at the edge (close to the device) or in the cloud, could complement, or perhaps replace, sensor data in many IoT use cases.
From face and object recognition in public surveillance systems, through monitoring in assisted living solutions and smart parking occupancy detection, to retail footfall analysis: the list really is endless.
The diversity of these use cases is breathtaking, in terms of the characteristics of the data and the requirements of the application. In the smart parking example, it’s really sufficient to determine that there is (or is not) an object of the correct size in the specified location; if there is a big rectangular thing there, then the parking space is occupied.
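To make the parking case concrete, here is a minimal sketch of that "big rectangular thing" test. It assumes some upstream step (background subtraction, say) has already produced a binary mask of the scene; the function name `space_occupied`, the bay coordinates and the 50% fill threshold are all hypothetical choices for illustration, not anything taken from a particular product.

```python
from typing import List, Tuple

# A binary mask of the camera frame: 1 = something is there, 0 = empty background.
Mask = List[List[int]]


def space_occupied(mask: Mask, bay: Tuple[int, int, int, int],
                   min_fill: float = 0.5) -> bool:
    """Return True if enough of the bay rectangle is covered by foreground pixels.

    bay is (top, left, bottom, right) in pixel coordinates, with bottom and
    right exclusive. min_fill is an assumed threshold, not a calibrated value.
    """
    top, left, bottom, right = bay
    area = (bottom - top) * (right - left)
    if area <= 0:
        raise ValueError("bay must have positive area")
    # Count the foreground pixels that fall inside the bay.
    filled = sum(mask[r][c] for r in range(top, bottom) for c in range(left, right))
    return filled / area >= min_fill


# Usage: a 6x8 frame with a car-sized blob sitting in the bay.
frame = [[0] * 8 for _ in range(6)]
for r in range(1, 5):
    for c in range(2, 6):
        frame[r][c] = 1
bay = (1, 2, 5, 6)
occupied = space_occupied(frame, bay)
```

Nothing here needs to know what the object is, only that something of roughly the right size fills the space, which is why this sits at the undemanding end of the range.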
Picking out and recognising a specific face in a crowd of moving people is much more demanding; quality control on a factory production line might be somewhere in the middle. Sometimes the camera needs to move about, sometimes it must be able to pan, sometimes it can stay fixed.
If it weren’t for the fact that we humans derive much of our primary sense data about the physical world from a pair of light-sensing organs, we probably wouldn’t think of all these different applications as belonging to the same ‘vision’ category at all.
This diversity means that creating solutions is best handled by an open-ended, flexible platform with interfaces to other components and to the widest possible universe of developers, rather than by a series of closed vertical systems. So Qualcomm’s announcement on 11 April of its Vision Intelligence Platform seems to fit the bill rather well.
The platform includes two new systems-on-chip (SoCs), the QCS605 and QCS603, with an onboard image signal processor (ISP), the Qualcomm Artificial Intelligence (AI) Engine, an ARM-based multicore CPU, vector processor and GPU. The platform also incorporates Qualcomm Technologies’ advanced camera processing software, machine learning and computer vision SDKs, and connectivity and security technologies.
The announcement builds on Qualcomm’s launch of a Snapdragon-based IP camera reference design (October 2016), which itself incorporated on-board processing capability for machine learning and AI developers.
There are already some customers preparing to develop products based on the platform, including IP video specialist Kedacom and Ricoh. Perhaps more importantly, there is a raft of technology provider partners, including SenseTime (backed by Alibaba with USD1bn, and described last week by Bloomberg as ‘the most valuable AI start-up in the world’), Pilot.ai and MM Solutions. And although it’s a vision intelligence platform, there are high-end audio capabilities, including noise and echo cancellation, on-device audio analytics and processing features for natural language processing, audio speech recognition, and “barge-in” capability, meaning the ability to support a voice interface in noisy environments.
This is in several ways a bold move by Qualcomm. The QCS605 and QCS603 are based on Wi-Fi. As we’ve indicated before, the IoT domain has multiple and diverse requirements for connectivity, so this approach is to be welcomed. And it builds on the company’s rather distinctive platforms approach to the IoT, of combining connectivity and compute power into segment-oriented packages.
It’s bold too, in that others have a much stronger and better established position in the image and video processing space. Hikvision, the video surveillance specialist ultimately owned by the Chinese government, comes to mind as the most obvious contender. Another is Movidius, acquired by Intel in 2016, which similarly offers powerful devices capable of processing at the edge, and which has some impressive agreements in place with Google and Microsoft, among others.
Qualcomm will face a tough climb in establishing itself in this domain and will find itself in a not entirely familiar position of being a challenger rather than a market leader. There is no shame in that; but abandoning the entire sphere of vision analytics to others would be something of a surrender.