Most of that volume is video, and a huge proportion of that video is not something that anyone will ever want to watch. It’s CCTV or other monitoring footage, or road webcams: the kind of video that is captured almost solely to provide a binary piece of information. The street is full, or it’s empty. Someone entered the room, or they didn’t. And yet huge amounts of that video are being transported, processed and stored. It’s expensive, it doesn’t scale, and it creates all sorts of problems for privacy and thus for regulation.
Unsurprisingly, this is a problem that is occupying lots of people and companies. There is a good summary of some of the different approaches being taken in this article, and in the comments below.
But last week I met Michael Tusch, CEO of Apical, who has taken a radically different route to dealing with the video data glut. What if the video could be processed at the edge, near to where it was collected, so that only the significant information was transported and stored, rather than the raw bandwidth-hogging footage itself? And what if that processing was based on embedding, at the lowest possible level, an understanding of how physical objects, especially people, actually move through environments?
Apical is one of those interesting and important companies that no-one has ever heard of. It licenses technology used in cameras and displays, including of course the cameras and displays in smartphones. It sells designs to chipmakers who embed its ideas into silicon, though it also works with the product manufacturers to ensure that the technology does what it’s supposed to. Its technology is now in a billion devices, and the company holds some of the essential IPR in high-dynamic-range (HDR) imaging.
Apical is not a start-up, in that it has been in existence for 12 years and makes money; it has no VC ownership and isn’t looking for funding. But it has something of the small, cool, start-up feel, even if it is based near Loughborough in what will eventually be recognised as the important East Midlands high-tech cluster (remember, you heard it here first). There are 85 employees of 22 nationalities.
The essence of its cleverness already comes from the principle of dealing with digital images in a way that is based on an understanding of how the eye processes them. So, for example, Apical’s “Assertive” displays make the images they present clearer and brighter by enhancing those aspects of the image that will be picked up by the eye, rather than simply by whacking up the overall brightness and power levels. This means the screen appears brighter at the same time as it actually uses less power.
What Apical did next was to look at the way the visual system handled images of human movement, again so as to enable processing at the edge. We know a lot about how people move their body parts about, and often it’s those aspects of an image that contain the information that we need. Which way is someone moving, or facing? Are they standing up, sitting down, or falling over?
Heuristics about how images change as a result of these kinds of movement can be used to pre-process the image into information before, rather than after, it is presented to a human for interpretation. As information it can be handled by other kinds of data processing system, in a way that video can’t. Indeed, Apical claims that its approach could be used to define a new “internet of behaviour”, alongside the other “internets” like the internet of things. I can’t see that happening, mind you.
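To make the idea of turning video into information a bit more concrete, here is a minimal sketch in Python. It is emphatically not Apical's method, which I haven't seen and which runs in silicon. It swaps in the crudest possible heuristic, comparing each frame against a picture of the empty scene, and every name and threshold in it (`changed_fraction`, `frames_to_events`, `pixel_delta`, `occupied_above`) is my own invention for illustration. The point is only that what leaves the device is a couple of small events rather than the frames themselves.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List

# Hypothetical representation: a greyscale frame as rows of 0-255 pixel values.
Frame = List[List[int]]


@dataclass(frozen=True)
class Event:
    frame_index: int
    kind: str  # "entered" or "left"


def changed_fraction(frame: Frame, background: Frame, pixel_delta: int = 30) -> float:
    """Fraction of pixels differing from the background by more than pixel_delta."""
    total = 0
    changed = 0
    for row, bg_row in zip(frame, background):
        for value, bg_value in zip(row, bg_row):
            total += 1
            if abs(value - bg_value) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


def frames_to_events(frames: Iterable[Frame], background: Frame,
                     occupied_above: float = 0.05) -> Iterator[Event]:
    """Yield an event only when the scene flips between empty and occupied."""
    occupied = False
    for index, frame in enumerate(frames):
        now_occupied = changed_fraction(frame, background) > occupied_above
        if now_occupied != occupied:
            yield Event(index, "entered" if now_occupied else "left")
            occupied = now_occupied


# Toy input: a 4x4 empty room, then a bright blob appears and disappears.
empty = [[0] * 4 for _ in range(4)]
person = [[0, 0, 0, 0],
          [0, 200, 200, 0],
          [0, 200, 200, 0],
          [0, 0, 0, 0]]

events = list(frames_to_events([empty, empty, person, person, empty], empty))
for event in events:
    print(event)
```

Five frames go in and two small records come out, one for the arrival and one for the departure. A real system would need something far more robust than background subtraction, but the shape of the output is the interesting part: it is data that an ordinary data processing system can store, query and count.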
Apical calls its emerging technology ‘Spirit’. It is not yet commercial. Apical has it working on FPGA chips, which are typically used for prototyping. The next task is to put it into ASICs; it could be on the market by the end of 2015, says Tusch.
There are lots of obvious applications. The big commercial driver is in retail, where there is a strong motivation to see where people are moving and what they are looking at, but without storing details about who they are. Another application could be the kind of image processing that is going to be necessary for autonomous driving to become a reality, where cars will need to rapidly identify pedestrians.
According to Tusch, “It’s the possibility to predict a pedestrian’s behaviour not by their motion but by their pose: someone standing on the kerb looking into the street is more of a concern to the car than someone standing in the same place facing a different direction. Today’s ADAS systems are good at detecting people walking across the road in front of the car, but that’s not what we’re doing when we drive through a town centre and are constantly assessing the ‘threats’ that might come across our path. To me that’s the big challenge, and I think Spirit has the ingredients for this.”
It’s nice to be able to report that Apical is not interested in “homeland security” applications, which are not hard to imagine. The technology could be deployed to improve smart cameras, which could then automatically adjust themselves so as to better capture what they ‘knew’ about the content they were filming. It could also be implemented in devices that weren’t cameras, which could analyse and transfer the key aspects of the data represented by the image without capturing or storing the images themselves, thereby obviating at least some privacy concerns. This is, of course, how our brains deal with visual images captured by our eyes; our heads are not actually full of little pictures of what our eyes have seen.
These sensors, or Spirit-enabled cameras, could also be used in telecare deployments, where it’s necessary to know whether an older person is upright or moving about, but not to capture TV-quality images.
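As a rough illustration of how little data the telecare case needs, here is a small Python sketch. It assumes, purely for the sake of argument, that some upstream detector has already reduced the person to a bounding box, and it guesses at posture from the box's proportions. The `Box` type, the `posture` and `status_message` functions and the two ratio thresholds are all hypothetical, chosen by me and not taken from anything Apical has described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box around a detected person, in pixels."""
    width: float
    height: float


def posture(box: Box, upright_ratio: float = 1.5, lying_ratio: float = 0.75) -> str:
    """Guess posture from the height-to-width ratio (illustrative thresholds only)."""
    if box.width <= 0 or box.height <= 0:
        return "unknown"
    ratio = box.height / box.width
    if ratio >= upright_ratio:
        return "upright"   # clearly taller than wide
    if ratio <= lying_ratio:
        return "lying"     # clearly wider than tall
    return "uncertain"     # sitting, crouching, or a poor detection


def status_message(box: Box, moving: bool) -> dict:
    """The only thing that would need to leave the device: no pixels at all."""
    return {"posture": posture(box), "moving": moving}


print(status_message(Box(width=40, height=120), moving=True))
print(status_message(Box(width=120, height=40), moving=False))
```

The message that gets sent is a posture label and a flag. Whether a box's proportions are a good enough signal is another matter; pose, as Tusch says, is richer than that.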
Of course, as with all of these things, the most important application may turn out to be none of these, but the underlying technological approach seems to be very promising.
Short personal after note: more than 30 years ago I wrote an undergraduate dissertation on “line finding in artificial intelligence programs” which discussed the way in which such programs could work better by including more environmental information such as texture gradients.
There was a presumption in those days that the trick was to exclude information that wasn’t about the presence of an edge. I was inspired to take a different approach from that of most of the academics in my department by the work of William Clocksin, who I am pleased to see is still publishing really interesting work in this area, and lists “a conceptual frameworks for artificial intelligence (AI) that gives priority to social relationships as determining intelligent behaviour” as one of his research interests.