The European Union's proposal to give internet users the “right to be forgotten” is unfeasible, according to Bob Plumridge, chief technology officer (CTO) for EMEA at Hitachi Data Systems.
At the start of this year, the European Commission proposed a new law that would allow people to demand that organisations that hold their data delete that data, unless there are “legitimate” grounds to retain it.
European Union Justice Commissioner Viviane Reding said that the new rules would help build trust in online services because people would be better informed about their rights and more in control of their information.
While the intentions behind this proposed legislation are good, Plumridge said there is a gap between what policy makers would like to do and what technology is actually capable of doing.
“Say someone comes to me and says, I want you to erase all the information you hold on me. I could do that off the online systems pretty easily. But various studies show that, on average, most corporates have up to nine copies of any unique data,” he says.
“Some of that data could be locked in a vault in a mountain somewhere because that’s the ultimate DR protection. So do I have to go and delete those individual records every time somebody says please delete everything you hold on me? Doesn’t sound very practical to me.”
Plumridge believes that the people who are coming up with ideas to improve privacy do not understand the limits of today's technologies. He said that, for the right to be forgotten to become a reality, all of these individual records would need to be linked together.
“You would have your master copy, and every time a copy of that is made, there would be a link between the original and the copy. So if the original is altered, this modification is cascaded through all the copies,” he says.
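The linkage Plumridge describes could be sketched roughly as follows. This is a hypothetical illustration, not a description of any real product: each copy keeps a link back to its master record, so an erasure request against the master cascades to every copy ever made.

```python
# Hypothetical sketch of the "linked copies" idea: every copy made from a
# master record is registered against it, so changes to the master can be
# cascaded through all copies. All names here are illustrative.

class MasterRecord:
    def __init__(self, subject_id, data):
        self.subject_id = subject_id
        self.data = data
        self.copies = []            # links to every copy ever made

    def make_copy(self):
        """Create a backup/archive copy and remember the link to it."""
        copy = dict(self.data)
        self.copies.append(copy)
        return copy

    def erase(self):
        """Honour a deletion request: clear the master record and
        cascade the erasure through every linked copy."""
        self.data.clear()
        for copy in self.copies:
            copy.clear()

record = MasterRecord("user-42", {"name": "Alice", "email": "alice@example.com"})
backup = record.make_copy()         # e.g. the copy "in a vault in a mountain"
record.erase()
assert backup == {}                 # the archived copy was erased too
```

The sketch also makes the drawback obvious: the same link that lets a deletion propagate lets a corruption propagate, which is exactly the disaster-recovery objection raised below.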
However, having links between all copies of the data means that if one copy is corrupted, all the others are at risk of being affected. Part of having a good disaster recovery policy is ensuring that back-up copies are not affected by any event that befalls the master copy.
Furthermore, some businesses are required to hold onto data for compliance reasons. For example, the Financial Services Authority (FSA) requires banks to keep details of all banking transactions for a number of years, so they would not be able to comply with a deletion request, even if that customer had left the bank.
“I think what’s happening is these things are coming out of the policy-making side,” he says. “People are starting to look at them and say is that really practical? Good idea but maybe we can't do it now. Maybe we’ll be able to do it in two years' time.”
Data and the real-time economy
The question of how personal data should be used is becoming increasingly important in the real-time economy. Companies with access to your data are not only able to see who you are and how to contact you, but where you are right now and what you are doing – information that is potentially very sensitive.
This information is extremely valuable to companies, because it gives them deep insight into who their customers are, allowing them to tightly focus their marketing efforts and respond quickly to customer demand. However, collecting data is not an end in itself – the key is to know how to make use of that data, while still protecting customers' privacy.
Some companies are now employing specialist data scientists to analyse their data and compile reports that will inform future corporate policy. This often involves depersonalising data, so that it cannot be connected with a particular individual.
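As a rough illustration of what depersonalisation can involve, the sketch below drops direct identifiers, replaces the customer key with a salted hash, and generalises exact values into bands. The field names, salt and banding are invented for the example.

```python
# Illustrative sketch of depersonalising a record before analysis:
# direct identifiers are dropped, the customer key is replaced with a
# salted hash, and exact age is generalised into a ten-year band.
import hashlib

SALT = b"rotate-this-secret"    # hypothetical per-dataset salt

def depersonalise(record):
    pseudonym = hashlib.sha256(SALT + record["customer_id"].encode()).hexdigest()[:12]
    return {
        "id": pseudonym,                        # cannot be reversed without the salt
        "age_band": record["age"] // 10 * 10,   # e.g. 37 -> 30
        "spend": record["spend"],
    }

row = depersonalise({"customer_id": "C-1001", "name": "Bob", "age": 37, "spend": 129.50})
assert "name" not in row and row["age_band"] == 30
```

Even this simple version hints at the scaling problem Plumridge raises next: every record has to pass through the transformation, which becomes costly as data volumes grow.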
Plumridge says that, while this is an interesting concept, it will ultimately become impractical as volumes of data grow. Hitachi has therefore been investing heavily in R&D around analytical engines, data protection and data linkage technologies that can automate data processing.
“I think it's inevitable that the Big Data revolution will rely on machine-to-machine communication,” he says. “I just cannot see how an individual would be able to trawl through what is potentially megabytes, or even terabytes, of data. Even if they could, the result they come up with would be so long after the event that the result would be almost worthless.”
He gives the example of the Japanese bullet trains, which are built by Hitachi. In the past, employees had to walk along the track at night and identify where maintenance work needed to be done. However, about nine months ago, the company started mounting digital CCTV cameras externally on the trains to record the condition of the tracks as they go along.
When a train arrives at its destination it offloads the data to a server running an analytical engine, which goes through the data and identifies areas of the track that require maintenance work. That information is then packaged up and sent directly to the various maintenance depots along the length of the track.
“Within an hour of a train pulling into the station, the analytics stuff has been done and the maintenance work is prepared. This also means that the maintenance teams know exactly what has been done and why it was done, and that performs part of their archive of the rail system,” he explains.
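The pipeline described above could be sketched in miniature as follows. The segment IDs, wear scores, threshold and depot mapping are all invented for illustration; the point is simply the shape of the flow, from offloaded footage to per-depot work orders.

```python
# Hypothetical sketch of the track-inspection flow: analyse offloaded
# footage per track segment, then route flagged segments to the depot
# responsible for them. All names and thresholds are invented.

def analyse_footage(frames):
    """Return the segment IDs whose wear score exceeds a threshold."""
    WEAR_THRESHOLD = 0.8                # assumed cut-off for this sketch
    return [seg for seg, score in frames if score > WEAR_THRESHOLD]

def dispatch(defective_segments, depot_for_segment):
    """Group flagged segments into work orders by maintenance depot."""
    work_orders = {}
    for seg in defective_segments:
        work_orders.setdefault(depot_for_segment[seg], []).append(seg)
    return work_orders

frames = [("seg-01", 0.2), ("seg-02", 0.9), ("seg-03", 0.85)]
depots = {"seg-01": "Tokyo", "seg-02": "Nagoya", "seg-03": "Nagoya"}
orders = dispatch(analyse_footage(frames), depots)
assert orders == {"Nagoya": ["seg-02", "seg-03"]}
```

The work orders themselves double as a maintenance record, which is the archival benefit Plumridge points to.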
Robust infrastructure needed
Machine-to-machine (M2M) communication can also be used to enable responsive traffic management, healthcare and distribution of resources. Plumridge says the real-time economy will require decreasing amounts of human intervention, enabling companies to focus on ways to improve their business.
However, this also requires significant investment in storage hardware and networking infrastructure, to provide the capacity needed for large-scale data analysis.
“If you think about individuals, a lot of us have grown up in this whole period where you don't have to worry about capacity. If you're using an iPhone or laptop, it's not often that your disk gets full, so people tend not to bother deleting stuff,” says Plumridge.
“Then of course, every time you attach your laptop to your corporate network all that data gets backed up, so there's another copy of it, and at some point it will get archived so then there's another copy of it. When you start then to think about the quantities that a lot of these big organisations are storing, those numbers become huge.”
Organisations must invest in the infrastructure to support all this data, because they cannot afford for it to go wrong. If the real-time economy stops being real-time, the impact is immediate, as the recent outages at RBS and NatWest showed.
This means that, as infrastructures are being renewed and upgraded, people are integrating more robust M2M communications, and spending as much time thinking about the bandwidth they need as about the types of applications they are using.
“We’re looking to develop the ability to be able to store all of this data in such a way that in two years time, if someone comes up with a brilliant way of re-analysing it, the data’s there, it’s usable, it’s understandable,” says Plumridge.
As the value of data becomes increasingly clear, some companies are choosing to transform their entire business models to make use of this data. Plumridge cites a Danish company called Vestas, which started off building wind turbines.
When deciding where to install wind turbines, it is essential to know environmental information like average wind speed, altitude, temperature and distance from power cables. Vestas therefore spent 12 years analysing wind patterns and weather patterns across the world in 3km by 3km blocks.
“Their business is no longer building wind turbines, it’s now selling this data to people who want to put up wind turbines, because in a 3km block they can tell you exactly what the weather pattern has been there for the last 12 years,” he says.
“So total business transformation on the back of something they were doing to help them sell wind turbines. I think we’re going to start hearing more and more stories like that.”
In many ways, the data revolution is only just beginning. The advent of the Internet of Things will open up a whole new range of possibilities, and Plumridge believes that future applications will not be limited by the technology available, but by people's ability to dream up new use cases.
“It's quite incredible actually, the potential of this stuff, and one of the limiting factors is our ability to think what could be done,” he says.