Anyone who still believes that breaches and credential theft are a simple matter of criminals stealing data might find it illuminating to spend a few minutes in conversation with Digital Shadows CEO, James Chappell.
Breaches and credentials, he quickly suggests, have become a lot more complicated than that thanks to the sheer volume of data that is now out there. According to Chappell, it’s no longer enough simply to know that something bad has happened. Stolen data has turned from a problem of detection into one of deeper understanding, and organisations must arm themselves with new tools.
A case in point arrived earlier this month when small US cybersecurity consultancy Hold Security revealed that it had uncovered a cache of 1.17 billion stolen account credentials shown to it by a single Russian hacker. After analysing the data, Hold worked out that 272 million of these were unique, mainly from Google, AOL, Yahoo and Mail.ru webmail accounts. A total of 42.5 million of these credentials had never been seen before, another way of saying that nobody knew they had even been stolen.
The size of this cache drew attention to the industrial scale of credential theft as well as to an issue that has become more pressing with each new breach: how do organisations quickly assess whether their customers’ credentials are affected by a specific incident?
In most cases there is only one tool to hand - Google. The best that could be said of using a search engine to verify stolen credentials is that it’s better than nothing.
Grasping the nettle, Digital Shadows has developed a new feature for its SearchLight analysis platform that helps stressed organisations quickly work out whether an incident involves their customers or not. The platform can already scan for breached credentials across websites, social media, search engines, and the dark web on an ongoing basis, but it can now also be deployed retrospectively to spot the seams between newly reported breaches and multiple old or recycled ones.
“We had analysts crawling all over that,” says Chappell of the Hold Security cache. “Quickly it was clear that a lot of those were from previous breaches.”
Anyone using this tool would have had a rapid assessment of their potential exposure. If breached data turns out to be new, the next task is to understand how it might have ended up in the hands of criminals. There are several sources for breached data including straight database theft but also phishing attacks and malware campaigns, each with its own dynamics and set of business implications.
“Incident responders are in a horrible position. They have to work out whether it [the breach] is genuine or not - Google is not effective,” says Chappell.
“You need to automate the process at scale otherwise you can only ever approximate the risk.”
At the simplest level, Digital Shadows’ SearchLight platform allows the company’s analysts to compare the credentials in any new breach with what is already on record, but this is only the start. Other traces become important. Because a particular cache may have been assembled from different sources, the system can start to assess what those sources might be.
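To make the basic comparison step concrete, here is a minimal sketch of checking a newly surfaced cache against credentials already on record. It is an illustration only, not a description of how SearchLight works: the function names, the SHA-256 fingerprinting and the sample data are all assumptions made for the example.

```python
import hashlib

def fingerprint(email: str, password: str) -> str:
    """Hash a normalised credential pair so raw values need not be stored.
    (Hypothetical helper; the normalisation rule is an assumption.)"""
    normalised = f"{email.strip().lower()}:{password}"
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def triage(new_cache, known_fingerprints):
    """Split a new cache into previously seen and never-seen credentials."""
    seen, unseen = [], []
    for email, password in new_cache:
        if fingerprint(email, password) in known_fingerprints:
            seen.append((email, password))
        else:
            unseen.append((email, password))
    return seen, unseen

# Made-up sample data: one credential already on record, one new.
known = {fingerprint("alice@example.com", "hunter2")}
cache = [("Alice@example.com", "hunter2"), ("bob@example.com", "s3cret")]

seen, unseen = triage(cache, known)
print(f"{len(seen)} recycled, {len(unseen)} never seen before")
```

A set lookup keeps each check cheap even when the record of earlier breaches is very large, which matters at the scale of the Hold Security cache.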
According to Chappell, breach caches have what can best be described as their own identity or fingerprint that reveals how they were assembled.
“We can look at the entropy of the data. That tells you whether it’s been created by a machine or pulled together from a bunch of sources,” says Chappell. Even the distribution of certain letters in a cache of a given language such as English offers important clues. Just comparing different data sets is not on its own sufficient to understand them.
“It’s the credential in the context of other credentials.”
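As a rough illustration of the entropy idea Chappell describes, the sketch below computes the Shannon entropy of the characters in each password and averages it over a cache. How Digital Shadows actually measures entropy is not stated, so the per-character measure, the function names and the sample passwords here are assumptions.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def mean_entropy(passwords) -> float:
    """Average per-password entropy across a cache (0.0 for an empty cache)."""
    passwords = list(passwords)
    if not passwords:
        return 0.0
    return sum(shannon_entropy(p) for p in passwords) / len(passwords)

# Made-up samples: human-chosen strings versus random-looking ones.
human_like = ["password", "letmein", "aaaaaa"]
machine_like = ["q7Zp2xLw", "f9Kd03Vb", "T5mR8yNc"]

print(round(mean_entropy(human_like), 2))
print(round(mean_entropy(machine_like), 2))
```

Human-chosen passwords repeat characters and so tend to score lower than randomly generated strings. A cache whose scores cluster at one extreme, or whose distribution differs markedly from other caches, offers one hint about how it was assembled.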
Digital Shadows - data armageddon
It’s a subtle problem nobody anticipated; understanding breaches is turning out to be as critical as detecting them in the first place.
Sometimes a tiny number of credentials hidden inside a larger set can turn out to be important. It’s not always a numbers game. Chappell mentions a Pony malware campaign, detected by Digital Shadows, that successfully targeted an unnamed cloud application, compromising 50 companies. What made it stand out was that the modest number of compromised credentials belied their huge importance.
“Just ten lost credentials resulted in the breach of millions of records.”
And the future? Chappell is guarded.
“We may be approaching a breach singularity where all personally identifiable information (PII) is known,” warns Chappell. But what will decide the eventual impact of this frightening prospect is how quickly organisations can separate the serious thefts from the re-animated ones.
The days of sitting around in blissful ignorance are numbered. “Intelligence is like shining a lamp in a cave. You make choices about what you choose to illuminate.”