Storage has always been a passion of mine. In 2000, I was a card-carrying SNIA member and served on an SNIA committee on certification paths that let IT pros prove their storage know-how. That was during my SAN/NAS days at CommVault Systems, when I wrote "Enterprise Storage Solutions" for Sybex with Chris Wolf, a noted virtualisation expert at The Burton Group. Times have certainly changed, judging from last week's Storage Networking World, which laid bare the raft of troubles today's organisations face when it comes to storing sensitive data.

First off, sensitive data left unprotected in the enterprise is on the rise. The number of internal-use databases that carry Social Security numbers, credit card information, and the like is startling. When it comes to unauthorised access, these databases - SQL and Access in particular - are prime and constant targets.

"There are a number of ways enterprises and vendors have tried to address this problem," Michael Mesaros from Dataguise told me, "and they each have their strengths and weaknesses."

Mesaros continued, "Application firewalls monitor database activity and seek to prevent unauthorized data access by users and malware. This does little good in preventing the spread of sensitive data accessed by authorized users and applications, however. Database encryption protects data from disclosure via direct disk access; however, data must be decrypted to be used by an application, and if the data is visible to the user, it can be stolen or compromised. Finally, data loss prevention [DLP] technologies can crawl databases to find sensitive data, index it, and create data 'fingerprints,' so the data could be recognized at an egress or endpoint. The limitation of DLP, however, is in its ability to provide managers and analysts with a view of how sensitive data is organized within the application."

In response to this increasing problem, Dataguise DgDiscover first locates, then searches the databases deployed on a company's network, returning statistics regarding the volume and location of sensitive information, including credit card numbers, Social Security numbers, personal identifiers, and custom-defined data types. This information allows managers to track sensitive data at the application level. Also, DgDiscover provides an easy way to identify data that requires masking when copied from production servers for non-production use.
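To make the discovery idea concrete, here is a minimal sketch in Python. It is not how DgDiscover works internally (I have no insight into that); it simply illustrates the general approach of crawling every table and counting values that look like Social Security numbers or credit card numbers. The names `classify` and `scan`, the regular expressions, and the choice of SQLite as the target are all my own assumptions for illustration. Card candidates are checked with the Luhn checksum to cut down on false positives.

```python
import re
import sqlite3

# Illustrative patterns only; real discovery tools use far richer rules.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(value):
    """Return 'ssn', 'card', or None for a single column value."""
    if not isinstance(value, str):
        return None
    if SSN_RE.search(value):
        return "ssn"
    match = CARD_RE.search(value)
    if match:
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            return "card"
    return None


def scan(conn: sqlite3.Connection) -> dict:
    """Count suspicious values, keyed by (table, column, kind)."""
    hits = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        cur = conn.execute(f'SELECT * FROM "{table}"')
        columns = [desc[0] for desc in cur.description]
        for row in cur:
            for column, value in zip(columns, row):
                kind = classify(value)
                if kind:
                    key = (table, column, kind)
                    hits[key] = hits.get(key, 0) + 1
    return hits


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE customers (name TEXT, ssn TEXT, card TEXT)")
    demo.execute("INSERT INTO customers VALUES "
                 "('Ann', '123-45-6789', '4111 1111 1111 1111')")
    for (table, column, kind), count in scan(demo).items():
        print(f"{table}.{column}: {count} value(s) that look like {kind}")
```

A full table scan like this is fine for a small test database but would need sampling or batching against anything large, and pattern matching alone will both miss data and raise false alarms.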

Admins then use DgMasker to select columns for masking, which are imported into DgMasker for fast masking-in-place. Masking with DgMasker allows enterprises to leverage application data for business analysis, test, development, and support activities without the risk of compromising sensitive information.

Watching a demonstration of Dataguise's solution was enlightening. On the one hand, the product was impressive; on the other, the data it returned scared me tremendously. How many products these days use a database on the back end, even just SQL Express? And you don't think about it once it is installed and the app is running, right? But that data is open to others. This made me curious whether other vendors offered solutions targeted at unsecured data in production enterprise environments.

One possibility to assist with protecting your environment is to use a data storage encryption solution, such as those offered by EMC, NetApp, Vormetric, and others. Once you discover the location of the sensitive data in your environment, possibly with a tool such as DgDiscover, you can then look to another vendor's tool to encrypt the data against theft.

Database masking, aka data sanitization or data scrambling, is one possible solution, as Dataguise's product shows. With database masking, the real data is obscured or replaced with "realistic, but not real data," and it is typically used more with non-production environments, including development, testing, and business analysis. Sometimes when you work with copies of a production environment in order to test or develop, the production data is protected, but the data that has been copied for testing is not.
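For a feel of what "realistic, but not real" means in practice, here is a small Python sketch of masking a column in place. It is an illustration under my own assumptions, not the method any vendor named here uses: the functions `mask_digits` and `mask_column` are hypothetical, the target is a SQLite test copy whose tables have the default `rowid`, and the masking rule is simply to swap each digit for a random one while keeping the length, the separators, and optionally the last few digits.

```python
import random
import sqlite3


def mask_digits(value: str, rng: random.Random, keep_last: int = 0) -> str:
    """Replace digits with random digits, preserving layout and length.

    The final `keep_last` digits are left untouched, which is a common
    convention for values such as card numbers.
    """
    total_digits = sum(ch.isdigit() for ch in value)
    out = []
    seen = 0
    for ch in value:
        if ch.isdigit():
            seen += 1
            if seen > total_digits - keep_last:
                out.append(ch)  # one of the trailing digits we keep
            else:
                out.append(str(rng.randrange(10)))
        else:
            out.append(ch)  # separators and letters pass through unchanged
    return "".join(out)


def mask_column(conn, table, column, keep_last=0, seed=None):
    """Mask every text value of one column in place (assumes a rowid table)."""
    rng = random.Random(seed)
    rows = conn.execute(
        f'SELECT rowid, "{column}" FROM "{table}"').fetchall()
    for rowid, value in rows:
        if isinstance(value, str):
            conn.execute(
                f'UPDATE "{table}" SET "{column}" = ? WHERE rowid = ?',
                (mask_digits(value, rng, keep_last), rowid))
    conn.commit()


if __name__ == "__main__":
    test_copy = sqlite3.connect(":memory:")
    test_copy.execute("CREATE TABLE customers (name TEXT, ssn TEXT)")
    test_copy.execute("INSERT INTO customers VALUES ('Ann', '123-45-6789')")
    mask_column(test_copy, "customers", "ssn", keep_last=4)
    print(test_copy.execute("SELECT name, ssn FROM customers").fetchall())
```

Two caveats on this sketch: a random digit can occasionally equal the one it replaces, and masked card numbers will generally no longer pass a checksum test, so an application that validates them would need a smarter rule. Run it only against the copy, never against production.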

Companies such as IBM, Oracle, Dataguise, and others provide the ability to mask the data. In many cases, the software solution you choose can automatically mask the sensitive data when the test or development copy is created, saving you the extra step of discovery and then masking.

The discussion was an eye-opener for me. It might be worth it for you to run a test of your own environment. See what databases exist and whether you might be able to glean some sensitive information. You might be surprised - and horrified - to see how open that data is. Mask it, encrypt it, lock it down.

If you are already aware and actively protecting your database data, lend some insight in the comments section as to what you use, how it is working for you, and what advice you might have for others to help them better protect their environment.