Follow Us

We use cookies to provide you with a better experience. If you continue to use this site, we'll assume you're happy with this. Alternatively, click here to find out how to manage these cookies

hide cookie message

Blogs

Views from the Lab

Accenture Technology Labs Staff

Data consistency problems in RDBMS applications

Article comments

A common misconception of applications built on the top of a Relational Database Management System is that they are virtually free of data consistency problems

The 2007 New York Times Bestseller, The Black Swan: The Impact of the Highly Improbable by Naseem Nicholas Taleb covers events characterised as 1) being outliers beyond normal expectation, 2) having extreme impact, and 3) being able to be rationalised after the fact. Are data consistency problems in RDBMS applications black swans in IT?
 
A common misconception of applications built on the top of a Relational Database Management System (RDBMS) is that they are virtually free of data consistency problems. This assumption hinges on the condition that applications use RDBMS’s transaction support correctly. But evidently that’s not quite the case. 
 
A recent study by researchers from Purdue University found that more than one hundred transaction-related data-consistency defects in popular web modules such as OpenCart and phpBB.  OpenCart is a widely available shopping cart module with over a million hits. One key function is managing coupons to ensure they are used no more than a maximum number of times. 

The study finds that the code where the coupon’s usage count is read from the database, and the code where the count is updated, reside in a two separate transactions. As a result, an over selling of coupons can happen when two or more users use the coupon at the same time. To perform correctly, the code should have been encapsulated in a single transaction.

This cross-transaction data flow issue is not new and it is a fairly common topic discussed in basic database courses. However, the study by Purdue indicates that this issue is more pervasive than many anticipated. Such a defect could be disastrous for some systems and is difficult to detect. 

It is not easy to spot during manual code review, as different database queries are often encapsulated in different classes and methods. It is also difficult to catch during the testing phase as it requires precise timing for interleaving of transactions to produce a wrong result.

The key in discovering these defects is to locate data dependencies between different database queries. If the value used in an update query comes from a read query, chances are these queries ought to be enclosed in the same transaction. We are not aware of any readily-available tools that can detect this defect automatically. 

However, we can configure dynamic tainting tools, such as Python tainting mode, CORE GRASP etc., to do so. Dynamic tainting tools mark inputs that percolate through the system and can locate data dependencies between two variables in a piece of code. 

Many of these tools are originally designed for detecting security vulnerabilities such as SQL injection by tracing the data flow of raw user input from the GUI into SQL query strings without going through any security check.
 
To locate data-consistency defects mentioned above, dynamic tainting tools have to be manually reconfigured by explicitly specifying the location of each read and update queries. This manual effort is not for the faint of heart, especially for large systems.  

But one can mitigate this issue by focusing on critical modules where defects like this can have significant implications. Ultimately, it is the developer’s responsibility to ensure the database transactions are implemented correctly.

Data consistency problems do occur in RDBMS applications.  They are difficult to detect and become increasingly so as we incorporate and reuse software modules by others.  

But, depending on the consequences of failure, identifying the vulnerability is worth the effort to “turn the Black Swans white.”  Taleb states that what may be a black swan surprise for a turkey is not a black swan surprise to its butcher.  So “avoid being the turkey” and consider data consistency issues in your RDBMS applications.

By Teresa Tung, Manager, Accenture Technology Labs and Chen Fu, researcher, Accenture Technology Labs

Enhanced by Zemanta

Share:

More from Techworld

More relevant IT news

Comments

Send to a friend

Email this article to a friend or colleague:

PLEASE NOTE: Your name is used only to let the recipient know who sent the story, and in case of transmission error. Both your name and the recipient's name and address will not be used for any other purpose.

Techworld White Papers

Choose – and Choose Wisely – the Right MSP for Your SMB

End users need a technology partner that provides transparency, enables productivity, delivers...

Download Whitepaper

10 Effective Habits of Indispensable IT Departments

It’s no secret that responsibilities are growing while budgets continue to shrink. Download this...

Download Whitepaper

Gartner Magic Quadrant for Enterprise Information Archiving

Enterprise information archiving is contributing to organisational needs for e-discovery and...

Download Whitepaper

Advancing the state of virtualised backups

Dell Software’s vRanger is a veteran of the virtualisation specific backup market. It was the...

Download Whitepaper

Techworld UK - Technology - Business

Innovation, productivity, agility and profit

Watch this on demand webinar which explores IT innovation, managed print services and business agility.

Techworld Mobile Site

Access Techworld's content on the move

Get the latest news, product reviews and downloads on your mobile device with Techworld's mobile site.

Find out more...

From Wow to How : Making mobile and cloud work for you

On demand Biztech Briefing - Learn how to effectively deliver mobile work styles and cloud services together.

Watch now...

Site Map

* *