IBM has released the source code for its Unstructured Information Management Architecture (UIMA).
It hopes to encourage independent software vendors (ISVs) to use the framework to create complex, enterprise-ready text analytics applications. The code is now available as an open-source development project on SourceForge.net.
While traditional content and knowledge management applications allow users to search for terms, they don't allow searches for concepts or relationships between words in documents, websites or other text, said an IBM spokesman.
Complex text analytics applications from various vendors do provide that kind of analysis, but plugging them into existing search applications can be difficult because of code compatibility issues, he said.
The idea behind UIMA is to provide a standards-based platform that developers can use to create specialised text analysis applications, which users can then tie into the search applications of their choice. UIMA defines a common, standard interface that enables text analytics components from multiple vendors to work together.
"Customers had to do the integrations themselves because there are no interfaces" between proprietary text analysis applications and search products, the spokesman said. "They've had to custom-tie them together," which is often difficult and costly. "UIMA enables them to tie these things together more easily, providing plug-and-play in a common language."
Last August, IBM announced that more than 15 ISVs had pledged to support UIMA in their text analytics and search products. IBM also introduced its own offering, IBM WebSphere Information Integrator OmniFind Edition, based on UIMA.
Text analytics can comb through documents, comment and note fields, problem reports, e-mail, websites and other text-based information sources, according to IBM, which worked on the development of UIMA for more than four years.