Simply changing the underlying way that data is stored doesn’t provide a long-term solution. We moved from FAT to NTFS to get around the limitations of the file system. Now we’ve run into the limitations of NTFS. With WinFS, which is to underlie Longhorn, the next version of Windows, Microsoft is trying to find a long-term solution. That means dealing with the way we work with data and improving its manageability.

With WinFS there is also a change in terminology. Files and objects are known as items. Items are the atomic unit of data you write, read and work with. WinFS items allow you to have structured data like complex types, semi-structured data such as XML and unstructured data such as the binary object in a file. (Refer to this Techworld feature.) This will be important for developers as they start to move forward with the WinFS APIs.

Items are persistent. WinFS is a store, one in which there can be different views of items.

The issues over relational technologies also need to be understood. There has been a lot of online discussion over the role of the relational components. Microsoft sees them as being essential to improving the ability to extract information. But it also accepts that however the relational technology is exposed, it must not impact on either developers or users.

Users are not going to want to learn how to manage their indexes, or restructure tables. Microsoft has said that the relational team are aware that the solution must be self-managing.

Developers are not going to want to learn T-SQL statements for use in their code. As a result, Microsoft has decided to use OPath inside WinFS. OPath allows you to search for objects inside an objectspace. Within WinFS, OPath will map managed code operators into SQL operators. The result is that developers will be able to concentrate on writing managed code and leave OPath to deal with the SQL operators.
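To make the idea of mapping code operators onto SQL operators concrete, here is a minimal sketch in Python. It is not OPath and does not use any WinFS API: the `Field` and `Condition` classes, the `items` table and its columns are all hypothetical, and SQLite stands in for the relational engine. It only illustrates how comparison operators written in ordinary code can be translated into a parameterized SQL filter that the developer never writes by hand.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """A SQL fragment plus the parameter values it needs (hypothetical helper)."""
    sql: str
    params: tuple

    def __and__(self, other):
        # The code-level `&` operator becomes SQL AND.
        return Condition(f"({self.sql} AND {other.sql})", self.params + other.params)

    def __or__(self, other):
        # The code-level `|` operator becomes SQL OR.
        return Condition(f"({self.sql} OR {other.sql})", self.params + other.params)


class Field:
    """A named property of an item; the name is trusted, the values are parameterized."""

    def __init__(self, name):
        self.name = name

    def _compare(self, op, value):
        return Condition(f"{self.name} {op} ?", (value,))

    def __eq__(self, value):
        return self._compare("=", value)

    def __gt__(self, value):
        return self._compare(">", value)

    def __lt__(self, value):
        return self._compare("<", value)


# The developer writes an ordinary expression; no SQL appears in their code.
query = (Field("author") == "Alice") & (Field("size") > 1024)

# Run the generated filter against an in-memory table standing in for the store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, author TEXT, size INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("plan.doc", "Alice", 2048), ("memo.doc", "Alice", 512), ("budget.xls", "Bob", 4096)],
)
rows = conn.execute(f"SELECT name FROM items WHERE {query.sql}", query.params).fetchall()
names = [row[0] for row in rows]
```

Here `query.sql` holds the generated filter and `names` holds the matching item names. A real implementation would also need to cover ordering, joins across relationships and type checking, none of which is attempted in this sketch.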

So how will it work?
Microsoft believes that the key to making data more manageable is to use metadata. Every object that you store will have associated metadata that is accessible to the OS and applications. You can search on the metadata to find what you want. You will be able to create folders by grouping all objects with a common piece of metadata. An object can sit in multiple folders, but only one copy of it will ever exist unless you insist on making duplicates.
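The folder behaviour described above can be sketched in a few lines of Python. This is an illustration only, under the assumption that a folder is simply a query over metadata. The `Item` and `Store` classes and the metadata keys are hypothetical and are not part of any WinFS API.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A stored object together with its metadata (hypothetical model)."""
    name: str
    metadata: dict = field(default_factory=dict)


class Store:
    """Holds exactly one copy of each item; folders are computed, not stored."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)
        return item

    def folder(self, key, value):
        # A folder is every item whose metadata has the given key/value pair.
        return [item for item in self._items if item.metadata.get(key) == value]


store = Store()
report = store.add(Item("q3-report.doc", {"client": "Acme", "type": "report"}))
store.add(Item("invoice-17.xls", {"client": "Acme", "type": "invoice"}))
store.add(Item("notes.txt", {"client": "Globex", "type": "report"}))

acme_folder = store.folder("client", "Acme")
report_folder = store.folder("type", "report")

# The same object appears in both folders; nothing was duplicated.
same_object = acme_folder[0] is report_folder[0]
```

Because both folders return references to the one stored `Item`, a change made through either folder is visible through the other, which is the property the text describes.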

Sounds wonderful? Well, maybe. Much will depend on how Microsoft is able to make it work. The idea of using metadata is not new. Take a look at most office suite products over the last 10 years. Vendors have used metadata to capture the names of the document authors, subject, keywords and other information. Now look at a selection of your own documents, spreadsheets or presentations. How many have accurate or usable metadata? Could you search across a block of directories and find all your output based on the metadata?

If Microsoft is going to make WinFS work, it also has to overcome the current apathy surrounding the use of metadata.

One way of doing that is to persuade users and developers that there are benefits to the use of metadata. One of the keys to this is the word “relationships”. By using metadata effectively, you can search for information using relationships between different items. Yet to make those relationships work, they need to be relevant to your data and your business.

As an example, look at the information held on a meeting. If you are suddenly told you have a meeting, the immediate response is to ask when, where, with whom, what it is about and a range of other questions. All of this information should be linked to the meeting because it forms part of the key information.
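As a rough illustration of the meeting example, the sketch below links a meeting to its related information through named relationships and then answers the "when, where, with whom" questions by following those links. The `RelationshipStore` class, the relationship names and the sample values are all invented for illustration and do not reflect how WinFS models relationships.

```python
from collections import defaultdict


class RelationshipStore:
    """Records named links from one item to another (hypothetical model)."""

    def __init__(self):
        # Maps a source item to a list of (relationship name, target item) pairs.
        self._links = defaultdict(list)

    def relate(self, source, relationship, target):
        self._links[source].append((relationship, target))

    def related(self, source, relationship=None):
        # Return every target linked to source, optionally filtered by relationship name.
        return [
            target
            for name, target in self._links.get(source, [])
            if relationship is None or name == relationship
        ]


links = RelationshipStore()
links.relate("Budget review", "when", "Tuesday 10:00")
links.relate("Budget review", "where", "Room 4")
links.relate("Budget review", "attendee", "Alice")
links.relate("Budget review", "attendee", "Bob")
links.relate("Budget review", "agenda", "budget-agenda.doc")

attendees = links.related("Budget review", "attendee")
everything = links.related("Budget review")
```

Asking for a single relationship answers one question about the meeting, while asking for all of them gathers everything linked to it in one step.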

The ability to find all data related to a particular client, irrespective of where it is, will appeal to both business and home users. More importantly, relationships mean that data can exist in multiple places at the same time. This is because the user simply has a “view” onto the data store. That view determines what they see. It is therefore possible for an item to validly exist in multiple views due to its relationships and metadata.

Part 3 moves the discussion on to Schemas and then draws the piece to a conclusion.