Microsoft has signed a deal with the British Library to scan 25 million pages from the library's collection and make them available on its MSN Book Search site next year.
Around 100,000 books from the British Library's 13-million-book collection will be digitised.
The agreement comes as Microsoft's competitors, including Google and Yahoo, are aggressively compiling online libraries of books amid copyright concerns. The titles to be scanned at the British Library are no longer under copyright restrictions.
Microsoft has contracted the Internet Archive, a non-profit group based in San Francisco that works on digital preservation projects, to do the scanning, said Richard Boulderstone, director of e-Strategy for the British Library. Microsoft is not paying the library for access, but the library will benefit, as it has been working on digitisation for the last 10 years.
Despite a decade of work, only 0.2 percent of the library's vast collection has been digitised, Boulderstone said. "Actually, for us to have some of these commercial players come along and want to work with us on digitising these collections, it's fantastic," he said.
Microsoft's announcement comes as Google announces a significant addition of scans from public-domain books to its Google Print site. The company is working from the collections of the University of Michigan, Harvard University, Stanford University, the New York Public Library and Oxford University.
Google is also facing two copyright infringement lawsuits over the scanning of copyright works in those collections, a practice that the company has halted but vows to resume, citing laws that allow certain liberties with the use of protected material. Google said it will focus on out-of-print and older selections.
Yahoo and Microsoft have thrown their support behind the Open Content Alliance (OCA), a group working to digitise public-domain texts and films.
MSN said last week it is talking with libraries and publishers about offering copyrighted material in its index. Microsoft eventually plans to build a business model around the search service for copyright works, but so far has said it doesn't intend to charge for searches of non-copyrighted material.