Amazon search extends to text inside books

This will be blogged to death before long, but for good reason. Last night I saw a mention on a mailing list about Amazon offering full-text search. It didn’t really sink in until I used it. Now, when you enter search terms in the standard search box, the results include, alongside the usual title and author matches, books that contain your search terms somewhere inside them.

It’s a very simple idea, I suppose—the fact that it’s been implemented seems amazing, though. You get a little extract from each book containing your search term, with the page number. Click on the page number and you can see an image of the actual page. Amazon have scanned 33 million pages in over 120,000 books so far. You can browse the next or previous two pages to see the full context of the search match. And the search terms are highlighted on the page images.
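Nothing here says how Amazon actually built this, but the behaviour described above (match a term inside a book, return a short extract with its page number, highlight the term) can be sketched in a few lines of Python. Everything in the sketch is an assumption for illustration: the names `build_index`, `search` and `SAMPLE_BOOKS` are made up, books are plain lists of page strings rather than scanned images, and highlighting is done by wrapping matches in asterisks.

```python
import re
from collections import defaultdict

# Made-up sample data: each book is a list of page texts (page 1 first).
SAMPLE_BOOKS = {
    "Moby-Dick": ["Call me Ishmael.", "The whale surfaced near the ship."],
    "Origin": ["Whale bones resemble limbs."],
}


def build_index(books):
    """Map each lowercased word to the set of (title, page_number) pairs containing it."""
    index = defaultdict(set)
    for title, pages in books.items():
        for page_number, text in enumerate(pages, start=1):
            for word in re.findall(r"\w+", text.lower()):
                index[word].add((title, page_number))
    return index


def search(books, index, query, context=30):
    """Return (title, page_number, extract) for every page containing all query words."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    # A page is a hit only if every query word appears somewhere on it.
    hits = set.intersection(*(index.get(word, set()) for word in words))
    highlight = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    results = []
    for title, page_number in sorted(hits):
        text = books[title][page_number - 1]
        # Centre the extract on the first query word; fall back to the page start.
        match = re.search(r"\b" + re.escape(words[0]) + r"\b", text, re.IGNORECASE)
        centre_start, centre_end = (match.start(), match.end()) if match else (0, 0)
        start = max(centre_start - context, 0)
        end = min(centre_end + context, len(text))
        extract = highlight.sub(r"*\1*", text[start:end])
        results.append((title, page_number, extract))
    return results


if __name__ == "__main__":
    sample_index = build_index(SAMPLE_BOOKS)
    for hit in search(SAMPLE_BOOKS, sample_index, "whale"):
        print(hit)
```

The index is built once and then every query is a handful of set lookups, which is the usual reason for indexing rather than scanning each page per search. A real system working from page images would also need OCR and some way to map word positions back onto the scanned page, neither of which is attempted here.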

It seems like a very canny move on Amazon’s part, commercially speaking, but like all the best commercial ideas on the web, it weaves commerce and usefulness together almost seamlessly. Amazon’s already an invaluable tool for anyone composing bibliographies and suchlike, and I’ve often used it to find books on certain subjects that I then order via the local library system. Every now and then something I find seems worth parting with cash for, and I invariably get it at Amazon. In research terms, though, Amazon suddenly becomes an infinitely more useful tool, and hence a more seductive shopfront.

This seems slightly reminiscent of the race to sequence the human genome between the private company Celera Genomics and the international Human Genome Project. Project Gutenberg labours away trying to make all non-copyrighted books available for free, and when it scrapes past 6,000 titles, Amazon announces it’s scanned over 120,000 texts—only this is mostly copyrighted material, for sale.

It’s a very limited analogy, for sure. Yeah, I’m suspicious of all out-and-out money-making ventures, like any thinking person; far be it from me to toot their horns for them. But as the post-dot-com-boom web staggers along, its open-access origins weighed down with commercial concern and uncertainty, Amazon’s bold move seems like as good a compromise between business and utility as anything else.