04 July 2017

The Atlantic: “Torching the Modern-Day Library of Alexandria”

By 2002, it seemed to Page like the time might be ripe to come back to books. With that 40-minute number in mind, he approached the University of Michigan, his alma mater and a world leader in book scanning, to find out what the state of the art in mass digitization looked like. Michigan told Page that at the current pace, digitizing their entire collection—7 million volumes—was going to take about a thousand years. Page, who’d by now given the problem some thought, replied that he thought Google could do it in six.

He offered the library a deal: You let us borrow all your books, he said, and we’ll scan them for you. You’ll end up with a digital copy of every volume in your collection, and Google will end up with access to one of the great untapped troves of data left in the world. Brin put Google’s lust for library books this way: “You have thousands of years of human knowledge, and probably the highest-quality knowledge is captured in books.” What if you could feed all the knowledge that’s locked up on paper to a search engine?

James Somers

A sad end to one of Google’s most interesting and, dare I say, altruistic projects: digitizing physical books. It could have made a huge library available to the entire world while also generating some revenue for authors; instead it got tangled up in complicated legal battles over copyright. Even though Google eventually won the case, it is only allowed to display snippets in search results, while the full text of the books remains locked away on its servers – at least until some other legislative or judicial decision overrules the current compromise, which satisfies neither the parties nor the public.

“Book publishing isn’t the healthiest industry in the world, and individual authors don’t make any money out of out-of-print books,” Cunard said to me. “Not that they would have made gazillions of dollars with Google Books and the Registry, but they would at least have been paid something for it. And most authors actually want their books to be read.”

“The greatest tragedy is we are still exactly where we were on the orphan works question. That stuff is just sitting out there gathering dust and decaying in physical libraries, and with very limited exceptions,” Mtima said, “nobody can use them. So everybody has lost and no one has won.”

I asked someone who used to have that job, what would it take to make the books viewable in full to everybody? I wanted to know how hard it would have been to unlock them. What’s standing between us and a digital public library of 25 million volumes?

You’d get in a lot of trouble, they said, but all you’d have to do, more or less, is write a single database query. You’d flip some access control bits from off to on. It might take a few minutes for the command to propagate.

