Brewster Kahle gave a great talk about archiving the web.
Here is the MP3.
Jeff blogged it:
Brewster Kahle arguing that “universal access to all knowledge is possible.” Well, drat, I have to run out for a few minutes and I’ll miss universal knowledge. I’ll pick it up on another blog. He says there are 26 million books in the Library of Congress, the largest in the world; more than half are out of copyright. That’s 26 terabytes of data that would cost $60k to store. He said it costs about $10 to scan a book. He’s working with a company in Toronto to get robotic help. So the cost is $260 million to scan the LOC.
blogged some of his notes:
Google announced that it will digitize in-print material and out-of-copyright works (like AMZN’s thing).
It costs $10/book to scan. They’re digitizing all the books in the Library of Alexandria, and they’re doing this in China, too.
A group in Toronto is building a robot scanner that will bring the cost of scanning a book down to $10 in the industrialized world, where labor is more expensive. At $10 per book, that’s $260 million to scan all the books.
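For anyone who wants to check the back-of-the-envelope numbers from the talk, here is a rough sketch in Python. The inputs are the figures quoted above (26 million books, $10 per scan, 26 terabytes, $60k of storage). The per-book size and the per-terabyte price are not stated in the notes; they are only what those figures imply, assuming decimal terabytes, and the variable names are mine.

```python
# Figures as quoted in the notes above.
BOOKS = 26_000_000          # books in the Library of Congress
SCAN_COST_PER_BOOK = 10     # dollars per book
STORAGE_TB = 26             # terabytes for the whole collection
STORAGE_COST = 60_000       # dollars to store those terabytes

# Total scanning cost: books times cost per book.
scan_total = BOOKS * SCAN_COST_PER_BOOK

# Implied values (assumption: 1 TB = 1,000,000 MB).
mb_per_book = STORAGE_TB * 1_000_000 / BOOKS
cost_per_tb = STORAGE_COST / STORAGE_TB

print(f"Scanning: ${scan_total:,}")
print(f"Implied size per book: {mb_per_book} MB")
print(f"Implied storage price: ${cost_per_tb:,.0f} per TB")
```

So the $260 million figure is just 26 million times $10, and the 26-terabyte estimate works out to about one megabyte per book.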