Cory Doctorow has written an article about “petascale datacenters” for Nature. The article itself is a good read; I especially liked the description of the Internet Archive’s storage boxes:

Two racks, each the size of a modest refrigerator, each holding north of a petabyte’s worth of information. These are the PetaBoxes, the Internet Archive’s web-in-a-box systems.
[...]
Each rack draws 12 kilowatts, whereas a normal rack at the facility draws 4.5 kilowatts

That’s a lot of storage, and a lot of power! I wonder whether they use any special cooling techniques for these boxes. Or perhaps Cory misunderstood the description: according to the Internet Archive itself, these use 6 kW per rack, so the 12 kW might be the total for both racks. For some more specs and pictures of these storage clusters, see the Internet Archive’s description or Capricorn Technologies.

There is also some nice info about the tape backups at CERN. Cory:

Each robot-librarian tends 5 PB of data. It will jump shortly to 10 PB each when the 500-GB tapes are switched to 1-TB models — an upgrade that will take a year of continuous load/read/load/write/discard operations, running in the interstices between the data centre’s higher-priority tasks.

Again, these are some incredible numbers. It also means that the tape robots appear to be utilized pretty well: 5 PB on 500 GB tapes works out to about 10,000 tapes per robot, and a year has 8,760 hours, so if replacing all of them takes a year, about one tape is changed every hour per robot. There are also some amazing images of this tape robot; you can find all of them in this photoset. I’ve also found a small photoset from when it was still being filled.
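The tape-per-hour estimate is easy to check with a few lines of Python. This is only a back-of-envelope sketch: it assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes), a 365-day year, and that each robot really does hold exactly 5 PB on 500 GB tapes, none of which the article spells out. The variable names are mine.

```python
# Back-of-envelope check of the tape migration rate.
# Assumptions (not stated in the article): decimal units, 365-day year,
# every tape completely full.
PB = 10**15
GB = 10**9

data_per_robot = 5 * PB        # "Each robot-librarian tends 5 PB of data"
old_tape_capacity = 500 * GB   # the tapes being replaced
hours_per_year = 365 * 24      # the upgrade "will take a year"

tapes_per_robot = data_per_robot // old_tape_capacity
tapes_per_hour = tapes_per_robot / hours_per_year
minutes_per_tape = 60 / tapes_per_hour

print(f"{tapes_per_robot} tapes per robot")
print(f"{tapes_per_hour:.2f} tapes migrated per hour")
print(f"{minutes_per_tape:.1f} minutes per tape")
```

Under those assumptions each robot has to get through a tape roughly every 53 minutes, around the clock, for the whole year.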