Valhalla

From Archiveteam
Revision as of 22:42, 18 September 2014 by Mithrandir (talk | contribs) (IRC)

This wiki page is a collection of ideas for Project Valhalla.

<SketchCow> Basically, we have this situation where we have stuff that is being threatened,
and it's huge, and then it's either not so threatened or it's in a weird quantum state.
So, this really stretches the bounds of what IA does. It's a huge amount of data, it's not likely 
to be overly touched if the originals are up, and IA will spend/lose a lot of money pulling it into their infrastructure.
So maybe we can discuss actual, not pie-in-the-sky possibilities of what we can do to have some sort of not-IA pile of storage.

Join the discussion in #huntinggrounds.

Options

  • Hard drives
    • These would have to be live. HDDs decay quickly, and if they're not spinning, you can't detect failures.
    • Possible software for this kind of thing: Syncthing, Tahoe-LAFS, ...?
  • Commercial/archival-grade tapes
  • Consumer tape systems (VHS, Betamax, cassette tapes, ...)
  • Vinyl
  • PaperBack
  • Optar
  • Blu-ray: lasts a LOT longer than CD/DVD but should not be assumed to last more than a decade
    <Drevkevac> still, if its true, you could do, perhaps, raidz3s in groups of 15 disks or so?
    <SketchCow> Please add paperbak to the wiki page.
    <SketchCow> Fuck Optical Media. not an option;.
    <Drevkevac> that would give you ~300GB per disk group, with 3 disks
  • M-DISC
    • Unproven technology, but potentially interesting.
  • Flash media
    • Wears out quickly; not well suited to long-term storage
    • Soliciting donations for old flash media from people, or sponsorship from flash companies?
  • Glass/metal etching
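
The raidz3 arithmetic from the IRC excerpt above can be checked with a small sketch. It assumes that "with 3 disks" refers to the three parity discs of a raidz3 group, and that each disc holds 25GB (single-layer Blu-ray); neither is stated in the chat. The function names are hypothetical, and the 100TB target is the figure quoted in the IRC section of this page. Filesystem overhead is ignored.

```python
import math

def raidz3_usable_gb(discs_per_group, disc_capacity_gb, parity_discs=3):
    """Usable capacity of one group: data discs times per-disc capacity."""
    if discs_per_group <= parity_discs:
        raise ValueError("a group needs more discs than parity discs")
    return (discs_per_group - parity_discs) * disc_capacity_gb

def groups_needed(target_gb, usable_gb_per_group):
    """Number of whole groups required to hold target_gb."""
    return math.ceil(target_gb / usable_gb_per_group)

ASSUMED_DISC_GB = 25    # assumption: single-layer Blu-ray, not stated in the chat
DISCS_PER_GROUP = 15    # from the chat: "groups of 15 disks or so"
TARGET_GB = 100_000     # 100TB, taking 1 TB = 1000 GB

usable = raidz3_usable_gb(DISCS_PER_GROUP, ASSUMED_DISC_GB)
groups = groups_needed(TARGET_GB, usable)
total_discs = groups * DISCS_PER_GROUP

print(f"{usable} GB usable per group, {groups} groups, {total_discs} discs")
```

Under these assumptions a group yields 300GB, which matches the "~300GB per disk group" estimate, and a 100TB archive would need several thousand discs.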

Non-options

  • Ink-based consumer optical media (CDs, DVDs, etc.)
    • Differences between Blu-ray and DVD? DVDs do not last very long.
  • BitTorrent Sync
    • Proprietary (currently), so not a good idea to use as an archival format/platform
  • Amazon Glacier
    • Amazon Glacier seems like a great idea until you realize the price is 1 cent per gigabyte per month, which is $120 per terabyte per year. Transferring 100TB out would also cost over $10,000 in the month it's pulled from the system.
  • Floppies
    • "Because 1.4 trillion floppies exists less than 700 billion floppies. HYPOTHETICALLY, if you set twenty stacks side by side, figure a quarter centimeter per floppy thickness, excluded the size of the drive needed to read the floppies you would still need a structure 175,000 ft. high to house them. Let's also assume that the failure rate for floppies is about 5% (everyone knows that varies by brand, usage, time of manufacture, materials used, etc, but lets say 5% per year). 70 million of those 1.4 trillion floppies are unusuable. Figuring 1.4 MB per floppy disk, you are losing approximately 100MB of porn each year. Assuming it takes 5 seconds to replace a bad floppy, you would have to spend 97,222 hrs/yr to replace them. Considering there are only 8,760 hrs per year, you would require a staff of 12 people replacing floppies around the clock or 24 people on 12 hr shifts. Figuring $7/hr you would spend $367,920 on labor alone. Figuring a nickel per bad floppy, you would need $3,500,000 annually in floppy disks, bringing your 1TB floppy raid operating costs (excluding electricity, etc) to $3,867, 920 and a whole landfill of corrupted porn. Thank you for destroying the planet and bankrupting a small country with your floppy based porn RAID." (source)
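
The Amazon Glacier storage figure in the list above is simple to reproduce. The sketch below assumes 1 TB = 1000 GB and a flat rate of 1 cent per GB per month, as quoted; the function name is hypothetical. Retrieval and transfer charges are not modeled, since the page only gives a rough total for them. The 100TB and 25-year figures come from the IRC section of this page.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes

def storage_cost_usd(capacity_tb, months, cents_per_gb_month=1):
    """Flat-rate storage cost in USD; prices are kept in cents to avoid rounding noise."""
    return cents_per_gb_month * capacity_tb * GB_PER_TB * months / 100

per_tb_year = storage_cost_usd(1, 12)           # one terabyte for one year
archive_year = storage_cost_usd(100, 12)        # 100TB for one year
archive_25_years = storage_cost_usd(100, 12 * 25)

print(f"1 TB for a year:     ${per_tb_year:,.0f}")
print(f"100 TB for a year:   ${archive_year:,.0f}")
print(f"100 TB for 25 years: ${archive_25_years:,.0f}")
```

At the quoted rate, storage alone for 100TB comes to $12,000 per year before any data is read back out.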

From IRC

<Drevkevac> we are looking to store 100TB+ of media offline for 25+ years
<Drevkevac> if anyone wants to drop in, I will pastebin the chat log
<rat> DVDR and BR-R are not high volume. When you have massive amounts of data, raid arrays have too many points of failure.
<rat> Drevkevac: I work in a tv studio. We have 30+ years worth of tapes. And all of them are still good.
<rat> find a hard drive from 30 years ago and see how well it hooks up ;)
<brousch_> 1500 Taiyo Yuden Gold CD-Rs http://www.mediasupply.com/taiyo-yuden-gold-cd-rs.html

Costs

These are just estimates. Calculation: $/TB = Total Cost / Total Capacity

Purpose        Cost (USD) per TB
Tape media     $36.4
Hard drives    $43
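
As a rough illustration of the calculation above, the sketch below applies $/TB = Total Cost / Total Capacity and then projects the listed rates onto a 100TB archive (the size mentioned in the IRC section). The function name and the $430 / 10 TB purchase are hypothetical, chosen only to show the formula. The projection covers media only, not drives, readers, or power.

```python
def cost_per_tb(total_cost_usd, total_capacity_tb):
    """$/TB = total cost / total capacity."""
    if total_capacity_tb <= 0:
        raise ValueError("capacity must be positive")
    return total_cost_usd / total_capacity_tb

# Hypothetical purchase, for illustration only: $430 buys 10 TB of raw capacity.
example_rate = cost_per_tb(430.0, 10.0)

# Per-TB estimates as listed in the table on this page.
RATES_USD_PER_TB = {"Tape media": 36.4, "Hard drives": 43.0}
ARCHIVE_TB = 100

projected = {name: rate * ARCHIVE_TB for name, rate in RATES_USD_PER_TB.items()}

print(f"example rate: ${example_rate:.2f}/TB")
for name, cost in projected.items():
    print(f"{name}: about ${cost:,.0f} for {ARCHIVE_TB} TB")
```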