INTERNETARCHIVE.BAK/torrents implementation

From Archiveteam

Split the IA into 42,000 chunks of 500 GB each (about 21 PB in total), and package each chunk as a zip file.

Make a torrent for each chunk, 42,000 in total.

Make an interface that suggests a torrent to each user, either at random or by picking the one most in need of seeds.

Let users add one or more torrents and seed them.

For every 500 GB added or changed in the Internet Archive, make a new zip file and torrent, and wait for some users to add it. (This may need a mechanism to remind users who have free space to check for new torrents.)

This seems like the simplest possible solution.
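The suggestion step could be as small as the sketch below. It assumes the interface already knows the current seed count of each torrent, for example from a tracker scrape. The `suggest_torrent` function and the `chunk-…` ids are hypothetical names used only for illustration.

```python
import random


def suggest_torrent(seed_counts, rng=random):
    """Return the id of a torrent with the fewest seeds.

    seed_counts maps a torrent id to its current number of seeds
    (an assumed input, not something the proposal specifies).
    Ties are broken at random so that users are spread across
    equally needy torrents.
    """
    if not seed_counts:
        raise ValueError("no torrents to suggest")
    fewest = min(seed_counts.values())
    neediest = [tid for tid, count in seed_counts.items() if count == fewest]
    return rng.choice(neediest)


# Hypothetical seed counts for three chunks.
seed_counts = {"chunk-00001": 3, "chunk-00002": 0, "chunk-00003": 1}
print(suggest_torrent(seed_counts))  # chunk-00002
```

Purely random suggestion is the special case where every torrent is treated as equally needy.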


Note that some BitTorrent trackers carry torrents that sum to a larger total size than this and are seeded healthily. Their individual torrents tend to be smaller than 500 GB, though.

The Geocities torrent, at 900 GB, was an exceedingly large torrent, and there was some trouble keeping it seeded.

At 500 GB per chunk, this leaves out users who can donate only a smaller fraction of a disk. This might reduce the number of contributors significantly. A smaller chunk size might be better.
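The trade-off can be put in numbers. The sketch below takes the total implied by the figures above (42,000 chunks of 500 GB) and shows how the torrent count grows as the chunk size shrinks, alongside how many chunks a donor could hold. The 200 GB donor and both function names are assumptions chosen for illustration.

```python
# Total size implied by the proposal: 42,000 chunks of 500 GB each.
TOTAL_GB = 42000 * 500


def chunks_needed(chunk_gb):
    """Number of chunks (and torrents) needed at a given chunk size, rounded up."""
    return -(-TOTAL_GB // chunk_gb)


def chunks_user_can_hold(free_gb, chunk_gb):
    """Number of whole chunks that fit in a donor's free space."""
    return free_gb // chunk_gb


# Hypothetical donor with 200 GB to spare.
for chunk_gb in (500, 100, 50):
    print(chunk_gb, chunks_needed(chunk_gb), chunks_user_can_hold(200, chunk_gb))
# 500 42000 0
# 100 210000 2
# 50 420000 4
```

Smaller chunks admit more donors, but at the cost of many more torrents to create, track, and keep seeded.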

Users need to keep their torrent client running, or they won't be counted as seeds. Offline or rarely online storage can be used, but it won't be counted. So counting seeds will undercount the number of copies.

A simplification

Every IA item already has a torrent associated with it. The torrent includes the derived files, but that can be amended: each item could have its current torrent plus one that includes only the original files. The simplest possible solution, then, is to get a few seeders into each of these swarms (IA is used as a web seed). One way to accomplish that is to write a custom BitTorrent client that automates the process of deciding which swarms each user joins, allows the user to decide how much space to use, and so on. A custom BitTorrent client wouldn't be a very simple thing on its own, but it could be quite simple for users who just want to donate some space without having to think about BitTorrent.

This has the additional advantages of storing the backed-up files on disk in a format which is readily usable by the user, and of requiring little to no additional work on IA's part.
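The swarm-selection part of such a custom client could be sketched as follows. This is one possible greedy policy, not a description of any existing client: it assumes the client can obtain each item's size and current seeder count from somewhere, and `choose_swarms` and the `item-…` names are hypothetical.

```python
def choose_swarms(items, budget_bytes):
    """Pick swarms to join within the user's space budget.

    items is a list of (item_id, size_bytes, seeders) tuples (assumed
    inputs). Items with the fewest seeders are considered first, with
    smaller items first among equals, and an item is skipped if it no
    longer fits in the remaining budget.
    """
    chosen = []
    remaining = budget_bytes
    for item_id, size, seeders in sorted(items, key=lambda it: (it[2], it[1])):
        if size <= remaining:
            chosen.append(item_id)
            remaining -= size
    return chosen


# Hypothetical items; sizes are in arbitrary units for readability.
items = [
    ("item-a", 40, 0),
    ("item-b", 70, 1),
    ("item-c", 30, 1),
    ("item-d", 10, 5),
]
print(choose_swarms(items, 100))  # ['item-a', 'item-c', 'item-d']
```

The user only supplies the budget; which swarms to join is decided for them, which is the point of the simplification.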