Frequently Asked Questions
How can I help?
Is the Archive Team affiliated with the Internet Archive (archive.org)?
Not directly. A few members are affiliated, but the majority of Archive Team members are volunteers who help when they are not busy with work or school.
How should I go about backing things up?
What would you like to back up? If you want to mirror or back up a website, the de facto tool is Wget (but there are many more; see Software!). WARC files are highly recommended because they can be ingested by the Wayback Machine.
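As a rough illustration of the Wget approach, the sketch below assembles a Wget command line that mirrors a site and writes a WARC file at the same time. It assumes a Wget build with WARC support (1.14 or later) is installed; the function names, the example URL, and the output name are hypothetical, and the one-second wait is only a suggested politeness setting, not an Archive Team requirement.

```python
import shlex
import subprocess


def build_wget_warc_command(url, warc_name):
    """Return the wget argument list for mirroring `url` into a WARC file.

    `warc_name` is a base name; wget appends the .warc.gz extension itself.
    """
    return [
        "wget",
        "--mirror",                  # recursive download with timestamping
        "--page-requisites",         # also fetch images, CSS and scripts
        "--warc-file=" + warc_name,  # write a WARC next to the mirrored files
        "--warc-cdx",                # also write a CDX index of the records
        "--wait=1",                  # assumed politeness delay between requests
        url,
    ]


def mirror_to_warc(url, warc_name):
    """Run wget; raises CalledProcessError if wget exits with an error."""
    subprocess.run(build_wget_warc_command(url, warc_name), check=True)


# Show the command without running it (hypothetical URL and output name).
print(shlex.join(build_wget_warc_command("https://example.com/", "example-mirror")))
```

Calling `mirror_to_warc("https://example.com/", "example-mirror")` would start the actual download. Keeping the command builder separate from the runner makes it easy to inspect the flags before anything touches the network.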
What are these WARC files in the Internet Archive? How do I extract files from a WARC file?
WARC files are the de facto medium for digital preservation of the web. These WARC files are ingested by the Wayback Machine.
A growing number of tools that can manipulate WARC files are listed in The WARC Ecosystem.
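To give a feel for what extracting files from a WARC involves, here is a minimal sketch that uses only the Python standard library. It is not one of the tools from The WARC Ecosystem: it assumes well-formed records with a `Content-Length` header, ignores folded header lines, and does not undo chunked transfer encoding or content compression inside the HTTP response. The function names are hypothetical; a dedicated WARC tool is the better choice for real work.

```python
import gzip
import io


def open_warc(path):
    """Open a .warc or .warc.gz file for binary reading."""
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")


def iter_warc_records(stream):
    """Yield (headers, block) for each record in a WARC byte stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the record headers
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        block = stream.read(int(headers["Content-Length"]))
        yield headers, block


def http_payload(block):
    """Return the bytes after the HTTP headers of a 'response' record block."""
    _head, separator, payload = block.partition(b"\r\n\r\n")
    return payload if separator else b""


# Tiny in-memory demonstration with a single made-up response record.
_http = b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello"
_demo = (
    b"WARC/1.0\r\nWARC-Type: response\r\n"
    b"WARC-Target-URI: http://example.com/\r\n"
    b"Content-Length: " + str(len(_http)).encode() + b"\r\n\r\n"
    + _http + b"\r\n\r\n"
)
for headers, block in iter_warc_records(io.BytesIO(_demo)):
    if headers.get("WARC-Type") == "response":
        print(headers["WARC-Target-URI"], http_payload(block))
```

For a file on disk, `iter_warc_records(open_warc("some-file.warc.gz"))` would walk the records the same way; the file name here is only a placeholder.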
Where do all the saved files go?
Files are ultimately uploaded to the Internet Archive, in the Archive Team collection.
I think there is a web site that's going to shut down / sun set / end its incredible journey. Can you save it?
Try searching the wiki for a page about the specific website to find out more about what happened. Typically, there are several ways of recovering files from the Internet Archive:
- A specially crafted username lookup page created by Archive Team
  - Allows you to search by your username and presents the relevant materials. Only a small set of projects have this feature.
- The Internet Archive's Wayback Machine
  - This method is the easiest for most users, but some web pages take months to show up in the Wayback Machine.
- Individual WARC files uploaded to the Internet Archive
  - This method is the most accurate but requires power-user skills in working with WARC files. Also, WARC files produced by the Internet Archive are not publicly available (but the ones by Archive Team are always available).
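For the Wayback Machine route, a page can also be checked programmatically. The sketch below queries the Wayback Machine availability endpoint at `https://archive.org/wayback/available` and returns the closest snapshot URL, if any. The function names are hypothetical, and the response layout (`archived_snapshots` containing a `closest` entry) is assumed from the public API as commonly documented; treat it as a starting point rather than a definitive client.

```python
import json
import urllib.parse
import urllib.request

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; `timestamp` is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response_text):
    """Return the closest snapshot URL from a JSON response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def lookup(page_url, timestamp=None):
    """Ask the Wayback Machine for the closest capture (needs network access)."""
    query = availability_url(page_url, timestamp)
    with urllib.request.urlopen(query, timeout=30) as response:
        return closest_snapshot(response.read().decode("utf-8"))
```

A call such as `lookup("example.com")` returns a `web.archive.org` URL when a capture exists and `None` otherwise. As noted above, a page that was saved only recently may not show up yet.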
Can someone remove or fix something on the Internet Archive (archive.org)?
Is there a backup of the data on the archiveteam.org website? If so where can I download it?
Additionally, an XML dump of the MediaWiki database (which can be imported into any MediaWiki installation) is accessible at http://www.archiveteam.org/dumps. New backups are currently pushed out once a week (more often if changes on the site require it). All images are also wrapped into an images.tar.gz file, and our entire images directory is available at http://www.archiveteam.org/images.
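If you only want to inspect a dump rather than import it into MediaWiki, the XML can be read directly. The sketch below streams through a MediaWiki export and lists page titles. It assumes the standard export layout (`<page>` elements containing a `<title>`); the function names and the sample data are hypothetical, and the actual file names under the dumps directory are not assumed here.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}page' becomes 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(xml_stream):
    """Yield the title of every <page> in a MediaWiki XML export stream."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) == "page":
            for child in elem:
                if local_name(child.tag) == "title":
                    yield child.text
                    break
            elem.clear()  # free the page's memory once it has been handled


# In-memory sample standing in for a real dump file.
_sample = (
    b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">'
    b"<page><title>Main Page</title><revision><text>hi</text></revision></page>"
    b"<page><title>Frequently Asked Questions</title></page>"
    b"</mediawiki>"
)
for title in iter_page_titles(io.BytesIO(_sample)):
    print(title)
```

For a downloaded dump, pass an open binary file instead of the `io.BytesIO` sample. Streaming with `iterparse` avoids loading the whole dump into memory, which matters for larger wikis.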
Is there a mirror of the archiveteam.org website?
There is a backup from August 3, 2011 available. The main things that are not included are site history, the edit and source views of pages, special pages, and other minor links (see "Not Crawled.txt"). Click here to download.
I went through the wiki and I still have a question! How do I contact the Archive Team?