Difference between revisions of "Software"

From Archiveteam
Line 7:
* [http://crawler.archive.org/ Heritrix] -- what archive.org uses
* [http://pavuk.sourceforge.net/ Pavuk] -- a bit flaky, but very flexible
* [https://www.tweetscan.com/data.php TweetScan] -- download your Twitter archive, going back to December 2007, in CSV format (requires your Twitter account login and password)

Revision as of 00:06, 12 January 2009

General Tools

  • GNU Wget
  • cURL
  • HTTrack
  • Heritrix -- what archive.org uses
  • Pavuk -- a bit flaky, but very flexible
  • TweetScan -- download your Twitter archive, going back to December 2007, in CSV format (requires your Twitter account login and password)