Difference between revisions of "Software"

From Archiveteam
m (Reverted edits by GeorgeHoward (talk) to last revision by Nemo bis)
Line 1: Line 1:
 
__NOTOC__
 +
== WARC Tools ==
 +
[[The_WARC_Ecosystem]] includes information on wget, Heritrix
 +
 
== General Tools ==
  
 
* [[Wget|GNU WGET]]
 
** Backing up a WordPress site: "wget --no-parent --no-clobber --html-extension --recursive --convert-links --page-requisites --user=<username> --password=<password> <path>"
* [[Wget with WARC output]]
 
 
* [http://curl.haxx.se/ cURL]
 
* [http://www.httrack.com/ HTTrack] - [[HTTrack options]]
* [http://crawler.archive.org/ Heritrix] -- what archive.org uses
 
 
* [http://pavuk.sourceforge.net/ Pavuk] -- a bit flaky, but very flexible
 
* http://warrick.cs.odu.edu/warrick.html
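
The WordPress backup command in the list above can be wrapped in a small script so the flags can be reviewed before anything is downloaded. This is only a sketch: the function name `wp_backup`, the `WP_URL`, `WP_USER`, `WP_PASS` and `RUN` variables, and the example.com address are placeholders invented for illustration, not something the wiki page defines. The wget flags are copied unchanged from the list entry.

```shell
#!/bin/sh
# Placeholder values (assumptions); override them in the environment.
WP_URL="${WP_URL:-http://example.com/blog/}"
WP_USER="${WP_USER:-username}"
WP_PASS="${WP_PASS:-password}"

# Hypothetical helper: prints the wget command by default and only
# executes it when RUN=1 is set.
wp_backup() {
    # Same flags as the list entry. Newer wget manuals document
    # --html-extension under the name --adjust-extension.
    set -- wget --no-parent --no-clobber --html-extension --recursive \
        --convert-links --page-requisites \
        --user="$WP_USER" --password="$WP_PASS" "$WP_URL"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"          # really run wget
    else
        echo "$@"     # dry run: show the command only
    fi
}

wp_backup
```

Running the script as-is only prints the command. Setting `RUN=1` together with real values for the three `WP_*` variables performs the download.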

Revision as of 19:39, 21 April 2013

WARC Tools

The_WARC_Ecosystem includes information on wget, Heritrix

General Tools

Hosted tools

Pinboard is a convenient social bookmarking service that will archive copies of all your bookmarks for online viewing. The catch is that it costs $9.25 just to join, plus $25/year for the archival feature, and you can only download archives of your 25 most recent bookmarks in a particular category. This may pose problems if you ever need to get your data out in a hurry.

Site-Specific

Format Specific