Difference between revisions of "User:Sanqui"
Revision as of 23:36, 4 February 2015
Email: gsanky@gmail.com
On #archiveteam as Sanqui
Further contact info on my homepage: http://sanqui.rustedlogic.net
I archived Twitch Plays Pokémon logs: https://archive.org/details/tpp_logs
Currently trying to organize saving the sites hosted for free on Internet Centrum.
Parallel wget
#!/bin/sh
# Expects two variables from the environment:
#   LIST      file with one target per line
#   PARALLEL  number of wget processes to run at once
cat "$LIST" | xargs -n 1 -P "$PARALLEL" -I % \
    wget \
    -mc --waitretry 5 --timeout 60 --tries 5 \
    --user-agent="Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:28.0) Gecko/20100101 Firefox/28.0" \
    -e robots=off \
    --warc-header "operator: Archive Team (Sanqui)" --warc-cdx --warc-file="sanqui_00_%" \
    -o "sanqui_00_%.log" \
    %
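The script above substitutes each line of the list for % in three places: the WARC file name, the log file name, and the target itself. To check what would be run before starting a crawl, a dry run along the following lines may help. This is only a sketch: the function name preview_jobs and the sample host names are hypothetical and not part of the original script, and it prints shortened command lines with echo instead of calling wget.

```shell
#!/bin/sh
# Dry run of the parallel invocation: prints one wget command line per entry
# instead of executing it. preview_jobs and the sample hosts are hypothetical.
preview_jobs() {
    list=$1        # file with one target per line
    parallel=$2    # maximum number of concurrent jobs
    xargs -P "$parallel" -I % \
        echo wget --warc-file="sanqui_00_%" -o "sanqui_00_%.log" % < "$list"
}

# Example input: two placeholder hosts.
sample=$(mktemp)
printf '%s\n' example.com example.org > "$sample"

# With -P greater than 1 the output order is not guaranteed, so sort it.
preview_jobs "$sample" 2 | sort
rm -f "$sample"
```

Since % also ends up inside file names, entries containing slashes (for example full URLs) would most likely produce unusable WARC and log paths, so the list presumably holds bare host names. That is an inference from the command, not something the page states.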