Difference between revisions of "User:Sanqui"

From Archiveteam
(What I did and what I do)
(add wget command)
Line 12: Line 12:


Currently trying to organize saving the sites hosted for free on [[Internet Centrum]].
=== Parallel wget ===
<pre>
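#!/bin/sh
# LIST: file with one site per line (bare hostnames are assumed here,
# since % is also used in the output file names).
# PARALLEL: number of wget processes to run at once.
: "${LIST:?set LIST to the site list file}" "${PARALLEL:?set PARALLEL to the job count}"
# -I % already runs one wget per input line, so -n 1 is not needed.
xargs -P "$PARALLEL" -I % \
wget \
-mc --waitretry 5 --timeout 60 --tries 5 \
--user-agent="Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:28.0) Gecko/20100101 Firefox/28.0" \
-e robots=off \
--warc-header "operator: Archive Team (Sanqui)" --warc-cdx --warc-file="sanqui_00_%" \
-o "sanqui_00_%.log" \
% < "$LIST"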
</pre>

Revision as of 23:36, 4 February 2015

Email: gsanky@gmail.com

On #archiveteam as Sanqui

Further contact info on my homepage: http://sanqui.rustedlogic.net




I archived Twitch Plays Pokémon logs: https://archive.org/details/tpp_logs

Currently trying to organize saving the sites hosted for free on Internet Centrum.


Parallel wget

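#!/bin/sh
# LIST: file with one site per line (bare hostnames are assumed here,
# since % is also used in the output file names).
# PARALLEL: number of wget processes to run at once.
: "${LIST:?set LIST to the site list file}" "${PARALLEL:?set PARALLEL to the job count}"
# -I % already runs one wget per input line, so -n 1 is not needed.
xargs -P "$PARALLEL" -I % \
 wget \
 -mc --waitretry 5 --timeout 60 --tries 5 \
 --user-agent="Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:28.0) Gecko/20100101 Firefox/28.0" \
 -e robots=off \
 --warc-header "operator: Archive Team (Sanqui)" --warc-cdx --warc-file="sanqui_00_%" \
 -o "sanqui_00_%.log" \
 % < "$LIST"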
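
Before starting a long crawl, it can help to preview the commands that xargs will build from the list. The following is only a minimal sketch and not part of the script above: preview_commands is a hypothetical helper, the hostnames are placeholders, and the wget options are shortened for readability. It assumes an xargs that supports -P (GNU and BSD xargs do).

```shell
#!/bin/sh
# Dry run: print each wget invocation instead of executing it.
# preview_commands is a hypothetical helper for illustration only.
preview_commands() {
    list=$1      # file with one site per line
    parallel=$2  # number of concurrent jobs
    # xargs replaces every % in the arguments with the current input line.
    xargs -P "$parallel" -I % \
        echo wget -mc --warc-file="sanqui_00_%" -o "sanqui_00_%.log" % < "$list"
}

# Example list with placeholder hostnames.
sample=$(mktemp)
printf '%s\n' example.com example.org > "$sample"
# Parallel jobs may finish in any order, so sort for stable output.
preview_commands "$sample" 2 | sort
rm -f "$sample"
```

Each printed line shows the WARC and log file names that a site would get, which makes it easy to spot list entries that would produce unusable file names before any download starts.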