Difference between revisions of "User:Djsmiley2k"

From Archiveteam

Revision as of 07:52, 10 April 2013

Stuff

  • Need to figure out the full wiki/site layout - currently everything is a giant mishmash
  • Will set fire to anyone who breaks the nice design changes
  • While HTML in pages can make them look "nice", it's ****ing annoying to try and edit nicely if you're not an HTML expert - look into converting to proper MediaWiki markup instead
    • Can we get some templates for projects (what is a project!?) / archive tasks / other crap

Generic Wget command

 # Browser-like user agent sent with every request
 export USER_AGENT="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
 # URL of the site to save (fill in before running)
 export SAVE_HOST=""
 # Base name for the WARC output; wget appends the extension (fill in before running)
 export WARC_NAME=""
 wget \
 -e robots=off --mirror --page-requisites \
 --waitretry 5 --timeout 60 --tries 5 --wait 1 \
 --warc-header "operator: Archive Team" --warc-cdx --warc-file="$WARC_NAME" \
 -U "$USER_AGENT" "$SAVE_HOST"
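
A minimal sketch of one way to fill in the two empty variables before running the command above. The helper name `warc_name_for`, the `example.com` URL, and the host-plus-date naming scheme are all assumptions for illustration, not something this page prescribes.

```shell
#!/bin/sh
# Hypothetical helper: build a WARC base name from a URL and a date stamp.
# Usage: warc_name_for URL [STAMP]   (STAMP defaults to today's date, YYYYMMDD)
warc_name_for() {
    url=$1
    stamp=${2:-$(date +%Y%m%d)}
    # Strip the scheme ("http://"), then everything from the first slash onward.
    host=${url#*://}
    host=${host%%/*}
    printf '%s-%s\n' "$host" "$stamp"
}

# Placeholder target; replace with the site being saved.
SAVE_HOST="http://example.com/"
WARC_NAME=$(warc_name_for "$SAVE_HOST" 20130410)
export SAVE_HOST WARC_NAME

echo "$WARC_NAME"   # prints example.com-20130410
```

Passing the date explicitly keeps the name reproducible; leaving it off uses the current date.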


Limit Warrior b/w

VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3

Must be done while VM is powered off - can't be done with saved state. :(
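
Going by the VirtualBox manual, creating a bandwidth group may not throttle anything by itself: the group also has to be attached to one of the VM's network adapters with `modifyvm --nicbandwidthgroup<N>`. The sketch below assumes the warrior uses adapter 1 and that the installed VirtualBox accepts the same `bandwidthctl` syntax as the command above. It only prints the commands unless `DRY_RUN=0` is set; the `run` and `limit_warrior_bandwidth` helpers are hypothetical.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: create a network bandwidth group and attach it to NIC 1.
# The VM must be powered off (not in a saved state), as noted above.
limit_warrior_bandwidth() {
    vm=$1
    group=$2
    limit=$3
    run VBoxManage bandwidthctl "$vm" --name "$group" --add network --limit "$limit"
    # Assumption: the warrior's traffic goes through network adapter 1.
    run VBoxManage modifyvm "$vm" --nicbandwidthgroup1 "$group"
}

limit_warrior_bandwidth archiveteam-warrior-2 Limit 3
```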

Remote warrior control

Either forward the warrior's web interface port over ssh to the local system:

ssh -L 8001:localhost:8001 tim.bowers@xxx.xxx.xxx.xxx -f -N 

OR

curl -d "project_name=punchfork" http://localhost:8001/api/select-project
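
The curl call above can be wrapped in a small function so the project name and port are not hard-coded. This is only a sketch: `select_project` is a hypothetical helper, and it assumes the warrior's web interface is reachable on localhost, either directly or through the ssh forward shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: ask a warrior to switch to the given project.
# Usage: select_project PROJECT [PORT]   (PORT defaults to 8001)
select_project() {
    project=$1
    port=${2:-8001}
    # --fail makes curl exit non-zero when the server returns an HTTP error.
    curl --fail --silent --show-error \
        -d "project_name=$project" \
        "http://localhost:$port/api/select-project"
}

# Only call the API when a project name is given on the command line.
if [ "$#" -gt 0 ]; then
    select_project "$@"
fi
```

Saved as a script, it could be run as `sh select-project.sh punchfork`, or with a second argument such as `8002` when the forward uses a different local port.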

New Versions

main page


Important URLs

Is the rsync host up?


EC2 Instance setups

debian-squeeze-i386-warrior (ami-9c69f1f5)

User Text: {"downloader": "Smiley", "selected_project": "posterous", "concurrent_items": "6", "shared:rsync_threads": "4"}

Add second disk - 10 GB

Open port 22 to 0.0.0.0/0

Set up SSH forwarding: ssh -i ./.ssh/amazonkey.pem -N -f -L 8002:localhost:8001 ubuntu@***********.compute-1.amazonaws.com

Set automatic shutdown: echo "0 20 * * * root /sbin/shutdown -h now" | sudo tee /etc/cron.d/shutdown
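
The user text for the instance is a single line of JSON. A small sketch for generating it with different values follows; `warrior_user_text` is a hypothetical helper, and the field names are simply copied from the example above.

```shell
#!/bin/sh
# Hypothetical helper: print warrior user text as one line of JSON.
# Values are inserted verbatim, so they must not contain quotes or backslashes.
# Usage: warrior_user_text DOWNLOADER PROJECT CONCURRENT_ITEMS RSYNC_THREADS
warrior_user_text() {
    downloader=$1
    project=$2
    items=$3
    rsync_threads=$4
    printf '{"downloader": "%s", "selected_project": "%s", "concurrent_items": "%s", "shared:rsync_threads": "%s"}\n' \
        "$downloader" "$project" "$items" "$rsync_threads"
}

# Reproduces the user text used for the instance above.
warrior_user_text Smiley posterous 6 4
```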

Digital Ocean

sign up for DO -> use SSDTWEET code -> make a $10 payment -> unleash 500 instances upon the world

apt-get update && \
apt-get -y install git make python-pip libgnutls-dev liblua5.1-dev && \
pip install seesaw && \
git clone https://github.com/ArchiveTeam/yahoomessages-grab.git && \
cd yahoomessages-grab/ && \
./get-wget-lua.sh && \
run-pipeline pipeline.py --disable-web-server Smiley