Difference between revisions of "Ispygames"
Revision as of 11:50, 23 February 2013
The News
IGN hit with layoffs, 1UP, UGO and GameSpy shutting down
1UP, UGO and GameSpy to be shut down
The Problems
Once you start digging around these sites, you find a mess of inconsistent URL schemes, with content scattered everywhere. Some files are hosted on MediaFire.
What we know
- We already have a list of almost all the domains involved.
- A clean list, with duplicates and bad domains removed, is already being processed and will be posted here when complete.
- Most of the sites are not that big, but a few are huge.
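Until the clean list is posted, a raw domain list can be tidied with standard tools. The following is only a sketch: the function name `clean_domains` is hypothetical, and it assumes that a "bad" domain means a line that is not a syntactically valid hostname, which may be narrower than what the list maintainers mean.

```shell
# Hypothetical helper (not part of the original page): read a raw domain
# list on stdin, one entry per line, and print a cleaned list on stdout.
# Assumption: "bad" means "not a syntactically valid hostname".
clean_domains() {
  tr -d '\r' \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -E '^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)+$' \
    | sort -u
}
```

For example, `clean_domains < raw-domains.txt > domains.txt`, where both file names are placeholders. The check is purely syntactic: it does not test whether a host still resolves or responds.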
wget test command
USER_AGENT="Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
SAVE_HOST="planetdoom.gamespy.com"
wget -e robots=off --mirror --page-requisites --waitretry 5 --timeout 60 \
--tries 10 --warc-header "operator: Archive Team" --warc-cdx \
"$SAVE_HOST" --warc-file="$SAVE_HOST" --wait 4 -U "$USER_AGENT" \
--span-hosts --domains=$SAVE_HOST,pcmedia.gamespy.com,pnmedia.gamespy.com,pspmedia.gamespy.com,oystatic.ignimgs.com
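Since many hosts are involved, the test command could be wrapped so that it runs once per host. This is a rough sketch, not an agreed procedure: the names `mirror_host`, `mirror_all`, `MEDIA_HOSTS` and `DRY_RUN` are made up for illustration, and it assumes that every host should span the same media domains as the planetdoom test, which may not hold for all sites. The wget options themselves are copied unchanged from the test command.

```shell
USER_AGENT="Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
# Assumption: the same media hosts apply to every site being mirrored.
MEDIA_HOSTS="pcmedia.gamespy.com,pnmedia.gamespy.com,pspmedia.gamespy.com,oystatic.ignimgs.com"

# Hypothetical wrapper: run the test command for a single host.
# With DRY_RUN=1 the command line is printed instead of executed.
mirror_host() {
  h="$1"
  set -- wget -e robots=off --mirror --page-requisites --waitretry 5 --timeout 60 \
    --tries 10 --warc-header "operator: Archive Team" --warc-cdx \
    "$h" --warc-file="$h" --wait 4 -U "$USER_AGENT" \
    --span-hosts --domains="$h,$MEDIA_HOSTS"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical driver: read one host per line on stdin, skipping blank lines.
mirror_all() {
  while IFS= read -r line; do
    if [ -n "$line" ]; then
      mirror_host "$line"
    fi
  done
}
```

Setting `DRY_RUN=1` before calling `mirror_all < domains.txt` (a placeholder file name) prints each command for review without fetching anything. Hosts are processed one after another, which keeps the `--wait 4` politeness delay meaningful; the few huge sites would probably be better run separately.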