Difference between revisions of "Ispygames"

From Archiveteam

Revision as of 11:50, 23 February 2013

The News

IGN hit with layoffs, 1UP, UGO and GameSpy shutting down
1UP, UGO and GameSpy to be shut down

The Problems

Once you start digging around these sites, you find a mess of inconsistent URL schemes, with content scattered everywhere. Some files are hosted on MediaFire.

What we know

  • We already have a list of almost all the domains involved
  • A clean list, with duplicates and bad domains removed, is already being processed and will be posted here when complete.
  • Most of the sites are not that big, but a few are huge.
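A minimal sketch of how a raw domain list could be cleaned up, assuming one hostname per line on standard input. The function name `clean_domains` and the file names in the usage note are hypothetical. "Bad" here only means syntactically malformed; this does not check whether a host still resolves, and it is not necessarily how the list mentioned above is being produced.

```shell
#!/bin/sh
# Hypothetical helper: normalize, filter, and de-duplicate a list of hostnames.
# Reads one hostname per line on stdin, writes the cleaned list to stdout.
clean_domains() {
    # Lowercase everything so "A.example.com" and "a.example.com" collapse.
    tr '[:upper:]' '[:lower:]' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | grep -E '^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$' \
        | sort -u
    # sed trims surrounding whitespace, grep drops blank or malformed lines,
    # and sort -u removes the duplicates.
}
```

Usage, with assumed file names: `clean_domains < domains-raw.txt > domains-clean.txt`.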


wget test command

# User agent string sent with every request
USER_AGENT="Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
# Site to mirror; also used as the WARC file name
SAVE_HOST="planetdoom.gamespy.com"

# Mirror the site into a WARC, following links only to the listed media hosts
wget -e robots=off --mirror --page-requisites --waitretry 5 --timeout 60 \
--tries 10 --warc-header "operator: Archive Team" --warc-cdx \
"$SAVE_HOST" --warc-file="$SAVE_HOST" --wait 4 -U "$USER_AGENT" \
--span-hosts --domains="$SAVE_HOST,pcmedia.gamespy.com,pnmedia.gamespy.com,pspmedia.gamespy.com,oystatic.ignimgs.com"
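Once a domain list exists, the test command could be repeated per host. The sketch below only prints the command for each hostname read from standard input, so it can be reviewed before anything is fetched. The function name `print_wget_commands` and the `MEDIA_HOSTS` variable are hypothetical, and reusing the planetdoom media-host list for every site is an assumption that may not hold for other sites.

```shell
#!/bin/sh
# Same user agent as the test command.
USER_AGENT="Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
# Assumption: the media hosts from the planetdoom test apply to every site.
MEDIA_HOSTS="pcmedia.gamespy.com,pnmedia.gamespy.com,pspmedia.gamespy.com,oystatic.ignimgs.com"

# Hypothetical helper: print (do not run) one wget command per host on stdin.
print_wget_commands() {
    while IFS= read -r host; do
        # Skip blank lines in the input list.
        [ -n "$host" ] || continue
        printf '%s\n' "wget -e robots=off --mirror --page-requisites --waitretry 5 --timeout 60 --tries 10 --warc-header \"operator: Archive Team\" --warc-cdx \"$host\" --warc-file=\"$host\" --wait 4 -U \"$USER_AGENT\" --span-hosts --domains=\"$host,$MEDIA_HOSTS\""
    done
}
```

For example, `echo planetdoom.gamespy.com | print_wget_commands` prints a single command line equivalent to the test command above.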