Difference between revisions of "Ispygames"
Revision as of 11:56, 23 February 2013
The News
IGN hit with layoffs, 1UP, UGO and GameSpy shutting down
1UP, UGO and GameSpy to be shut down
The Problems
- Once you start digging around these sites, you find a mess of inconsistent URL schemes, with content scattered everywhere.
- Some files are being hosted on MediaFire.
What we know
- We already have a list of almost all the domains involved
- A clean list, with duplicates and bad domains removed, is already being processed and will be posted here when complete.
- Most of the sites are not that big, but a few are huge.
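One possible way to clean up a raw domain list is sketched below. This is only an illustration, not the process actually being used: the function name clean_domains and the file names in the usage comment are hypothetical, and "bad domain" is assumed here to mean any line that is not a syntactically valid hostname.

```shell
#!/bin/sh
# Hypothetical helper: read raw hostnames on stdin, write a clean list to stdout.
# Steps: lowercase everything, drop carriage returns, trim surrounding
# whitespace, keep only lines that look like a hostname (an assumed
# definition of "not bad"), then sort and remove duplicates.
clean_domains() {
    tr '[:upper:]' '[:lower:]' |
    tr -d '\r' |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' |
    grep -E '^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)+$' |
    sort -u
}

# Usage (file names are placeholders):
#   clean_domains < raw-domains.txt > domains.txt
```

This check is purely syntactic. It does not test whether a domain still resolves or belongs to the network, so dead or unrelated hosts would still need to be weeded out by hand.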
The plan
- Save the sites and related content
- Back up the Twitter feeds for any associated accounts. All my tweets (http://www.allmytweets.net/) just takes a username and returns as many tweets as possible.
wget test command
USER_AGENT="Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"
SAVE_HOST="planetdoom.gamespy.com"
wget -e robots=off --mirror --page-requisites --waitretry 5 --timeout 60 \
  --tries 10 --warc-header "operator: Archive Team" --warc-cdx \
  "$SAVE_HOST" --warc-file="$SAVE_HOST" --wait 4 -U "$USER_AGENT" \
  --span-hosts --domains=$SAVE_HOST,pcmedia.gamespy.com,pnmedia.gamespy.com,pspmedia.gamespy.com,oystatic.ignimgs.com
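If the test command works out, it could be wrapped in a loop that runs once per host. The sketch below is a rough, untested-at-scale outline that reuses the same wget flags. The function names mirror_host and mirror_all, the WGET override variable, and the domains.txt file in the usage comment are all assumptions made for illustration. The media host list is copied from the planetdoom test and may well need adjusting for other sites.

```shell
#!/bin/sh
USER_AGENT="Mozilla/5.0 (Windows; U; MSIE 9.0; Windows NT 9.0; en-US)"

# Media hosts taken from the planetdoom test command; other sites may
# use different ones (assumption: this list is a starting point only).
MEDIA_HOSTS="pcmedia.gamespy.com,pnmedia.gamespy.com,pspmedia.gamespy.com,oystatic.ignimgs.com"

# Overridable so the loop can be dry-run with a stand-in command.
WGET="${WGET:-wget}"

# Mirror a single host into a WARC named after that host.
mirror_host() {
    target="$1"
    # stdin is redirected so the command cannot consume the host list.
    "$WGET" -e robots=off --mirror --page-requisites --waitretry 5 --timeout 60 \
        --tries 10 --warc-header "operator: Archive Team" --warc-cdx \
        "$target" --warc-file="$target" --wait 4 -U "$USER_AGENT" \
        --span-hosts --domains="$target,$MEDIA_HOSTS" </dev/null
}

# Read one hostname per line from stdin and mirror each in turn.
mirror_all() {
    while IFS= read -r host; do
        [ -n "$host" ] || continue          # skip blank lines
        mirror_host "$host" || echo "wget failed for $host" >&2
    done
}

# Usage (domains.txt is a placeholder for the cleaned domain list):
#   mirror_all < domains.txt
```

Running hosts one after another keeps the load on each server low, at the cost of total run time. The few huge sites would probably be better handled separately.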