Xfire

Xfire
URL: http://social.xfire.com
Status: Closing
Archiving status: Unknown
Archiving type: Unknown
IRC channel: #xfired (on hackint)

June 10, #xfired:

<bzc6p> So, I've spent some hours on some algorithm despite the fact that an ArchiveTeam go wouldn't be appropriate at the moment.
<bzc6p> Users won't even be served by the deadline.
<bzc6p> And the site has a fuckton of content, and discovery would be necessary – all of which we don't have time for.
<bzc6p> If nothing changes, we must see it burn away. Users, save your content while you can.
<bzc6p> In case some change comes, e.g. deadline changes, and we have a chance to archive, a good start may be what I've written on the wiki page.
<bzc6p> But until then, I won't waste a second on it – hope you'll understand, it's all in vain now.
<bzc6p> (It does not prevent anyone from working on it if he has nothing better to do, though.)

June 12, #xfired:

<bzc6p> When we got informed about the closure, me, achip and another guy (outsider) jumped on it. Later we found out a few things:
<bzc6p> (1) We had two days. (It should have closed on Friday the 12th, but it's still up, so this may be irrelevant now.)
<bzc6p> (2) Users were trying to export their stuff. Later it turned out that the export tool basically does nothing, so the fear that we'd interfere with them is not relevant anymore.
<bzc6p> (3) The site is too big, and discovery is also necessary. Given that we have some time (unknown how much), at least we can start grabbing (the two points above are irrelevant).
<bzc6p> But the size is a big concern! Videos are IDed up to 6xxxxx and alphanumeric. That's hundreds of millions of videos, even if they are usually short. Not to mention the screenshots and the 24 million user profiles. I think it's in the hundreds-of-terabytes range, and definitely not a few days' work.
<bzc6p> After realizing these things, and that I don't have time to deal with it, I stopped developing. However, I created a dirty bash script, just for the algorithm, which, based on a few tests, grabs screenshots, videos, the user page and the friend list pretty well, with a few more TODOs and testing left.

Shutdown

Notice posted on June 10th; shutdown was scheduled for the 12th, but the site is still up as of the 17th.

http://www.reddit.com/r/Games/comments/39a41v/xfire_social_profiles_shutdown_save_your/

Archiving

Users have to wait days for their content to be exported, and the export tool barely works. The site seems slow either way.

There may be hundreds of millions of videos, tens of millions of profiles, and who knows how many screenshots.
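
As a rough sanity check on the video count (assuming the IDs really go up to six case-insensitive alphanumeric characters, as mentioned in the IRC log above), the ID space alone runs into the billions, so even a sparsely used fraction of it would mean hundreds of millions of items:

# Upper bound on a 6-character alphanumeric ID space (36 symbols per
# position); the ID format is an assumption taken from the IRC log.
echo $((36**6))   # 2176782336 possible IDs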

Probably too late for ArchiveTeam.

bzc6p's dirty bash script (may be buggy)

This only captures the algorithm; it should be rewritten in Lua/Python/whatever if we actually try to save the site. It is not really tested and may be incomplete.

There may be other things worth saving (e.g. games, communities (database broken?)), but this covers only the most important content: videos, screenshots, friends, the profile page and the avatar. It spits out a file named LIST containing the URLs to be downloaded (WARCed). Favorite servers and gaming history may already be on the profile page. For those to work, and for the screenshots to be shown, the JavaScript files must be available; however, the links are broken: the JS files are not found on xfire.com, but they are found e.g. on 208.whatever.

The sole parameter is the user id.

The downloaded content should be modified afterwards (some addresses replaced, e.g. 208.whatever rewritten to social.xfire.com or classic.xfire.com). See the reddit thread.
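
A minimal sketch of such a post-processing pass, assuming the pages were saved as plain HTML files and that social.xfire.com is the right hostname to put back (which hostname to use is exactly the open question the reddit thread discusses):

# Rewrite the raw IP used by the script back to a proper hostname in
# every saved HTML file (file layout and hostname are assumptions).
find . -name '*.html' -exec sed -i 's/208\.88\.178\.38/social.xfire.com/g' {} +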

It does not include page requisites; those must be downloaded separately, once.
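
One way to grab the requisites a single time would be a one-off wget run with --page-requisites against any profile page. This is only a hedged sketch; the user id and WARC name below are placeholders, and the broken JS links mentioned above may still need to be fetched by hand:

# Fetch the CSS/JS/images referenced by one profile page into a WARC
# of their own; user id 1 and the WARC name are placeholders.
wget --page-requisites --span-hosts --warc-file=xfire-requisites "http://208.88.178.38/profile/1/"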

#!/bin/bash
HOST="208.88.178.38"
rm -f LIST
# Profile page and friend list
echo "http://$HOST/profile/$1/" >> LIST
echo "http://$HOST/friends/$1/" >> LIST
# Avatar image URL
wget "http://$HOST/profile/$1/" -O - | grep "src='http://screenshot.xfire.com/avatar/" | cut -d"'" -f 6 >> LIST
# List of the user's screenshot albums
wget "http://$HOST/profile/$1/screenshots/" -O - | grep "href=\"/profile/$1/screenshots/" | cut -d'"' -f 2 | sed "s/^/http:\/\/$HOST/g" > screenshots
while read line
do
  # Determine the highest page number of this album; re-fetch until the value stabilizes
  MAXPAGE=32000
  OLDMAXPAGE=0
  while [ $OLDMAXPAGE -ne $MAXPAGE ]
  do
    OLDMAXPAGE=$MAXPAGE
    MAXPAGE=`wget $line -O - | grep "page=" | tail -1 | cut -d"=" -f 3 | cut -d"&" -f 1`
    if [ -z "$MAXPAGE" ]; then
      MAXPAGE=0
      break
    fi
  done
  rm -f albumpages
  echo $line >> albumpages
  [ $MAXPAGE -ge 1 ] && echo "$line?page=0&count=24" >> LIST
  for (( i=1; i<=$MAXPAGE; i++))
  do
    echo "$line?page=$i&count=24" >> albumpages
  done
  cat albumpages >> LIST
  while read line2
  do
    rm -f album
    wget $line2 -O album
    # Screenshot image URLs as linked (-1), plus the -3 and -4 variants of each
    grep 'src="http://screenshot.xfire.com/s/' album | cut -d'"' -f 2 > thumbnames
    cat thumbnames >> LIST
    cat thumbnames | sed "s/-1/-3/g" >> LIST
    cat thumbnames | sed "s/-1/-4/g" >> LIST
    rm thumbnames
    # Per-screenshot view pages: prepend the album name first, then the host prefix
    grep 'href="?view#' album | cut -d'"' -f 2 | sed "s/^/`echo $line2 | cut -d'/' -f 7 | cut -d'?' -f 1`\//g" | sed "s/^/http:\/\/$HOST\/profile\/screenshots\/$1\//g" >> LIST
    rm album
    # TODO: support for comments
  done < albumpages
  rm albumpages
done < screenshots
rm screenshots

echo "http://$HOST/profile/$1/videos/" >> LIST
wget "http://$HOST/profile/$1/videos/" -O - | grep "href=\"/profile/$1/videos/" | cut -d'"' -f 2 | sed "s/^/http:\/\/$HOST/g" > videos
while read line
do
  # This is just speculative, but probably it works just as with the screenshots.
  # Determine the highest page number of this album; re-fetch until the value stabilizes
  MAXPAGE=32000
  OLDMAXPAGE=0
  while [ $OLDMAXPAGE -ne $MAXPAGE ]
  do
    OLDMAXPAGE=$MAXPAGE
    MAXPAGE=`wget $line -O - | grep "page=" | tail -1 | cut -d"=" -f 3 | cut -d"&" -f 1`
    if [ -z "$MAXPAGE" ]; then
      MAXPAGE=0
      break
    fi
  done
  rm -f albumpages
  echo $line >> albumpages
  [ $MAXPAGE -ge 1 ] && echo "$line?page=0&count=24" >> LIST
  for (( i=1; i<=$MAXPAGE; i++))
  do
    echo "$line?page=$i&count=24" >> albumpages
  done
  cat albumpages >> LIST
  while read line2
  do
    rm -f album
    wget $line2 -O album
    grep "video.xfire.com" album | cut -d'"' -f 4 >> LIST
    grep "video.xfire.com" album | cut -d'"' -f 2 | sed "s/^/http:\/\/$HOST/g" > videos
    rm album
    cat videos >> LIST
    while read line3
    do
      wget $line3 -O - | grep "\.mp4" | cut -d"'" -f 2 >> LIST
      # TODO: support for comments
    done < videos
    rm videos
  done < albumpages
  rm albumpages
done < videos
rm videos
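
A possible way to run the script and feed the resulting LIST into a WARC-writing wget, as a sketch only; the script filename xfire-user.sh and the user id are placeholders:

# Build the URL list for one user, then download everything into a WARC.
bash xfire-user.sh 1234567
wget -i LIST --warc-file=xfire-user-1234567 -e robots=off --tries=3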