Difference between revisions of "Friendster"

From Archiveteam

Revision as of 17:17, 3 May 2011

Friendster
Friendster - Home 1304442914645.png
URL http://www.friendster.com/
Status Unknown
Archiving status Unknown
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)

Friendster is an early social networking site. On April 25th, 2011 it announced that most of the user-generated content on the site would be deleted on May 31st, 2011. Friendster is estimated to have over 115 million registered users.

Site Organization

Content on Friendster seems to be primarily organized by user ID number; IDs were assigned sequentially starting at 1. This should make it fairly easy for wget to scrape the site and for us to break the job up into convenient work units. The main components we need to scrape are the profile pages, photo albums, and blogs, but there may be others. More research is needed.
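
Because the IDs are sequential, splitting the job into work units can be as simple as cutting the ID space into fixed-size ranges. The following is a minimal sketch of that idea, not an agreed-upon scheme: the function name `work_units`, the unit size, and the upper ID bound in the demo are all assumptions chosen for illustration.

```python
from typing import Iterator, Tuple


def work_units(first_id: int, last_id: int, unit_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) user-ID ranges covering first_id..last_id.

    Hypothetical helper: the name and the range-based scheme are
    illustrative assumptions, not something the project has settled on.
    """
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    start = first_id
    while start <= last_id:
        # The last unit may be shorter than unit_size.
        end = min(start + unit_size - 1, last_id)
        yield (start, end)
        start = end + 1


if __name__ == "__main__":
    # Demo values only; the real upper bound of the ID space is not known here.
    for start, end in work_units(1, 250_000, 100_000):
        print(f"{start}-{end}")
```

Each range can then be handed to one volunteer, who scrapes every ID in it and reports back when the unit is done.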

Profiles

Profile pages have URLs of the form 'http://profiles.friendster.com/<userid>'. Many pictures on these pages are hosted at URLs that look like 'http://photos-p.friendster.com/photos/<lk>/<ji>/nnnnnijkl/<imageid>.jpg', but these folders aren't directly browsable. Profiles will not be easy to scrape with wget.

Photo Albums

A user's photo albums are listed at URLs that look like 'http://www.friendster.com/viewalbums.php?uid=<userid>', with individual albums at 'http://www.friendster.com/viewphotos.php?a=<album id>&uid=<userid>'. It appears that the individual photo pages use JavaScript to load the images, so they will be very hard to scrape.
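
The URL patterns described so far can be generated from a user ID and an album ID. This is a small sketch under a few assumptions: the function names are hypothetical, and it assumes both IDs are plain integers, which the pages observed so far suggest but which has not been confirmed.

```python
from urllib.parse import urlencode

# Base URLs taken from the patterns described above.
PROFILE_BASE = "http://profiles.friendster.com/"
ALBUM_LIST_BASE = "http://www.friendster.com/viewalbums.php"
ALBUM_BASE = "http://www.friendster.com/viewphotos.php"


def profile_url(user_id: int) -> str:
    """URL of a user's profile page."""
    return f"{PROFILE_BASE}{user_id}"


def album_list_url(user_id: int) -> str:
    """URL of the page listing all of a user's photo albums."""
    return f"{ALBUM_LIST_BASE}?{urlencode({'uid': user_id})}"


def album_url(album_id: int, user_id: int) -> str:
    """URL of a single album; parameter order matches the observed URLs."""
    return f"{ALBUM_BASE}?{urlencode({'a': album_id, 'uid': user_id})}"


if __name__ == "__main__":
    # The IDs below are made up for the demo.
    print(profile_url(42))
    print(album_list_url(42))
    print(album_url(7, 42))
```

A list of such URLs, one per line, could be fed to wget for a given work unit, though the album IDs themselves would first have to be discovered from each user's album list page.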

On the individual album pages, the photo thumbnails are stored under paths similar to those of the main images. That is, if the album thumbnail is at http://photos-p.friendster.com/photos/<lk>/<ji>/nnnnnijkl/<imageid>m.jpg, just drop the final 'm' to get the main photo (or replace it with a 't' to get an even smaller version).
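
The thumbnail rule above is easy to apply mechanically, which may let us skip the JavaScript-driven photo pages entirely. Here is a rough sketch of the rewrite; the function names are hypothetical, the example URL uses made-up path and image ID values, and the code assumes every album thumbnail ends in 'm.jpg' as described.

```python
import re

# Matches a thumbnail URL ending in "m.jpg" and captures everything before the "m".
_THUMB_RE = re.compile(r"^(?P<stem>.+)m\.jpg$")


def _stem(thumb_url: str) -> str:
    """Return the thumbnail URL without its trailing 'm.jpg'."""
    match = _THUMB_RE.match(thumb_url)
    if match is None:
        raise ValueError(f"not an album thumbnail URL: {thumb_url!r}")
    return match.group("stem")


def main_photo_url(thumb_url: str) -> str:
    """Drop the final 'm' to get the main photo."""
    return _stem(thumb_url) + ".jpg"


def tiny_photo_url(thumb_url: str) -> str:
    """Replace the final 'm' with 't' to get the smallest version."""
    return _stem(thumb_url) + "t.jpg"


if __name__ == "__main__":
    # Made-up example; the directory names and image ID are illustrative only.
    thumb = "http://photos-p.friendster.com/photos/43/21/123451234/987654321m.jpg"
    print(main_photo_url(thumb))
    print(tiny_photo_url(thumb))
```

URLs that do not end in 'm.jpg' raise a `ValueError` rather than being passed through, so unexpected thumbnail formats show up early instead of silently producing wrong links.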

Blogs

Unknown.

How to help

Scrape profiles