On March 16, 2011, the Windows Live Spaces platform will shut down for good. Since September of last year, Microsoft has been notifying every user with an active Space to migrate it to WordPress using their Windows Live ID, save it to their hard drive, or delete it. Still, there are many long-abandoned blogs that have not been migrated and may not survive. For these reasons I decided to create this tutorial for anyone who wants to save some Spaces.
 
Any correction is welcome, and any question or suggestion should be raised on [[Talk:Windows Live Spaces|the article's discussion page]].


== HTTrack (graphic version) ==
I will explain the procedure for downloading one or more Spaces using the graphical version of [[HTTrack]] (WinHTTrack on Windows; on Linux it is called WebHTTrack).


I assume the reader is already familiar with WinHTTrack (or WebHTTrack), so I will only explain what needs to be configured (in the program's options panel) to download a Space from Windows Live Spaces.<ref>If you do not know how to use this program you can check [http://www.kitamuracomputers.com/tidelog/?p=615 this tutorial] (in English) or [http://www.manueldelafuente.com/2009/10/httrack-posible-solucion-la.html this one] (in Spanish)</ref>


In the "Scan Rules" section, the following lines must be added:
 +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar
 +*.7z
 +*.pdf +*.doc +*.mid +*.3gp +*.djvu +*.amr +*.mp4 +*.ogg +*.ogv +*.ogm
 +*.mov +*.mpg +*.mpeg +*.avi +*.asf +*.mp3 +*.mp2 +*.rm +*.wav +*.vob +*.qt +*.vid +*.ac3 +*.wma +*.wmv
 +*.zip +*.tar +*.tgz +*.gz +*.rar +*.z
 +*.arj +*.dar +*.lzh +*.lz +*.lza +*.arc
 +*.gif +*.jpg +*.png +*.tif +*.bmp
 -*.entry#comment
 +*.profile.live.com/Lists/*
 +*.byfiles.storage.live.com/*
 +*.photos.live.com
 +*.spaces.live.com
Lines 1 to 7 indicate which types of files are downloaded from a Space if the program finds any (these lines can be modified to suit the user). Line 8 is needed because the program otherwise tries to capture the comments of every post of a blog on Windows Live Spaces, which generates errors (in addition to wasting time while exploring a site). Lines 9 and 12 are used to capture the Spaces of the "friends" list that the Space being captured might have (these lines are optional), and lines 10 and 11 capture the files and photos<ref>I am not sure whether what is stored at *.photos.live.com will continue to exist after March 16, so I take the opportunity to save the "Photos" section of a Space (if the user has one); line 11 is therefore also optional.</ref> that the user may have uploaded there.


Finally, in the "Identity" field (in the Browser ID section), add the following User Agent:
 Googlebot/2.1 (+ http://www.googlebot.com/bot.html)
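
For those who prefer the command-line version of HTTrack, the same settings should, as far as I know, translate into something like the following (untested) invocation; the mirror directory and the Space address are only placeholders, and the remaining scan rules from the list above are appended in the same way:

 httrack "http://foobar.spaces.live.com" -O "./foobar_space" \
   -F "Googlebot/2.1 (+ http://www.googlebot.com/bot.html)" \
   "+*.gif" "+*.jpg" "+*.png" "-*.entry#comment" \
   "+*.profile.live.com/Lists/*" "+*.byfiles.storage.live.com/*" \
   "+*.photos.live.com" "+*.spaces.live.com"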


==LSSaver==
LSSaver is a freeware Windows program that saves a Windows Live Spaces blog to your local disk.<ref>Some of the descriptions in this section were taken from http://www.softsea.com/review/LSSaver.html</ref> It saves useful information such as the blog title, content, and comments, and it can also save the pictures included in the blog.
LSSaver is very simple to use; its operation is as follows:
* First, enter the Windows Live Spaces username of the blog you want to save.
* Then click the "Get" button to retrieve all blog entries.<ref>This operation may take several minutes depending on the number of entries in the blog as well as the user's connection.</ref> As each entry is retrieved, its title appears in the tree on the left side of the window. Wait until all the titles have been retrieved; you can then browse them by folding/unfolding the tree and check the ones you want to save. Once an entry is checked, its content appears on the right side of the window; check all the entries you want to save and wait until all of them have appeared.
* To save the selected entries, simply click the Save button. A file selection window will open; choose where the file will be saved, give it a name, and click Save. After a while, all the selected entries are saved into a single HTML file that you can open with a browser.
The program works as it should, but some details that differentiate it from an ordinary web site downloader must be taken into account:
* As explained above, when the program saves a blog, all the articles (and their comments) are crammed into one HTML file (which could become a problem if the blog has a lot of content).
* The images are stored under names like 000001, 000002, etc., which makes it impossible to find the original on the Internet (in the case of images from external sites linked in the blog) or to recognize the file format from the name (the sketch after this list shows one way to recover the format from the file contents).
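
Since the numeric names give no hint of the format, one workaround is to look at the first bytes of each saved file. The following is a minimal sketch (assuming the saved images sit in the current directory and keep the 000001-style names described above) that guesses the format of each file from its magic bytes:

 #!/usr/bin/perl
 # Guess the real image format of files saved under numeric names
 # (000001, 000002, ...) by looking at their first bytes.
 # Assumes the files are in the current directory and start with "0".
 use warnings;
 use strict;
 
 foreach my $file (glob("0*"))
 {
     open(my $fh, '<:raw', $file) or next;
     read($fh, my $magic, 8);
     close($fh);
     my $type = "unknown";
     $type = "jpg" if ($magic =~ /^\xFF\xD8\xFF/);
     $type = "png" if ($magic =~ /^\x89PNG/);
     $type = "gif" if ($magic =~ /^GIF8[79]a/);
     $type = "bmp" if ($magic =~ /^BM/);
     print("$file: $type\n");
 }
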
==Auguste script==
The following script, written by [[User:Auguste|Auguste]], parses a profile's friend list to find more profiles, then outputs them to a text file and removes duplicates. Unfortunately, the script was never completed because Auguste could not dedicate enough time to it due to his other occupations, but he published it [http://pastebin.com/2JjZvWuZ here] in case anyone else wants to finish it.
 #!/usr/bin/perl
 use warnings;
 use strict;
 use HTML::TokeParser;
 use WWW::Mechanize;
 
 #This is an unfinished Perl script to extract a list of Windows Live Spaces profile URLs from a given profile.
 #It's poorly written and barely working, and also gets stuck in an infinite loop during the &getMaxPages subroutine.
 #Feel free to take and modify as you please.
 
 #SpaceInvader.pl
 #Usage: SpaceInvader.pl PROFILEURL [OUTFILE]
 #  e.g. SpaceInvader.pl http://foobar.spaces.live.com foobar_output.txt
 
 my $url = "$ARGV[0]";
 
 my $outFile;
 if ($ARGV[1])
 {
     $outFile = "$ARGV[1]";
 }
 else
 {
     $outFile = "output.txt";
 }
 
 open(OUTFILE, ">$outFile");
 
 my $mech = WWW::Mechanize->new();
 my $page;
 
 &updatePage($url);
 &updatePage(&findFriendList);
 my $cid = &getCid;
 my $i;
 my $pages = 1;
 for ($i = 1; $i <= $pages; $i++)
 {
     my $currentPage = "http://cid-" . $cid . ".profile.live.com/friends/all/?page=$i";
     &updatePage($currentPage);
     &findFriends;
     $pages = &getMaxPages;
 }
 print("Ok, we're all done. $i pages total.\n");
 
 close(OUTFILE);
 
 # Fetch a page with WWW::Mechanize and prepare an HTML::TokeParser for it.
 sub updatePage()
 {
     if ($_[0])
     {
         $url = $_[0];
     }
     print("Loading page $url...\n");
     $mech->get("$url");
     $page = HTML::TokeParser->new(\$mech->{content});
 }
 
 # Find the "View friends" link on the profile page and return its URL.
 sub findFriendList()
 {
     print("Locating the friend list...\n");
     while (my $tag = $page->get_tag("a"))
     {
         if ($tag->[1]{title} and $tag->[1]{title} eq "View friends")
         {
             print("Found the friend list...\n\n");
             return $tag->[1]{href};
         }
     }
 }
 
 # Write every friend's profile URL (except the profile's own) to the output file.
 sub findFriends()
 {
     print("Producing a list of friends for CID $cid from page $i...\n");
     while (my $tag = $page->get_tag("a"))
     {
         if ($tag->[1]{id} and $tag->[1]{id} =~ /ic[\d|\w]+_frame_clip/)
         {
             unless ($tag->[1]{href} =~ /$cid/)
             {
                 my $profilePage = $tag->[1]{href};
                 print("  $profilePage\n");
                 print(OUTFILE "$profilePage\n");
             }
         }
     }
     print("Finished page $i - moving on.\n\n");
 }
 
 # Extract the CID (the identifier in cid-XXXX URLs) from the current URL.
 sub getCid()
 {
     my $temp = $url;
     $temp =~ /cid-([\w|\d]+)/;
     return $1;
 }
 
 sub getMaxPages()
 {
     $mech->get("$url");
     $page = HTML::TokeParser->new(\$mech->{content});
 
     print("Checking if there's another page...\n");
     print("$url\n");
     my $maxPages = $pages;
     # The script breaks off at this point in the article; the rest of this
     # subroutine is a guess at the intended behaviour (scan the pagination
     # links and return the highest page number found), not Auguste's code.
     while (my $tag = $page->get_tag("a"))
     {
         if ($tag->[1]{href} and $tag->[1]{href} =~ /[?&]page=(\d+)/)
         {
             $maxPages = $1 if ($1 > $maxPages);
         }
     }
     return $maxPages;
 }
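
The introduction above mentions removing duplicates, but the version of the script shown here never actually does so. As a small companion sketch (assuming the default output.txt produced by the script, one URL per line; the output_unique.txt name is just an example), the list can be deduplicated afterwards like this:

 #!/usr/bin/perl
 # Remove duplicate profile URLs from the list produced by SpaceInvader.pl.
 # Reads output.txt (the script's default) and writes output_unique.txt.
 use warnings;
 use strict;
 
 my %seen;
 open(my $in,  '<', 'output.txt')        or die "Cannot read output.txt: $!";
 open(my $out, '>', 'output_unique.txt') or die "Cannot write output_unique.txt: $!";
 while (my $line = <$in>)
 {
     chomp $line;
     next if ($line eq '' or $seen{$line}++);
     print $out "$line\n";
 }
 close($in);
 close($out);
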
==Notes==
<references/>
==External links==
*In English:
**[http://ezinearticles.com/?Windows-Live-Spaces-Officially-Closed Windows Live Spaces Officially Closed]
**[http://techie-buzz.com/tech-news/windows-live-spaces-wordpress-migration.html Windows Live Spaces To Shut Down, Move 30 Million Users To WordPress.Com]
**[http://www.liveside.net/2011/02/21/windows-live-spaces-to-close-march-16th-remember Windows Live Spaces to close March 16th, remember?]
*In Spanish:
**[http://www.danisaur.es/2010/09/30/microsoft-cierra-windows-live-spaces/ Microsoft cierra Windows Live Spaces]
**[http://grupogeek.com/2010/10/01/microsoft-cierra-windows-live-spaces-y-transfiere-a-sus-usuarios-a-wordpress/ Microsoft cierra Windows Live Spaces y transfiere a sus usuarios a WordPress]
**[http://tecnokadosh.abbaproducciones.cl/2010/10/1612 Windows Live Spaces se cierra]
**[http://solucionok.blogspot.com/2010/10/windows-live-spaces-llega-su-fin-y.html Solucion OK: Windows Live Spaces llega a su fin y continúa con WordPress.com]
**[http://mynetx.es/5275/recordatorio-windows-live-spaces-cerrara-pronto Recordatorio: Windows Live Spaces cerrará pronto]
**[http://pastehtml.com/view/1dhf1ez.html The email Windows Live sent to every user with an active Space]
