MobileMe

From Archiveteam
Revision as of 00:28, 25 March 2012 by Aggroskater (talk | contribs) (Errors)
URL: https://me.com/
Project status: Closing on June 30, 2012
Archiving status: In progress...
Project source: Unknown
Project tracker: http://memac.heroku.com/
IRC channel: #archiveteam (on EFnet)
Project lead: Unknown

Apple's MobileMe will close on June 30, 2012.

From the Wikipedia page:

MobileMe (formerly .Mac and iTools) is a subscription-based collection of online services and software offered by Apple Inc. Originally launched on January 5, 2000, as iTools, a free collection of Internet-based services for users of Mac OS 9, Apple relaunched it as .Mac on July 17, 2002, when it became a paid subscription service primarily designed for users of Mac OS X. Apple relaunched the service again as MobileMe at WWDC 2008 on July 9, 2008, now targeting Mac OS X, Windows, iPad, iPhone, and iPod Touch users.

On February 24, 2011, Apple discontinued offering MobileMe through its retail stores. The MobileMe retail boxes are also not offered through resellers anymore. Apple is also no longer accepting new subscribers for MobileMe. At the WWDC 2011, on June 6, Apple announced it will launch iCloud in the Northern Hemisphere Autumn 2011, which will replace MobileMe for new users. MobileMe itself will continue to function until June 30, 2012, at which point the service will no longer be available, although users are encouraged to migrate to iCloud before that date.

Apple.com/MobileMe shutdown notice (webcite mirror)

Apple Support - Frequently asked questions about the MobileMe transition and iCloud (webcite mirror)


How to help archiving

There is a distributed download script that gets usernames from a tracker and downloads the data.

Make sure you are on Linux and have curl, git, and a recent version of Bash. Your system must also be able to compile wget.

  • Get the code:
    git clone git://github.com/ArchiveTeam/mobileme-grab.git
  • Get and compile the latest version of wget-warc:
    ./get-wget-warc.sh
  • Think of a nickname for yourself (preferably use your IRC name).
  • Run the download script with
    ./dld-client.sh "<YOURNICK>"
  • To stop the script gracefully, run
    touch STOP
    in the script's working directory. It will finish the current task and stop.
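
The graceful stop described above can be pictured as a loop that checks for a STOP file before starting each new task. The following is only an illustrative Python sketch of that pattern, not the code of dld-client.sh; the function name run_until_stopped and the task callables are hypothetical.

```python
import os
import tempfile


def run_until_stopped(tasks, workdir):
    """Run tasks in order; stop before the next one once a STOP file exists.

    `tasks` is an iterable of zero-argument callables (hypothetical stand-ins
    for "download one user"). A task that has started always runs to the end.
    """
    stop_file = os.path.join(workdir, "STOP")
    completed = []
    for task in tasks:
        if os.path.exists(stop_file):
            break
        completed.append(task())
    return completed


with tempfile.TemporaryDirectory() as demo_dir:
    def second_task():
        # Simulate someone running `touch STOP` while this task is busy.
        open(os.path.join(demo_dir, "STOP"), "w").close()
        return "user-2"

    finished = run_until_stopped(
        [lambda: "user-1", second_task, lambda: "user-3"], demo_dir
    )
    print(finished)  # ['user-1', 'user-2']
```

The second task still completes after the STOP file appears, and the third is never started, which matches "finish the current task and stop".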

Notes

  • Compiling wget-warc will require dev packages for the various libraries that it needs. Most questions have been about gnutls; install the gnutls-devel or gnutls-dev package with your favorite package manager.
  • Downloading one user's data can take between 10 seconds and a few hours.
  • The data for one user is equally varied, from a few kB to several GB.
  • The downloaded data will be saved in the ./data/ subdirectory.
  • Download speeds from me.com are not that high. You can run multiple clients to speed things up.
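
The one path shown in the Errors section below (data/i/ik/ikk/ikkeisasaki/web.me.com) suggests that user directories are nested by the first one, two and three characters of the username. The Python sketch below reproduces that layout so you can locate a user's files; the layout is inferred from that single example, and the function name user_data_dir is hypothetical.

```python
import posixpath


def user_data_dir(username, service, base="data"):
    """Guess the download directory for one user and one service.

    Assumption: directories are nested by the first 1, 2 and 3 characters
    of the username, as in the single example path on this page.
    """
    return posixpath.join(
        base, username[:1], username[:2], username[:3], username, service
    )


print(user_data_dir("ikkeisasaki", "web.me.com"))
# data/i/ik/ikk/ikkeisasaki/web.me.com
```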

Errors

  • If you keep getting errors such as ERROR (3) when running ./dld-client.sh, skip that user for now and rerun the script; someone will retry the failed users closer to the closing date. No further information about this error is available yet, and this is the best workaround known so far.
  • ERROR (3) is a "File I/O error" from wget. I've gotten one so far, but couldn't tell you what causes it.
    - Running wget --mirror (at least 8721 files)... ERROR (3).
    Error downloading from web.me.com.
    Error downloading 'ikkeisasaki'.
  • The funny thing is that the log seems to imply wget completed just fine:
    [archive@centosdevbox web.me.com]$ pwd
    /home/archive/mobileme-grab/data/i/ik/ikk/ikkeisasaki/web.me.com
    [archive@centosdevbox web.me.com]$ tail -n 3 wget.log
    FINISHED --2012-03-23 20:00:31--
    Total wall clock time: 27m 34s
    Downloaded: 7748 files, 239M in 6m 43s (607 KB/s)
  • However, the number of downloaded files (7748) doesn't line up with the "at least 8721 files" that the script reported. --Aggroskater 20:28, 24 March 2012 (EDT)
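
To check for the same mismatch on your own downloads, you can compare the count in wget.log with the "at least N files" figure printed by the client. The Python sketch below does this for the output quoted above; both function names are hypothetical, and it assumes the two messages keep exactly the wording shown on this page.

```python
import re


def downloaded_file_count(wget_log_text):
    """Return N from the last 'Downloaded: N files' line, or None if absent."""
    matches = re.findall(r"^Downloaded: (\d+) files", wget_log_text, flags=re.MULTILINE)
    return int(matches[-1]) if matches else None


def expected_file_count(client_output):
    """Return N from the client's 'at least N files' message, or None if absent."""
    match = re.search(r"at least (\d+) files", client_output)
    return int(match.group(1)) if match else None


log_tail = """FINISHED --2012-03-23 20:00:31--
Total wall clock time: 27m 34s
Downloaded: 7748 files, 239M in 6m 43s (607 KB/s)
"""
client_line = "- Running wget --mirror (at least 8721 files)... ERROR (3)."

got = downloaded_file_count(log_tail)
want = expected_file_count(client_line)
print(got, want, want - got)  # 7748 8721 973
```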

Uploading your data

To upload the data you've downloaded, run the ./upload-finished.sh script from your script directory, for example: ./upload-finished.sh YOURNICK

Once a user is successfully uploaded, its data is moved to the data/uploaded/ subdirectory. If you need to free up disk space, you can remove anything in that directory.

It's generally safe to run the upload script while your download scripts are running; it will only upload users that are finished.

Archive status

There is a status board available here.

You can see the upload progress on archive.org.

Seesaw: a combined download/upload script

Instead of dld-client.sh, which only downloads and requires you to upload later, you can run the seesaw script. It downloads one user, uploads it to the repository, and removes it from your computer before downloading the next user.

./seesaw.sh "<YOURNICK>"

Archive directly to archive.org

  • To reduce overhead, another script has been developed that packages users and uploads them to archive.org directly via its S3 interface. Users are put in archives of at least 10 GiB (and at most 10 GiB plus the size of the last downloaded user), collected in items of 40 archives each.
  • We've solved some technical issues with the upload rate and are currently looking for help to scale up.
  • If you have at least 1.5 MiB/s of upload capacity (i.e. at least 1.5 MiB/s full duplex), this is the solution for you. A single instance of the seesaw-s3 script can use all of that bandwidth, because MobileMe and IA (only with this script) are quick enough.
    • Don't run more instances than your bandwidth supports, or the IA servers will suffer from the excessive number of connections. If you have 3 MiB/s, use two instances, and so on.
    • You'll also need at least 20-30 GiB of disk space per instance as a safety margin (it could be much more if you bump into a very big user whose size is added to the 10 GiB limit).
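
The sizing rules above can be worked through with a little arithmetic. The Python sketch below is not the seesaw-s3 script. It only illustrates the stated rules: one instance per 1.5 MiB/s of upload bandwidth, archives closed once they reach 10 GiB, and 40 archives per item. All function names are hypothetical, and the user sizes in the example are invented.

```python
MIB_PER_INSTANCE = 1.5   # upload bandwidth one instance can use (from the text)
ARCHIVE_MIN_GIB = 10     # an archive is closed once it reaches this size
ARCHIVES_PER_ITEM = 40   # archives collected in one archive.org item


def max_instances(upload_mib_per_s):
    """Number of instances the given upload bandwidth supports."""
    return int(upload_mib_per_s // MIB_PER_INSTANCE)


def pack_users(user_sizes_gib, archive_min_gib=ARCHIVE_MIN_GIB):
    """Group (username, size_gib) pairs into archives of at least archive_min_gib.

    Each archive therefore ends up below archive_min_gib plus the size of its
    last user. The final archive may be smaller if the input runs out.
    """
    archives, current, current_size = [], [], 0.0
    for name, size in user_sizes_gib:
        current.append(name)
        current_size += size
        if current_size >= archive_min_gib:
            archives.append(current)
            current, current_size = [], 0.0
    if current:
        archives.append(current)
    return archives


def group_into_items(archives, per_item=ARCHIVES_PER_ITEM):
    """Split the list of archives into items of per_item archives each."""
    return [archives[i:i + per_item] for i in range(0, len(archives), per_item)]


print(max_instances(3))  # 2
print(pack_users([("a", 4), ("b", 7), ("c", 12), ("d", 1)]))
# [['a', 'b'], ['c'], ['d']]
```

In the example, user "c" alone exceeds the 10 GiB threshold, which is why disk usage per instance can go well beyond the nominal archive size.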

Ask alard on our IRC channel if you want a copy of the script and want to start archiving faster than ever!

Site structure

(Copied from Wikipedia) There are public subdomain access points to each MobileMe member's individual account functions. These provide direct public web access to each MobileMe user's account, via links to each function directly: Gallery, Public folder, published website, and published calendars (not available currently). See list:

  • http://www.me.com – member login.
  • http://gallery.me.com/<username> – member public photo/video Gallery.
  • http://public.me.com/<username> – member Public folder access.
  • http://web.me.com/<username> – member Website access.
  • http://ical.me.com/<username>/<calendar name> – member individual calendar publishing. In the older system, many calendars could be published at the same time. In the current iteration of MobileMe, there is no calendar publishing available.
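
As a small illustration of the list above, the following Python sketch builds the three per-user entry points for a given username. The function name public_urls is hypothetical, and percent-encoding the username is a precaution added here, not something the site is known to require.

```python
from urllib.parse import quote

# URL patterns taken from the list above; calendars need a calendar name
# as well and are left out here.
PUBLIC_URLS = {
    "gallery": "http://gallery.me.com/{username}",
    "public": "http://public.me.com/{username}",
    "web": "http://web.me.com/{username}",
}


def public_urls(username):
    """Return a dict mapping each service to the user's public URL."""
    safe_name = quote(username, safe="")
    return {
        service: template.format(username=safe_name)
        for service, template in PUBLIC_URLS.items()
    }


print(public_urls("rightangles")["web"])  # http://web.me.com/rightangles
```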

web.me.com and web.mac.com

The domains web.me.com and web.mac.com point to the same web pages.

Interesting (large) examples

  • web.me.com: rightangles
  • homepage.mac.com: russconte
  • gallery.me.com: aaaashy
  • public.me.com: morkjturner

Tools/Archiving

There's a repository on the ArchiveTeam GitHub: https://github.com/ArchiveTeam/mobileme-grab

The combined tool for downloading all content for one user is dld-user.sh. It needs a WARC-enabled wget to run.

homepage.mac.com

This is a separate site from web.me.com (older, probably). Almost all of the sites on this domain can be downloaded with wget --mirror.

A script is available in the git repository. You need a WARC-enabled wget to run this.

web.me.com

web.me.com will give you a list of the files in a user's directory. We can use this list of URLs to download the complete site, so wget --mirror is not necessary.

A script is available in the git repository. You need a WARC-enabled wget to run this.

Download procedure:

  • Download http://web.me.com/<username>/?webdav-method=truthget&depth=infinity
  • Parse the WebDAV response to find the url of each file and download them.

public.me.com

The files on public.me.com are accessible via WebDAV: https://public.me.com/ix/<username>

A script is available in the git repository.

Download procedure:

  • Send a PROPFIND request with Depth: infinity to https://public.me.com/ix/<username>. This will return the complete, recursive file list.
  • Parse the WebDAV response to find the href of each file and download them.
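
The two steps above can be sketched in Python with the standard library. This is not the script from the repository. It assumes the server answers with a standard WebDAV multistatus document (elements in the DAV: namespace) and that collections are marked with a collection element inside resourcetype. The function names and the sample response are invented for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # ElementTree notation for the DAV: namespace


def build_propfind(username):
    """Build (but do not send) the PROPFIND request for a user's Public folder."""
    url = "https://public.me.com/ix/" + username
    return urllib.request.Request(url, method="PROPFIND", headers={"Depth": "infinity"})


def file_hrefs(multistatus_xml):
    """Return the href of every non-collection response in a multistatus body."""
    root = ET.fromstring(multistatus_xml)
    hrefs = []
    for response in root.iter(DAV + "response"):
        href = response.findtext(DAV + "href")
        is_collection = response.find(".//" + DAV + "collection") is not None
        if href and not is_collection:
            hrefs.append(href)
    return hrefs


# Invented sample response, shaped like a generic WebDAV multistatus body.
sample = """<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/ix/exampleuser/</D:href>
    <D:propstat><D:prop><D:resourcetype><D:collection/></D:resourcetype></D:prop></D:propstat>
  </D:response>
  <D:response>
    <D:href>/ix/exampleuser/notes.txt</D:href>
    <D:propstat><D:prop><D:resourcetype/></D:prop></D:propstat>
  </D:response>
</D:multistatus>"""

request = build_propfind("exampleuser")
print(request.get_method(), request.full_url)
print(file_hrefs(sample))  # ['/ix/exampleuser/notes.txt']
```

To send the request, pass it to urllib.request.urlopen and feed the response body to file_hrefs.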

gallery.me.com

If you ask nicely, gallery.me.com will give you a zip file of the entire gallery contents.

A script is available in the git repository.

Download procedure:

  • Send a GET to http://gallery.me.com/<username>?webdav-method=truthget&feedfmt=json&depth=Infinity. This will give you a JSON file that contains details about all albums and all photos/videos in the gallery. (Example)
  • Search through this file for the properties largeImageUrl and videoUrl, which contain the urls for the largest versions of images and videos that are available.
  • Use these URLs to construct a ziplist description:
    <?xml version="1.0" encoding="utf-8" ?>
    <ziplist xmlns="http://user.mac.com/properties/">
      <entry>
        <name><!-- the target path of the file in the zip file --></name>
        <href><!-- the url of the image (the largeImageUrl or videoUrl) --></href>
      </entry>
      ...
    </ziplist>
    
  • Send this document with a POST request with a Content-Type: text/xml; charset="utf-8" header to http://gallery.me.com/<username>?webdav-method=ZIPLIST
  • The server will now generate a zip file for you, containing the files specified in the ziplist document. This may take a short while, but eventually the request will give you a response with an X-Zip-Token header.
  • Use the zip token to download the zip file: http://gallery.me.com/<username>?webdav-method=ZIPGET&token=<ziptoken>.

Note: I found that with very large galleries the ZIP request fails. Therefore, it's better to make one zip file per album. The Python script does that.
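
The ziplist steps above can be sketched as follows. This is an illustration, not the repository's Python script. The layout of the gallery JSON is not described on this page, so the sketch simply searches the whole parsed structure for largeImageUrl and videoUrl. Using the last path component of each URL as the name inside the zip is an assumption, as are the function names and the sample data.

```python
import json
import posixpath
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

ZIPLIST_NS = "http://user.mac.com/properties/"
MEDIA_KEYS = ("largeImageUrl", "videoUrl")


def find_media_urls(node):
    """Recursively collect every largeImageUrl / videoUrl value in parsed JSON."""
    urls = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key in MEDIA_KEYS and isinstance(value, str):
                urls.append(value)
            else:
                urls.extend(find_media_urls(value))
    elif isinstance(node, list):
        for item in node:
            urls.extend(find_media_urls(item))
    return urls


def build_ziplist(urls):
    """Build the ziplist XML document for the given media URLs."""
    root = ET.Element("ziplist", {"xmlns": ZIPLIST_NS})
    for url in urls:
        entry = ET.SubElement(root, "entry")
        # Assumption: name each file after the last component of its URL path.
        # Files with identical names would collide and need distinct paths.
        ET.SubElement(entry, "name").text = posixpath.basename(urlsplit(url).path)
        ET.SubElement(entry, "href").text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8" ?>\n' + body


def build_ziplist_request(username, ziplist_xml):
    """Build (but do not send) the POST request that asks for the zip file."""
    url = "http://gallery.me.com/%s?webdav-method=ZIPLIST" % username
    return urllib.request.Request(
        url,
        data=ziplist_xml.encode("utf-8"),
        method="POST",
        headers={"Content-Type": 'text/xml; charset="utf-8"'},
    )


# Invented sample data; the real JSON layout may differ.
sample = json.loads("""
{"records": [
  {"title": "Beach", "largeImageUrl": "http://gallery.me.com/exampleuser/100001/IMG_0001.jpg"},
  {"title": "Clip", "videoUrl": "http://gallery.me.com/exampleuser/100001/MOV_0002.mov"}
]}
""")
print(build_ziplist(find_media_urls(sample)))
```

To follow the one-zip-per-album advice, call find_media_urls on the part of the JSON that belongs to a single album and build one ziplist per album.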

ical.me.com

To download a calendar you need the username and the name of the calendar. (There seems to be no way to list all calendars of a specific user.) Once you have these two names, you can download the ics file using one of these urls:

  • http://ical.mac.com/WebObjects/iCal.woa/wa/Download/<calendarname>.ics?u=<username>&n=<calendarname>.ics
  • http://homepage.mac.com/<username>/.calendars/<calendarname>.ics
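
Given a username and a calendar name, the two candidate URLs above can be generated as in this Python sketch. The function name calendar_urls is hypothetical, and percent-encoding the names (for example for calendar names containing spaces) is an assumption about what the servers expect.

```python
from urllib.parse import quote


def calendar_urls(username, calendar_name):
    """Return both candidate download URLs for one published calendar."""
    user = quote(username, safe="")
    ics = quote(calendar_name, safe="") + ".ics"
    return [
        "http://ical.mac.com/WebObjects/iCal.woa/wa/Download/%s?u=%s&n=%s" % (ics, user, ics),
        "http://homepage.mac.com/%s/.calendars/%s" % (user, ics),
    ]


for url in calendar_urls("exampleuser", "Work"):
    print(url)
# http://ical.mac.com/WebObjects/iCal.woa/wa/Download/Work.ics?u=exampleuser&n=Work.ics
# http://homepage.mac.com/exampleuser/.calendars/Work.ics
```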

iDisk

Some of the sites on homepage.mac.com have a section called 'iDisk Public Folder'. You can see the list of files, but can't actually download them. Our current hypothesis is that the files listed in the iDisk Public Folder are also available through public.me.com, so downloading those would be sufficient to get all of the public iDisk content (compare https://public.me.com/ardeshir and http://homepage.mac.com/ardeshir/FileSharing8.html).