AOL

From Archiveteam
Revision as of 22:17, 2 December 2014 by Chfoo (talk | contribs) (→‎Links: add playnet/q-link sources. organize)
Status: Online! on January 28, 2013
Archiving status: In progress... by godane
Archiving type: Unknown
IRC channel: #aohell (on hackint)

This is about archiving the original AOL, not AOL's current website. The AOL system is currently in major disrepair. It is as if the machines were left sitting in the datacenter and nothing gets fixed as they die. Much of the infrastructure is broken.

Getting Started

You'll need to sign up for a user account here: https://new.aol.com. Not every field is required; phone definitely isn't. Make sure there are no special characters in the username or password, and that both are short (8 characters or fewer).

Client software

If you're running the software through Wine, your choices are very simple: AOL 4.0 or AOL 5.0. Everything else is in various states of not-workingness.

If you're not running through Wine, you can use any version you want, but there are good reasons for not using AOL 6 or higher:

  • Fairly rich legacy of third-party tools up to 5.0
  • Documentation is probably applicable to every later version, but later versions offer fewer ways (or none) to make good use of it

Why you might want to use a later version:

  • More things on the site will certainly work (there may be less stuff accessible, but more of it should work because it's actively supported by AOL)

AOL has copies of the 9.x series on their site, and Oldversion has copies of all the other versions.

Setup

In general (these instructions assume you create the account beforehand on AOL's site):

Run aol50.exe (or your chosen version's installer)
Choose Current member, and then add my account to this computer.
When the installer finishes, and AOL launches, it will bring up the AOL setup screen.
You'll want Expert Setup, so you can tell AOL right away that you want to use your TCP/IP connection to connect to AOL (instead of dial-up).
Once you select TCP/IP connection, AOL asks if you want to sign on right away. Unless you have proxy settings or can't use the defaults, just hit Next.
If you need to change any of the settings before connecting, read the message box carefully- it explains where to go to change the settings.
You want to pick You already have a screen name and password - fill in the account name and password you created earlier on AOL's site.
Press Next, and you are done.

Wine notes:

The biggest difference is that you can't create an account from inside AOL 5.0 or earlier, so you must create the account beforehand.

Protocol

Definitions

P3

Communication protocol used to communicate over lossy channels (but can be used over TCP).

There are two types of packet:

  1. the old one with a plain CRC
  2. the new one, in which the CRC is encoded redundantly.

It consists of packets containing a magic start byte, a CRC, a sequence number, a packet type, data, and a magic stop byte.

Form Definition Operator (FDO)

Form display convention or protocol. It consists of a Token and an Atom Stream. It goes into the data portion of a P3 packet. It's like if someone mixed a scripting language, a state machine, X11 display protocol, tree structure, database, and RPC protocol into one big ugly mess.

Token

A Token is a 2 byte value used to dispatch handlers to handle the Atom Stream.

Atom Stream

An Atom Stream consists of an Atom Stream ID and Atoms. Atom Streams are serialized (assembled) and unserialized (disassembled).

Atom

An Atom consists of a 2 byte Atom ID. The Atom ID is processed as two values. The first byte describes the category of the Atom and the second byte describes a specific command or Turing machine operation. Next is zero or more arguments of various types.

Care is needed when assembling or disassembling Atom Streams, because a framing error will cause the rest of the Atom Stream to read out as garbage.
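As a rough illustration of the Atom ID layout described above, the following Python sketch splits a 2-byte Atom ID into its category and command values. The names `AtomId` and `split_atom_id` are made up for this example, the sample bytes do not correspond to any documented Atom, and argument decoding is not attempted because the argument formats are not described here.

```python
from typing import NamedTuple


class AtomId(NamedTuple):
    category: int  # first byte: the category of the Atom
    command: int   # second byte: the specific command within that category


def split_atom_id(raw: bytes) -> AtomId:
    """Split the 2-byte Atom ID at the start of `raw` into its two values."""
    if len(raw) < 2:
        raise ValueError("an Atom ID needs 2 bytes")
    return AtomId(category=raw[0], command=raw[1])


# Made-up bytes for illustration only, not a documented Atom.
atom = split_atom_id(bytes([0x01, 0x0A]))
print(atom.category, atom.command)
```

Reading even one byte too many or too few for an Atom's arguments shifts every later Atom ID, which is why a framing error turns the rest of the stream into garbage.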

Star Tool

The Star Tool consists of additional blobs that are appended or patched into an existing install of an AOL client. When installed, it appears as a * in the application's menu bar. Things such as "Invoke Database Record" are located in this menu.

Atomic Debugger

The Atomic Debugger disassembles the Atom Streams as they pass through the client.

Remote Area Information Manager (Rainman)

Protocol for displaying information (Pages) in a window.

Visual Publisher

Designs Pages to create AMP files.

Database Form ID

A 32-bit unsigned integer, written as two unsigned 16-bit integers in decimal format, used to retrieve forms. For example: the ID "123-4567" is 8065495 in base 10 or 0x007B11D7.
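The conversion in that example can be sketched in Python as follows. The function names are made up for this illustration, and the assumption that the first number is the high 16 bits is inferred only from the "123-4567" example.

```python
def form_id_to_int(form_id: str) -> int:
    """Convert a form ID such as "123-4567" to its 32-bit integer value."""
    high, low = (int(part) for part in form_id.split("-"))
    for half in (high, low):
        if not 0 <= half <= 0xFFFF:
            raise ValueError("each half must fit in 16 bits")
    # Assumption: the first number is the high half, as in the example above.
    return (high << 16) | low


def int_to_form_id(value: int) -> str:
    """Convert a 32-bit integer back to the "high-low" decimal notation."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    return f"{value >> 16}-{value & 0xFFFF}"


value = form_id_to_int("123-4567")
print(value, f"{value:#010x}", int_to_form_id(value))
```

Here 123 is 0x007B and 4567 is 0x11D7, so the combined value is 123 × 65536 + 4567 = 8065495.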

Links

P3/FDO/Tokens/Atoms

Lots of sources: Documents covering how to make AOL forms and various such things:

Samples of custom forms:

Some FDO lessons:

About the class names:

Here is an early version of aol-files.com:

Atoms list:

More internal docs:

General

Penggy

PlayNet

Reverse Engineering

The trunk version of Wireshark includes a dissector for the AOL protocol that breaks out the basic header information, such as the packet type and the token. It doesn't go into any detail about the contents of the packet, but it is a good start. It isn't available for download yet, so you'll have to build it yourself from the svn trunk; once built, Wireshark reports itself as 1.9.0.

http://db48x.net/temp/Screenshot%20-%2001292013%20-%2008:28:31%20PM.png

Packet Types

INIT (x’23’)
Client sends this to the server to begin communication.
ACK (x’24’)
Acknowledge a packet as received, for instance an INIT or heartbeat.
SS (x’21’)
An SS requests the other end of the connection to send an SSR.
SSR (x’22’)
An SSR is a response to an SS.
NAK (x’25’)
Negative acknowledgement of a packet, when the packet was received incorrectly.
DATA (x’20’)
A packet containing data, identified by a token.
HEARTBEAT (x’26’)
The other side suspects that the line has dropped; respond with an ACK
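
For scripting purposes, the list above can be written down as a small Python enum. This is only a sketch: it assumes the hex values listed are the raw value of the packet type field, and the names `PacketType`, `EXPECTED_REPLY` and `describe` are invented for this example.

```python
from enum import IntEnum


class PacketType(IntEnum):
    DATA = 0x20
    SS = 0x21
    SSR = 0x22
    INIT = 0x23
    ACK = 0x24
    NAK = 0x25
    HEARTBEAT = 0x26


# Replies that the descriptions above say a peer should send.
EXPECTED_REPLY = {
    PacketType.INIT: PacketType.ACK,
    PacketType.HEARTBEAT: PacketType.ACK,
    PacketType.SS: PacketType.SSR,
}


def describe(type_value: int) -> str:
    """Return the packet type name, or a placeholder for unlisted values."""
    try:
        return PacketType(type_value).name
    except ValueError:
        return f"UNKNOWN(0x{type_value:02x})"


print(describe(0x23), "->", EXPECTED_REPLY[PacketType.INIT].name)
```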

Tokens

Each packet has a token that determines what is in the data field of the packet. Documentation for these tokens is very sparse; it's likely that AOL never had a comprehensive document listing all of them. Instead, the documentation merely tells the reader to view the list of tokens while logged into the server.

Downloading a file

  1. ← mD – client requests a file (by id?)
  2. → uJ – unicode file name
  3. → tf – start of a download; includes file name (non-unicode?); requests immediate xG ack
  4. ← xG – client acks download
  5. → FF – packet containing file data, no ack requested
  6. → F7 – packet containing file data, no ack requested
  7. ← xG – periodic acks
  8. → F9 – packet containing file data, last in sequence
  9. ← eX – mail download complete (unrelated?)
[21:24:10] <db48x> there's a packet coming from the server with a token tf
[21:24:16] <db48x> the data has a filename in it
[21:24:59] <db48x> the data is in a series of packets with token FF and F7 (no explanation of the difference is available)
[21:25:24] <balrog_> but like when you view a file library 
[21:25:34] <balrog_> how does it tell the server which library to display?
[21:25:36] <db48x> the last packet of the file has token F9
[21:25:42] <db48x> haven't figured that out yet
[21:25:56] <balrog_> ah
[21:26:01] <db48x> before this file in the capture there are packets with tokens EB and uJ going from the client to the server
[21:26:03] <balrog_> none of the documentation covers this?
[21:26:09] <balrog_> aaah
[21:26:44] <db48x> and mD
[21:26:51] <db48x> and tokens AT and tD coming back
[21:29:29] <db48x> looks like the tD coming back has the metadata in it
[21:30:50] <balrog_> http://sicexcels.tripod.com/~SicExcels/rm-vpd_info/TokenTypes_Basic.txt
[21:31:12] <balrog_> http://sicexcels.tripod.com/~SicExcels/rm-vpd_info/TokenTypes_Plus.txt
[21:31:16] <balrog_> quite incomplete 
[21:33:26] <db48x> mD = download now, then
[21:34:31] <db48x> and an mF, file description
[21:34:41] <db48x> followed by an AT with a bunch of data
[21:35:35] <db48x> looks like labels for buttons like 'download now', 'download later', 'ask the staff', 'related files'
[21:35:56] <db48x> packet 538
[21:37:19] <db48x> continues in the next AT packet, 540, which looks like it has the description in it
[21:37:29] <db48x> talks about using ShrinkIt to unpack the file
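
The token sequence observed above suggests how a captured download might be stitched back together. The Python sketch below is speculative: it assumes the packets have already been reduced to (token, payload) pairs, and it assumes the payload of an FF, F7 or F9 packet is nothing but file data, which has not been verified. The function name is made up for this example.

```python
from typing import Iterable, List, Optional, Tuple

FILE_START = "tf"          # start of a download, carries the file name
FILE_DATA = {"FF", "F7"}   # file data; the difference between them is unknown
FILE_END = "F9"            # last data packet in the sequence


def reassemble_download(packets: Iterable[Tuple[str, bytes]]) -> Optional[bytes]:
    """Join the data packets of the first complete download in `packets`.

    Returns None if no tf ... F9 sequence is found. Other tokens
    (xG acks, uJ, eX and so on) are skipped.
    """
    chunks: Optional[List[bytes]] = None
    for token, payload in packets:
        if token == FILE_START:
            chunks = []                    # a new download begins
        elif chunks is not None and token in FILE_DATA:
            chunks.append(payload)         # assumed to be raw file bytes
        elif chunks is not None and token == FILE_END:
            chunks.append(payload)
            return b"".join(chunks)
    return None


# Made-up payloads for illustration only.
capture = [("tf", b"NAME"), ("FF", b"ab"), ("xG", b""), ("F7", b"cd"), ("F9", b"ef")]
print(reassemble_download(capture))
```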

Retransmissions

In normal transmission, messages are being passed in both directions. Each message sent carries the number of the last message correctly received, which is an implicit acknowledgement of all messages up to and including that one. When a message is received correctly, it is passed up to the application level. Then the response number of the message is examined. If it acknowledges any messages currently in the buffer, they are dropped from the buffer. If the receiver of the message has received a certain number of messages without acknowledging, it will send an ACK to keep the sender’s window from closing. (A window is closed when the buffer of sent messages is full, preventing any more transmissions.)

If a single message gets mangled, the receiver will get a bad checksum, and send a NAK (assuming its window is open) requesting re-transmission of all messages starting at the mangled one. It will then ignore out of sequence messages until it gets the mangled message correctly. If its window is closed, and there is no NAK queued, it will queue the NAK for transmission when the window opens. If there is a NAK queued already, it will ignore the new one.

The same NAK logic would apply to messages received out of sequence, assuming that a NAK had not already been sent.

In all cases where a numbered message is sent, the window is checked. If it is closed, an SS is sent to try to re-open the window. When an SS has been sent and no SSR has been received, all NAKs are accepted but ignored instead of being processed.

When an SSR is received, any messages that were not received are queued for transmission. When there is a message to send and the window is open, it is sent and put into the buffer. If the window is closed, the message is queued for transmission. This queue is separate from the NAK queue.
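
The sender side of this scheme can be sketched in Python as below. It is a simplified model, not the real implementation: the window size of 8 is arbitrary, sequence numbers are assumed to count up without wrapping, and the SS/SSR exchange and the NAK queue are left out. The class and method names are invented for this example.

```python
from collections import deque
from typing import Deque, List, Tuple

Numbered = Tuple[int, bytes]  # (sequence number, message)


class SendWindow:
    """Sender-side buffer with cumulative (implicit) acknowledgements."""

    def __init__(self, size: int = 8) -> None:
        self.size = size                         # assumed window size
        self.next_seq = 0                        # assumed: never wraps around
        self.unacked: Deque[Numbered] = deque()  # sent, not yet acknowledged
        self.pending: Deque[bytes] = deque()     # waiting for the window to open

    @property
    def closed(self) -> bool:
        # The window is closed when the buffer of sent messages is full.
        return len(self.unacked) >= self.size

    def send(self, message: bytes) -> List[Numbered]:
        """Queue a message; return whatever can be transmitted right now."""
        self.pending.append(message)
        return self._flush()

    def acknowledge(self, last_received: int) -> List[Numbered]:
        """Drop every buffered message up to and including `last_received`."""
        while self.unacked and self.unacked[0][0] <= last_received:
            self.unacked.popleft()
        return self._flush()

    def retransmit_from(self, seq: int) -> List[Numbered]:
        """On a NAK, resend all buffered messages starting at `seq`."""
        return [item for item in self.unacked if item[0] >= seq]

    def _flush(self) -> List[Numbered]:
        sent = []
        while self.pending and not self.closed:
            item = (self.next_seq, self.pending.popleft())
            self.next_seq += 1
            self.unacked.append(item)
            sent.append(item)
        return sent
```

With a window of 2, a third `send` returns nothing until `acknowledge` frees a slot, at which point the queued message goes out.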

URLs

From http://www.applefritter.com/aol: The URL for the Apple II New Files library is aol://4400:8287, and the URL for the UnForkIt file contained in that library is aol://4401:8287:636250. The first value identifies the resource type, in this case either 4400 for a library or 4401 for a file. The second number, 8287, is the library ID, and 636250 is the file ID. The file IDs are not consecutive within libraries.

aol://nnnn

  • 1722: Keywords
  • 2719: Chatrooms (Private room through keyword: aol://2719:2-2-room name)
  • 3548: User profiles
  • 4344: Interactive page
  • 4400: File libraries
  • 4401: Files
  • 586x: ???
  • 9293: IM: aol://9293:[sn] (from http://justinakapaste.com/category/aolaim-tutorials/)

Examples

  • aol://4344:1264.a2main.10029531.514525857
  • aol://4400:8287
  • aol://4344:1264.a2abt.10037404
  • aol://4344:117.mtv.591130
  • aol://4344:226.llll.2755674.520114429 (Access code: 3675)
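
A minimal Python sketch for taking these URLs apart follows. The function names and the `RESOURCE_TYPES` table are made up for this example; the table only repeats the types listed above. The part after the resource type is kept as a string because its structure differs per type (colons for files, dots for interactive pages).

```python
from typing import Tuple

RESOURCE_TYPES = {
    "1722": "keyword",
    "2719": "chatroom",
    "3548": "user profile",
    "4344": "interactive page",
    "4400": "file library",
    "4401": "file",
    "9293": "instant message",
}


def parse_aol_url(url: str) -> Tuple[str, str, str]:
    """Return (resource type, description, remainder) for an aol:// URL."""
    prefix = "aol://"
    if not url.startswith(prefix):
        raise ValueError("not an aol:// URL")
    resource, _, rest = url[len(prefix):].partition(":")
    return resource, RESOURCE_TYPES.get(resource, "unknown"), rest


def parse_file_url(url: str) -> Tuple[int, int]:
    """Return (library ID, file ID) for an aol://4401:library:file URL."""
    resource, _, rest = parse_aol_url(url)
    if resource != "4401":
        raise ValueError("not a file URL")
    library_id, file_id = (int(part) for part in rest.split(":"))
    return library_id, file_id


print(parse_aol_url("aol://4400:8287"))
print(parse_file_url("aol://4401:8287:636250"))
```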

Sources

List of aol:// URLs. See Links section above for HTTP links about AOL.

Structure

        <balrog_> yes, but aol://4344:nnnn doesn't work without the extra
[19:52] <balrog_> aol://4344:1264.a2main.10029531 also works
        <balrog_> simply aol://4344:1264.a2main does not work.
[20:17] <DrainLbry> so to summarize we've got aol://4400:ID (from
                    spreadsheet), for file libraries, and
                    aol://4344:uniqueidentifier for interactive content
[20:18] <balrog_> aol://4344:uniqueidentifier:ID
        <balrog_> as per
                  http://web.archive.org/web/20060207004722/http://daol.aol.com/aolatoz
                  keywords used to be aol://1722:keyword
        <balrog_> but that's no longer working

Software

<raylee> given #aohell seems dead,
<raylee> i'll just say it here
<raylee> I just found startools aol / master aol / the debugging tools for AOL
<raylee> http://www.aciddr0p.net/aolunorgd/
<raylee> maol*.zip
<raylee> the master.tol / master.aol file goes into the tools dir.. then dbinvokes work PERFECTLY...
  • Regarding archival of file libraries: I (slipstream/raylee) made an AutoIt script to drive the AOL client (it only works perfectly on 9.7) to get everything (metadata and files). Here's the script. Updated 7-Sep-2014. The script only fails on connection loss or an AOL client crash. (It doesn't work on 7.0 and below: I already tried that, and random lag basically kills the script. If you've used these old clients, you'll know what I'm talking about.)

Goals

save forums/files/etc

AOL has a large number of forums on every topic, file libraries containing art, shareware, game mods, etc, etc. These should be fairly easy to enumerate, and once found it should be fairly easy to download all of the forum messages and files. Archives of these would be worth saving.

save everything

Every window that you can click on in AOL was created by a 'producer' at AOL. Many of them are essentially identical, file libraries for instance, but many are also one-offs created for a specific purpose. We ought to save these as well. Going this route will take a more thorough understanding of both the AOL protocol and the FDO scripts.

Plans

There are several ways to go about this, with tradeoffs that are only lightly explored.

custom scraper

Write a scraper in Python that understands the AOL protocol and FDO scripts, and writes everything to WARC files. WARC files save us much of the trouble of figuring out how to organize everything on disk. They also make it much easier to create a server that can pretend to be the AOL server, or that can translate into HTTP/HTML to allow anyone with a web browser to see what AOL was like.

wget-aol

Modify wget to support the AOL protocol. Very ambitious, but it would let us reuse wget's infrastructure, which may make the task easier; we'd be able to concentrate on just implementing the protocol and FDO parsing and leave the rest to wget. Would that reuse save us time, or would dealing with wget's internals drive us mad? Hard to say. This method would also allow us to create warc files.

script the client

Drive the real AOL client, perhaps with debugging tools installed, in order to capture both the FDO sources and screenshots of the rendering. Probably more fragile, but we wouldn't have to understand the actual protocol. Wouldn't be able to create warc files.

Complete client clone

An attempt to write a client library, client interface, and client recording suite is located at https://github.com/chfoo/notaol/. It's far from complete; it is currently stuck on Atom serialization and unserialization.

Archives

Archives are being uploaded to IA by godane: https://archive.org/search.php?query=creator%3A%22AOL+Files%22