Difference between revisions of "Yahoo! Groups"
(Fix typo) |
Switchnode (talk | contribs) (add history endpoint, clean up formatting a bit) |
||
Line 60: | Line 60: | ||
===Groups=== | ===Groups=== | ||
* https://groups.yahoo.com/api/v1/search/groups (search) | * https://groups.yahoo.com/api/v1/search/groups (search) | ||
:- Known params: maxHits, offset, query, sortBy ( | :- Known params: maxHits, offset, query, sortBy (values: OLDEST, RELEVANCE, MEMBERS, LATEST_ACTIVITY, NEWEST) | ||
* https://groups.yahoo.com/api/v1/dir/categories/0/ (list of subcategories and discoverable groups under the root) | * https://groups.yahoo.com/api/v1/dir/categories/0/ (list of subcategories and discoverable groups under the root) | ||
:- Known params: start | :- Known params: start | ||
:- Pagination: | :- Pagination: Page size is 10. Does ''not'' have a count param. May be limited to 500 total results regardless of start value. start is the result index, not the group id. | ||
: Groups in subcategories can be listed by swapping '0' for the subcategory id (the full idList is not required). There is a /1/ with a small number of groups. | : Groups in subcategories can be listed by swapping '0' for the subcategory id (the full "idList" value is not required). There is a /1/ with a small number of groups. | ||
* https://groups.yahoo.com/api/v1/groups/concatenative/ (specific group information) | * https://groups.yahoo.com/api/v1/groups/concatenative/ (specific group information) | ||
===Messages=== | ===Messages=== | ||
* https://groups.yahoo.com/api/v1/groups/concatenative/history (calendar summary) | |||
:- Known params: ts, tz, chrome | |||
* https://groups.yahoo.com/api/v1/groups/concatenative/messages (list) | * https://groups.yahoo.com/api/v1/groups/concatenative/messages (list) | ||
:- Known params: count, start | :- Known params: count, start, sortOrder (ASC, DESC), direction (1, -1) | ||
:- Pagination: | :- Pagination: Page size defaults to 10, with no known limit. No known limit on total results. start is the message id, not the result index. sortOrder adjusts the order of results in the json response's array, whereas direction determines which way to iterate through ids from start. | ||
* https://groups.yahoo.com/api/v1/groups/concatenative/messages/1/ (specific message) | * https://groups.yahoo.com/api/v1/groups/concatenative/messages/1/ (specific message) | ||
Line 80: | Line 83: | ||
===Topics=== | ===Topics=== | ||
* https://groups.yahoo.com/api/v1/groups/concatenative/topics (list) | * https://groups.yahoo.com/api/v1/groups/concatenative/topics (list) | ||
:- Known params: count | :- Known params: count | ||
* https://groups.yahoo.com/api/v1/groups/concatenative/topics/1 (specific topic) | * https://groups.yahoo.com/api/v1/groups/concatenative/topics/1 (specific topic) | ||
===Attachments=== | ===Attachments=== | ||
* https://groups.yahoo.com/api/v1/groups/a_furrys_world/attachments (list) | * https://groups.yahoo.com/api/v1/groups/a_furrys_world/attachments (list) | ||
:- Known params: count, start, sort ( | :- Known params: count, start, sort (TITLE, TIME), order (ASC, DESC) | ||
:- Pagination: | :- Pagination: Page size defaults to 20, with no known limit (maximum tested: 93). | ||
* https://groups.yahoo.com/api/v1/groups/<groupname>/attachments/<attachmentId> (specific attachment) | * https://groups.yahoo.com/api/v1/groups/<groupname>/attachments/<attachmentId> (specific attachment) | ||
Line 94: | Line 99: | ||
===Files=== | ===Files=== | ||
* https://groups.yahoo.com/api/v2/groups/a_furrys_world/files (list) | * https://groups.yahoo.com/api/v2/groups/a_furrys_world/files (list) | ||
:- Known params: sfpath (pass in a pathURI to retrieve the file listings of this subdirectory) | :- Known params: sfpath (pass in a pathURI to retrieve the file listings of this subdirectory) | ||
:- | :- Pagination: None. | ||
: Entries with "type" 0 are files; 1, directories. | |||
===Photos=== | ===Photos=== | ||
* https://groups.yahoo.com/api/v3/groups/a_furrys_world/photos (list of photos) | * https://groups.yahoo.com/api/v3/groups/a_furrys_world/photos (list of photos) | ||
:- | :- Known params: count, start, orderBy (MTIME), sortOrder (ASC, DESC), ownedByMe (TRUE, FALSE), lastFetchTime, photoFilter (ALL, PHOTOS_WITH_EXIF "Originals", PHOTOS_WITHOUT_EXIF "Shared") | ||
:- | :- Pagination: Page size defaults to 20, with no known limit. | ||
: | : "totalPhotos" field in response gives total in group. | ||
* https://groups.yahoo.com/api/v3/groups/a_furrys_world/albums (list of albums) | * https://groups.yahoo.com/api/v3/groups/a_furrys_world/albums (list of albums) | ||
:- | :- Known params: count, start, albumType (PHOTOMATIC, NORMAL), orderBy (MTIME, TITLE), sortOrder (ASC, DESC) | ||
:- Pagination: Page size defaults to 12, with no known limit. | |||
:- | : albumType defaults to NORMAL. PHOTOMATIC albumType requires the "READ" permission for "ATTACHMENTS". "total" field in response gives total number of albums of the selected type in group; however, this seems to have an off-by-one error for the NORMAL type of albums. | ||
: | |||
* https://groups.yahoo.com/api/v3/groups/a_furrys_world/albums/1841906391 (specific album) | * https://groups.yahoo.com/api/v3/groups/a_furrys_world/albums/1841906391 (specific album) | ||
:- Observed parameters similar to photos and albums endpoints, with additional ordinal sortOrder option | :- Observed parameters similar to photos and albums endpoints, with additional ordinal sortOrder option | ||
: | : Photomatic albums ''must'' be loaded with the albumType parameter set to PHOTOMATIC. | ||
===Links=== | ===Links=== | ||
* https://groups.yahoo.com/api/v1/groups/a_furrys_world/links (list) | * https://groups.yahoo.com/api/v1/groups/a_furrys_world/links (list) | ||
:- Known params: linkdir | :- Known params: linkdir | ||
:- | :- Pagination: None. | ||
: | : linkdir takes the folder parameter from a dir. Nested folders should be joined with '/'. You need to keep track of the path to a given folder yourself (eg, linkdir + '/' + folder). | ||
===Polls=== | ===Polls=== | ||
* https://groups.yahoo.com/api/v1/groups/relationship-poll/polls (list) | * https://groups.yahoo.com/api/v1/groups/relationship-poll/polls (list) | ||
:- Known params: count, start | :- Known params: count, start | ||
:- | :- Pagination: Page size defaults to 10, with no known limit. There is no "total" field in the response. | ||
* https://groups.yahoo.com/api/v1/groups/a_furrys_world/polls/3549106 (specific poll) | * https://groups.yahoo.com/api/v1/groups/a_furrys_world/polls/3549106 (specific poll) | ||
: | : Polls return all votes cast, non-anonymised, including identifying metadata for all viewers. | ||
===Databases=== | ===Databases=== | ||
Line 136: | Line 141: | ||
* https://groups.yahoo.com/api/v1/groups/a_furrys_world/database/1/ (specific table) | * https://groups.yahoo.com/api/v1/groups/a_furrys_world/database/1/ (specific table) | ||
* https://groups.yahoo.com/api/v1/groups/a_furrys_world/database/1/records (table contents) | * https://groups.yahoo.com/api/v1/groups/a_furrys_world/database/1/records (table contents) | ||
:- | :- Pagination: None. | ||
* https://groups.yahoo.com/neo/groups/groupname/database/1/records/export | * https://groups.yahoo.com/neo/groups/groupname/database/1/records/export (export target) | ||
:- | :- Known params: format (CSV, TSV) | ||
===Members=== | ===Members=== | ||
* https://groups.yahoo.com/api/v1/groups/iswipe/members/confirmed (list of confirmed members) | * https://groups.yahoo.com/api/v1/groups/iswipe/members/confirmed (list of confirmed members) | ||
:- Known params: count, start, sortBy, sortOrder, ts, tz, chrome. | :- Known params: count, start, sortBy, sortOrder, ts, tz, chrome. | ||
:- Pagination: | :- Pagination: Page size defaults to 10, with a limit of 100. No known limit on total results. | ||
: May be blocked for normal members (as may all the other members endpoints). Includes moderators and bouncing members, with identifying metadata. | : May be blocked for normal members (as may all the other members endpoints). Includes moderators and bouncing members, with identifying metadata. | ||
* https://groups.yahoo.com/api/v1/groups/iswipe/members/moderators (list of moderators) | * https://groups.yahoo.com/api/v1/groups/iswipe/members/moderators (list of moderators) | ||
Line 154: | Line 159: | ||
===Events=== | ===Events=== | ||
Overlaps with Yahoo Calendar API, check | |||
Overlaps with Yahoo Calendar API, check yahoo-group-archiver code. | |||
== Python Yahoo! Group archivers == | == Python Yahoo! Group archivers == | ||
* [https://github.com/IgnoredAmbience/yahoo-group-archiver/network/members yahoo-group-archiver] scrapes a group using the JSON API and (for private endpoints) the two cookies Yahoo uses to verify a logged-in user. Relevant forks include [https://github.com/Frankkkkk/yahoo-group-archiver Frankkkkk] and [https://github.com/nsapa/yahoo-group-archiver nsapa]. Needs merging. Various branches have support (largely untested) for file attachments, photos, links, folders, and events. | * [https://github.com/IgnoredAmbience/yahoo-group-archiver/network/members yahoo-group-archiver] scrapes a group using the JSON API and (for private endpoints) the two cookies Yahoo uses to verify a logged-in user. <s>Relevant forks include [https://github.com/Frankkkkk/yahoo-group-archiver Frankkkkk] and [https://github.com/nsapa/yahoo-group-archiver nsapa]. Needs merging. Various branches have support (largely untested) for file attachments, photos, links, folders, and events.</s> Most stuff has been merged back into IgnoredAmbience's master. (Exceptions: full WARC support?, mtime work from Frankkkkk.) Needs consistent/WARC-appropriate handling for random 500 errors and attachment 404s. | ||
* [https://github.com/andrewferguson/YahooGroups-Archiver YahooGroups-Archiver] is similar, but scrapes only messages (not files or any other data). It is not currently under active development. | * [https://github.com/andrewferguson/YahooGroups-Archiver YahooGroups-Archiver] is similar, but scrapes only messages (not files or any other data). It is not currently under active development. |
Revision as of 16:30, 3 November 2019
Yahoo! Groups | |
URL | http://groups.yahoo.com/ |
Status | Closing |
Archiving status | In progress... |
Archiving type | Unknown |
IRC channel | #yahoosucks (on hackint) |
Yahoo! Groups is Yahoo's combined mailing list service and web forum; it grew out of Yahoo's acquisition of eGroups, merged with some other Yahoo! services. In addition to mailing list archives and a web interface for them, it offers file uploads, photo uploads, links, polls, and an events calendar.
Uploading of new content will be disabled 28 October 2019, and all content, including message history, will be deleted 14 December 2019.[1] (The mailing lists themselves will continue to function.)
It's been stable for a long time (since the late 90s), long enough for some specialised software to be developed to do backups of it. (Not many other websites can say that.)
Nominating Notable Non-Private Groups for Archival
Groups can be nominated for archival using this form. Please note that this form should not be used for groups that require administrator approval to join.
Adding Private Groups to the Public Archive
Administrators / Moderators can request that their private group (we consider a private group to be one that requires approval for new members) be included in the public archive. Before you do this, please ensure that the members of the group are happy about being part of the public archive.
To add the group to the list of private groups to be archived, send a membership invite to the email address archiveteamprivateyahoogroup@gmail.com. (Note that only group admins can do this.) We monitor that mailbox regularly and accept the membership invites we receive. Once that account is a member, the group should be scheduled to be part of the public archive.
Please make sure that when you invite the Archive Team account, you do not select the Add only to mailing list option, as this will prevent Archive Team from archiving the group.
Statistics
As of 2019-10-16 the directory lists 5619351 groups. 2752112 of them have been discovered. 1483853 (54%) have public message archives with an estimated number of 2.1 billion messages (1389 messages per group on average so far). 1.8 billion messages (86%) have been archived as of 2018-10-28.
The following graphs are slightly outdated:
Private groups of interest
Group | Notes | Admin consent? |
---|---|---|
numberactivation | see all the press coverage | Not yet contacted; FOI request made |
hpslash | see Fanlore page | Not yet contacted |
Potentially relevant: List of groups with Fanlore pages (contains both private and public groups), Archive Trans Yahoo's list (all private at last check)
Site structure
There’s a convenient JSON API. Some endpoints require logged-in group membership or other permissions (depending on group settings).
Groups
- https://groups.yahoo.com/api/v1/search/groups (search)
- - Known params: maxHits, offset, query, sortBy (values: OLDEST, RELEVANCE, MEMBERS, LATEST_ACTIVITY, NEWEST)
- https://groups.yahoo.com/api/v1/dir/categories/0/ (list of subcategories and discoverable groups under the root)
- - Known params: start
- - Pagination: Page size is 10. Does not have a count param. May be limited to 500 total results regardless of start value. start is the result index, not the group id.
- Groups in subcategories can be listed by swapping '0' for the subcategory id (the full "idList" value is not required). There is a /1/ with a small number of groups.
- https://groups.yahoo.com/api/v1/groups/concatenative/ (specific group information)
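The directory pagination described above can be sketched as a small URL generator. This is only an illustration: it assumes that start is 0-based and that the 500-result cap really applies, neither of which is confirmed here, and category_page_urls is a made-up helper name.

```python
from urllib.parse import urlencode

BASE = "https://groups.yahoo.com/api/v1/dir/categories"
PAGE_SIZE = 10    # fixed page size; the endpoint has no count param
RESULT_CAP = 500  # assumption: results may be capped at 500 in total


def category_page_urls(category_id=0, cap=RESULT_CAP):
    """Yield one listing URL per page of a directory category.

    start is treated as a result index (assumed to begin at 0),
    so it advances by the page size rather than by group id.
    """
    for start in range(0, cap, PAGE_SIZE):
        yield f"{BASE}/{category_id}/?{urlencode({'start': start})}"
```

Passing a subcategory id as category_id lists that subcategory instead of the root.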
Messages
- https://groups.yahoo.com/api/v1/groups/concatenative/history (calendar summary)
- - Known params: ts, tz, chrome
- https://groups.yahoo.com/api/v1/groups/concatenative/messages (list)
- - Known params: count, start, sortOrder (ASC, DESC), direction (1, -1)
- - Pagination: Page size defaults to 10, with no known limit. No known limit on total results. start is the message id, not the result index. sortOrder adjusts the order of results in the JSON response's array, whereas direction determines which way to iterate through ids from start.
- https://groups.yahoo.com/api/v1/groups/concatenative/messages/1/ (specific message)
- https://groups.yahoo.com/api/v1/groups/concatenative/messages/1/raw (specific message, raw content including headers)
- Some messages may have encoding issues.[2] Sometimes (as in the linked case) the non-raw endpoint has the correct characters, sometimes it does not; this is likely related to the originating email client.
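Because start is a message id rather than a result index, a crawler cannot simply add the page size to it: ids may have gaps. A minimal sketch of the idea follows. The fetch_ids callable is hypothetical (it stands in for an HTTP request plus JSON parsing), and the sketch assumes direction=1 returns ids at or above start.

```python
from urllib.parse import urlencode


def messages_url(group, start=None, count=10, sort_order="ASC", direction=1):
    """Build a message-list URL from the known params."""
    params = {"count": count, "sortOrder": sort_order, "direction": direction}
    if start is not None:
        params["start"] = start
    return f"https://groups.yahoo.com/api/v1/groups/{group}/messages?{urlencode(params)}"


def iter_message_ids(fetch_ids, first_id=1, count=10):
    """Walk all message ids upwards from first_id.

    fetch_ids(start, count) is a hypothetical wrapper that returns the
    message ids found on one page (assumed: ids >= start, at most count).
    """
    start = first_id
    while True:
        ids = fetch_ids(start, count)
        if not ids:
            return
        yield from ids
        # Continue after the highest id seen, not at start + count.
        start = max(ids) + 1
```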
Topics
- https://groups.yahoo.com/api/v1/groups/concatenative/topics (list)
- - Known params: count
- https://groups.yahoo.com/api/v1/groups/concatenative/topics/1 (specific topic)
Attachments
- https://groups.yahoo.com/api/v1/groups/a_furrys_world/attachments (list)
- - Known params: count, start, sort (TITLE, TIME), order (ASC, DESC)
- - Pagination: Page size defaults to 20, with no known limit (maximum tested: 93).
- https://groups.yahoo.com/api/v1/groups/<groupname>/attachments/<attachmentId> (specific attachment)
An attachment may be of several types: photo, file, ...?
Files
- https://groups.yahoo.com/api/v2/groups/a_furrys_world/files (list)
- - Known params: sfpath (pass in a pathURI to retrieve the file listings of this subdirectory)
- - Pagination: None.
- Entries with "type" 0 are files; 1, directories.
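Since the files listing is not paginated but is split by directory, a full listing needs a recursive walk. The sketch below assumes each entry is a dict carrying the "type" value described above and a "pathURI" key for directories. The key name pathURI is inferred from the sfpath note and list_dir is a hypothetical fetch wrapper, so treat both as assumptions.

```python
def walk_files(list_dir, path=None):
    """Yield every file entry in a group's Files area, depth-first.

    list_dir(sfpath) is a hypothetical wrapper around the files endpoint
    that returns the list of entries for one directory (None = root).
    """
    for entry in list_dir(path):
        if entry["type"] == 1:    # directory: recurse using its pathURI
            yield from walk_files(list_dir, entry["pathURI"])
        elif entry["type"] == 0:  # plain file
            yield entry
```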
Photos
- https://groups.yahoo.com/api/v3/groups/a_furrys_world/photos (list of photos)
- - Known params: count, start, orderBy (MTIME), sortOrder (ASC, DESC), ownedByMe (TRUE, FALSE), lastFetchTime, photoFilter (ALL, PHOTOS_WITH_EXIF "Originals", PHOTOS_WITHOUT_EXIF "Shared")
- - Pagination: Page size defaults to 20, with no known limit.
- "totalPhotos" field in response gives total in group.
- https://groups.yahoo.com/api/v3/groups/a_furrys_world/albums (list of albums)
- - Known params: count, start, albumType (PHOTOMATIC, NORMAL), orderBy (MTIME, TITLE), sortOrder (ASC, DESC)
- - Pagination: Page size defaults to 12, with no known limit.
- albumType defaults to NORMAL. PHOTOMATIC albumType requires the "READ" permission for "ATTACHMENTS". "total" field in response gives total number of albums of the selected type in group; however, this seems to have an off-by-one error for the NORMAL type of albums.
- https://groups.yahoo.com/api/v3/groups/a_furrys_world/albums/1841906391 (specific album)
- - Observed parameters similar to photos and albums endpoints, with additional ordinal sortOrder option
- Photomatic albums must be loaded with the albumType parameter set to PHOTOMATIC.
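For the photos list, the "totalPhotos" field gives a natural stopping point. A rough sketch, assuming start is a 0-based result index and that the page's items sit under a "photos" key (both unverified), with fetch_page as a hypothetical request wrapper:

```python
def iter_photos(fetch_page, count=20, first=0):
    """Yield photo entries page by page until totalPhotos is reached.

    fetch_page(start, count) is a hypothetical wrapper returning a dict
    with "totalPhotos" and a list under "photos" (assumed key name).
    """
    start, seen = first, 0
    while True:
        page = fetch_page(start, count)
        photos = page.get("photos", [])
        if not photos:  # also guards against a wrong or missing total
            return
        yield from photos
        seen += len(photos)
        start += len(photos)
        if seen >= page.get("totalPhotos", float("inf")):
            return
```

The same loop would not be safe for NORMAL albums if it relied only on "total", given the off-by-one noted above; the empty-page check is what keeps it from looping or stopping early.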
Links
- https://groups.yahoo.com/api/v1/groups/a_furrys_world/links (list)
- - Known params: linkdir
- - Pagination: None.
- linkdir takes the folder parameter from a dir. Nested folders should be joined with '/'. You need to keep track of the path to a given folder yourself (e.g., linkdir + '/' + folder).
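Keeping track of the linkdir path by hand could look like the following sketch. The list_links callable and its (folders, links) return shape are invented for illustration; only the '/'-joining rule comes from the notes above.

```python
def walk_links(list_links, linkdir=""):
    """Yield (linkdir, link) pairs for every link in every folder.

    list_links(linkdir) is a hypothetical wrapper around the links
    endpoint returning (folder_names, links) for that folder.
    """
    folders, links = list_links(linkdir)
    for link in links:
        yield linkdir, link
    for folder in folders:
        # The API does not return full paths, so build them ourselves.
        child = f"{linkdir}/{folder}" if linkdir else folder
        yield from walk_links(list_links, child)
```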
Polls
- https://groups.yahoo.com/api/v1/groups/relationship-poll/polls (list)
- - Known params: count, start
- - Pagination: Page size defaults to 10, with no known limit. There is no "total" field in the response.
- https://groups.yahoo.com/api/v1/groups/a_furrys_world/polls/3549106 (specific poll)
- Polls return all votes cast, non-anonymised, including identifying metadata for all viewers.
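With no "total" field, one way to page through polls is to stop at the first short page. This is a sketch under assumptions: fetch_page is a hypothetical wrapper returning a list of polls, and start is taken to be a 0-based result index.

```python
def iter_polls(fetch_page, count=10, first=0):
    """Yield polls until a page comes back with fewer than count items.

    fetch_page(start, count) is a hypothetical wrapper around the polls
    list endpoint that returns a (possibly empty) list of polls.
    """
    start = first
    while True:
        page = fetch_page(start, count)
        yield from page
        if len(page) < count:  # short or empty page: nothing left
            return
        start += len(page)
```

When the number of polls is an exact multiple of count, this costs one extra request that returns an empty page.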
Databases
- https://groups.yahoo.com/api/v1/groups/a_furrys_world/database (list of tables)
- https://groups.yahoo.com/api/v1/groups/a_furrys_world/database/1/ (specific table)
- https://groups.yahoo.com/api/v1/groups/a_furrys_world/database/1/records (table contents)
- - Pagination: None.
- https://groups.yahoo.com/neo/groups/groupname/database/1/records/export (export target)
- - Known params: format (CSV, TSV)
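Building the export URL for a table is mechanical. A small sketch follows; export_url is a made-up helper, and whether the export target works without a logged-in session is not something this page establishes.

```python
from urllib.parse import quote, urlencode

EXPORT_FORMATS = ("CSV", "TSV")  # the two known values of the format param


def export_url(group, table_id, fmt="CSV"):
    """Return the database export URL for one table of a group."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unknown export format: {fmt!r}")
    return (
        f"https://groups.yahoo.com/neo/groups/{quote(group, safe='')}"
        f"/database/{table_id}/records/export?{urlencode({'format': fmt})}"
    )
```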
Members
- https://groups.yahoo.com/api/v1/groups/iswipe/members/confirmed (list of confirmed members)
- - Known params: count, start, sortBy, sortOrder, ts, tz, chrome.
- - Pagination: Page size defaults to 10, with a limit of 100. No known limit on total results.
- May be blocked for normal members (as may all the other members endpoints). Includes moderators and bouncing members, with identifying metadata.
- https://groups.yahoo.com/api/v1/groups/iswipe/members/moderators (list of moderators)
- https://groups.yahoo.com/api/v1/groups/iswipe/members/bouncing (list of bouncing members)
- https://groups.yahoo.com/api/v1/groups/iswipe/members/suspended (list of suspended members)
- Very often (always?) blocked for normal members.
- https://groups.yahoo.com/api/v1/groups/iswipe/members/banned (list of banned members)
- Very often (always?) blocked for normal members.
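Since the members endpoints cap count at 100, a client can plan its requests up front if it already knows roughly how many members to expect. In this sketch the total argument is a number the caller supplies (where it comes from is left open), and start is assumed to be a 0-based result index.

```python
MEMBERS_MAX_COUNT = 100  # page-size limit noted for the members list


def member_page_params(total, count=MEMBERS_MAX_COUNT, first=0):
    """Return the (start, count) query params needed to cover total members.

    count is clamped to the 1..100 range client-side rather than relying
    on how the server treats larger values, which is not documented here.
    """
    count = max(1, min(count, MEMBERS_MAX_COUNT))
    return [
        {"start": start, "count": count}
        for start in range(first, first + total, count)
    ]
```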
Events
Overlaps with the Yahoo Calendar API; check the yahoo-group-archiver code.
Python Yahoo! Group archivers
- yahoo-group-archiver scrapes a group using the JSON API and (for private endpoints) the two cookies Yahoo uses to verify a logged-in user. Most work from the forks (including Frankkkkk and nsapa) has been merged back into IgnoredAmbience's master. (Exceptions: full WARC support?, mtime work from Frankkkkk.) It still needs consistent, WARC-appropriate handling for random 500 errors and attachment 404s. (Superseded note: the forks needed merging, and various branches had largely untested support for file attachments, photos, links, folders, and events.)
- YahooGroups-Archiver is similar, but scrapes only messages (not files or any other data). It is not currently under active development.
- yahoo-groups-backup scrapes a group using Selenium, storing message info and metadata (both rendered message body and raw email) into a Mongo database. It also provides a script to dump its data to static HTML pages that can be viewed in the browser.
Other archivers
- Yahoo Group Archiver: Perl, defunct.
- PGOffline: Windows, proprietary. 14-day free trial, after which download and export is disabled (but view still works). Includes attachments. Stores data in a SQLite database internally.
- Yahoo Messages Export: Chrome extension. Messages only. Saves as mbox.