Difference between revisions of "Gna!"
(→rsync grab sign-up: rm 'uploaded to' column, dealt with elsewhere) |
(updated shutdown timeline, plus more info on scraping ticket attachments) |
Revision as of 09:17, 30 March 2017
Gna! | |
URL | http://www.gna.org |
Status | Closing |
Archiving status | In progress... |
Archiving type | Unknown |
IRC channel | #gnarm (on hackint) |
Gna! is a centralized location where software developers can develop, distribute and maintain free (GPL-compatible) software. It is an instance of the Savane code-hosting platform[1].
Hosted data
As of 2017-02 the site claimed to host 1458 projects. (Many are probably abandoned and will not be saved by their project admins before shutdown.)
- Code hosting using CVS, Subversion, and Arch
  - All subversion repos available via anonymous rsync: rsync://svn.gna.org/svn/ (ref: bottom of every project's svn page e.g. [1]). (In FSFS format, which is supposed to be portable.)
  - Ditto CVS, it looks like: rsync://svn.gna.org/cvs/
  - Arch/tla [2]: rsync://download.gna.org/arch/
  - There's also a ViewVC web front-end to browse code.
- Ticket tracking
  - Up to 4 trackers per project: 'bugs', 'patch', 'task', 'support'
  - Project admins (only) can set up XML export of their own ticket text/metadata ("Export" item on tracker admin menu).
    - Only option for third parties looks like web scraping.
  - There's no supported interface for grabbing issue attachments (such as patches) even for project admins though.
    - Attached files are allocated global increasing integer IDs, e.g. file #29845 (https://gna.org/bugs/download.php?file_id=29845). It looks like you don't have to get the 'bugs' bit correct, so it's possible to scrape all public files by varying the ID.
  - Individual tickets can be private. (Maybe files too?)
- File hosting at http://download.gna.org/
  - Anonymous rsync available at rsync://download.gna.org/download/
- Project websites on home.gna.org
  - e.g. http://home.gna.org/freeciv/
  - These are managed via Subversion [3], so grabbing svn by rsync as above should also save website data + history
- Mailing lists using Mailman
  - Which means public archives are available in mbox format (albeit with email addresses mangled). e.g. [4]
  - Some mailing lists are private.
- Project metadata: groups, users, news, help topics etc. In a database and probably only available via web scraping.
- Usage stats at http://stats.gna.org/
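Since ticket attachments have no export interface, scraping by ID is the only route noted above. The following is a minimal sketch of how that enumeration might look, not a tested grab script. The URL pattern is the one from the file #29845 example; the ID range, the one-second delay, and the assumption that private or missing files come back as HTTP errors are all guesses, and the function names are made up for illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Iterator, Optional, Tuple

# URL pattern of the example attachment link (file #29845).
BASE_URL = "https://gna.org/bugs/download.php?file_id={file_id}"


def attachment_url(file_id: int) -> str:
    """Return the download URL for one attachment ID."""
    if file_id < 1:
        raise ValueError("file IDs are assumed to be positive integers")
    return BASE_URL.format(file_id=file_id)


def iter_attachment_urls(first_id: int, last_id: int) -> Iterator[str]:
    """Yield download URLs for every ID from first_id to last_id inclusive."""
    for file_id in range(first_id, last_id + 1):
        yield attachment_url(file_id)


def fetch(url: str, timeout: float = 30.0) -> Optional[bytes]:
    """Fetch one URL; return None on an HTTP error.

    Assumption: private or nonexistent files produce an HTTP error.
    The real server behaviour has not been checked.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError:
        return None


def scrape(first_id: int, last_id: int,
           delay_seconds: float = 1.0) -> Iterator[Tuple[int, Optional[bytes]]]:
    """Yield (file_id, content) pairs, pausing between requests."""
    for file_id in range(first_id, last_id + 1):
        yield file_id, fetch(attachment_url(file_id))
        time.sleep(delay_seconds)  # hypothetical politeness delay
```

The highest ID in use is not known from this page, so a real run would have to probe for the upper bound first.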
Shutdown Notice
- A notice of pending shutdown / request for takeover was first posted in Nov 2016[2], suggesting a time frame of six months.
- A news item[3] about shutdown was posted to the front page 2017-01-31 linking to the above. A reply to that on 4 Feb suggests shutdown will happen "within 3 months, or when the hardware dies".
- This suggests shutdown by around the beginning of May 2017.
rsync grab sign-up
- All done!
This gets code and file hosting, but not the rest of the hosted data such as tickets and mailing lists. Under 180 GiB in total.
Please choose --bwlimit wisely (perhaps 5M?).
What | Size | No. of files | Who/when |
---|---|---|---|
rsync://svn.gna.org/svn/ | ~41 GiB | ~1 million | PurpleSym 2017-02-25 (via svnrdump; 18G lzip'd); mkram 2017-02-26 (via rsync) |
rsync://svn.gna.org/cvs/ | ~7.5 GiB | ~200k | mkram 2017-02-25 |
rsync://download.gna.org/arch/ | ~318 MiB | ~71k | mkram 2017-02-25 (except admindir) |
rsync://download.gna.org/download/ | ~116 GiB | ~130k | mkram 2017-02-25 |
rsync://download.gna.org/www/ | ~6.4 GiB | ~177k | mkram 2017-02-25 (except "some authentication folder and .bashhistory") |
For mkram's rsync grab, breakdown by project and upload schedule at Gna!/projects.
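For anyone repeating the grab, here is a rough sketch of how the rsync invocations for the modules in the table could be assembled, with a sanity check on the total size. It only builds the command lines and does not run them. The limit of 5000 (in rsync's default unit of KiB/s) is one reading of the "5M?" suggestion, and the destination layout and function names are hypothetical.

```python
import shlex
from typing import Dict, List

# rsync modules and approximate sizes in GiB, as listed in the table above.
MODULES: Dict[str, float] = {
    "rsync://svn.gna.org/svn/": 41.0,
    "rsync://svn.gna.org/cvs/": 7.5,
    "rsync://download.gna.org/arch/": 318 / 1024,  # ~318 MiB
    "rsync://download.gna.org/download/": 116.0,
    "rsync://download.gna.org/www/": 6.4,
}


def local_dir_for(module_url: str, dest_root: str) -> str:
    """Map rsync://host/module/ to dest_root/host/module/ (hypothetical layout)."""
    path = module_url[len("rsync://"):]
    return f"{dest_root.rstrip('/')}/{path}"


def rsync_command(module_url: str, dest_root: str,
                  bwlimit_kib: int = 5000) -> List[str]:
    """Build one rsync command line as an argument list.

    -a keeps permissions and timestamps, --partial keeps interrupted
    transfers, --bwlimit caps the transfer rate (assumed value).
    """
    return [
        "rsync", "-a", "--partial", f"--bwlimit={bwlimit_kib}",
        module_url, local_dir_for(module_url, dest_root),
    ]


def total_size_gib() -> float:
    """Sum the approximate module sizes from the table."""
    return sum(MODULES.values())


def printable(command: List[str]) -> str:
    """Render an argument list as a shell-quoted string for display."""
    return " ".join(shlex.quote(arg) for arg in command)
```

The parent directories under the destination root would need to exist before running the commands, for example via `os.makedirs(..., exist_ok=True)`. The summed sizes come to roughly 171 GiB, consistent with the under-180 figure quoted above.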