Gna!

From Archiveteam
Revision as of 16:30, 8 May 2017 by Zeryl04 (talk | contribs)
URL: https://gna.org/
Status: Closing
Archiving status: In progress...
Archiving type: Unknown
IRC channel: #gnarm (on hackint)

Gna! is a centralized location where software developers can develop, distribute and maintain free (GPL-compatible) software. It is an instance of the Savane code-hosting platform[1]. It hosts popular free software projects such as Battle for Wesnoth and Freeciv.

It is shutting down due to lack of admin effort, possibly in May 2017.

Hosted data

As of 2017-02, Gna! claimed to host 1458 projects. (Many are probably abandoned and will not be saved by their project admins before shutdown.)

Here's a breakdown of the kinds of data stored and what various people can do to grab the data:

  • Third party describes what random anonymous Internet people (e.g., Archive Team) can do
    • Done shows bits that we have already rescued
    • Help shows bits that someone could usefully do
  • Members describes things that only members of the relevant project can do (if better)

Data stored:

  • Code hosting using CVS, Subversion, and Arch (done 2017-02-25, not updated since; see subpage)
    • Third parties can grab all code with full history:
      • All Subversion repos are available via (insecure) anonymous rsync: rsync://svn.gna.org/svn/ (ref: bottom of every project's svn page, e.g. [1]). They are in FSFS format, which is supposed to be portable.
        • Gna members can get the same data with integrity protection over SSH (for any svn repository), but must use svnrdump, which supposedly creates a faithful copy of the important stuff, although the result is lightly munged.
      • The same appears to hold for CVS: rsync://svn.gna.org/cvs/
        • Gna members can get the same data over SSH (for any project), but must use CVS commands. It's unclear whether there's a standard tool for reconstructing server-side repo state.
      • Arch/tla [2]: rsync://download.gna.org/arch/
        • Gna members can get the same data securely over sftp (for any project)
    • There's also a ViewVC web front-end to browse SVN/CVS code. (No point grabbing this if you've got the above)
  • Ticket tracking (not saved, help wanted; status as of 2017-05-04, per Zeryl04)
    • Up to 4 trackers per project: 'bugs', 'patch', 'task', 'support'
    • Gna members (only) can set up XML export of their own ticket text/metadata ("Export" item on tracker admin menu).
      • Only option for third parties looks like web scraping. Help: can someone look into this? (Someone pointed ArchiveBot at it but it doesn't seem to have grabbed much)
      • Exported XML is published to an unauthenticated URL of the form https://gna.org/export/project/user/number.xml, where project, user and number are placeholders. The number might be global; a recent export had number 66. In principle third parties could mine this namespace, although the search space is rather large (1458 projects * 9116 users * 66 numbers, roughly 877 million URLs) and would only catch recent or periodic exports, since they are cleared out quickly.
    • There's no supported interface for grabbing issue attachments (such as patches), even for project admins.
      • Third parties can scrape attachments by relying on their increasing integer IDs, e.g. file #29845. It looks like you don't have to get the tracker part of the URL (e.g. 'bugs') correct, so it's possible to scrape all public files by varying the ID. (done/ongoing by JTN, not uploaded anywhere yet)
    • Individual tickets can be private. (Maybe files too?) But the XML export includes private tickets (yes, to an unauthenticated URL).
  • File hosting at http://download.gna.org/ (done 2017-02-25, not updated since; see subpage)
    • Third parties can do (insecure) anonymous rsync from rsync://download.gna.org/download/
    • Gna members can get the same data (for any project) securely with rsync-over-SSH (rsync -avz user@download.gna.org:/var/ftp/ dest/), or with sftp
  • Project websites on home.gna.org (done 2017-02-25, not updated since; see subpage)
  • Mailing lists using Mailman (done 2017-05-04 by Zeryl04 using this code; got public HTML+mbox, uploaded to archive.org. ArchiveBot also has something, not sure what.)
    • Public archives are therefore available to third parties in mbox format (albeit with email addresses mangled), e.g. [4].
      • Note: the most recent mbox link on inactive lists (e.g., [5]) is broken; replace "2014-09.partial.mbox.gz" with "2014-09.mbox.gz" to fix it.
      • It may be worth grabbing the HTML archives too, as they contain some info not available in the mboxes, e.g. "X-From-R13" in HTML comments contains a reversibly obfuscated From address.
    • Some mailing lists are private. Even project admins can't see the archives at the moment (sr 3421).
  • Project metadata: groups, users, news, help topics etc. In a database and probably only available via web scraping. Help: can someone look into this?
  • Usage stats at http://stats.gna.org/
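
To make the size of the ticket-export namespace mentioned above concrete, here is a minimal Python sketch. It only counts and formats candidate URLs; it does not fetch anything. The URL template follows the form given above, but the assumption that export numbers start at 1 and the placeholder names "someproject" and "someuser" are illustrative only.

```python
from itertools import islice, product
from typing import Iterable, Iterator

# URL form as described in the list above; project, user and number are placeholders.
EXPORT_URL = "https://gna.org/export/{project}/{user}/{number}.xml"


def search_space(n_projects: int, n_users: int, max_number: int) -> int:
    """Count the (project, user, number) combinations a blind scan would cover."""
    return n_projects * n_users * max_number


def candidate_urls(projects: Iterable[str], users: Iterable[str],
                   max_number: int) -> Iterator[str]:
    """Lazily yield candidate export URLs.

    Assumption: export numbers run from 1 to max_number inclusive.
    """
    numbers = range(1, max_number + 1)
    for project, user, number in product(projects, users, numbers):
        yield EXPORT_URL.format(project=project, user=user, number=number)


if __name__ == "__main__":
    # Figures quoted above: 1458 projects, 9116 users, export numbers up to 66.
    print(search_space(1458, 9116, 66))  # 877214448
    # Hypothetical names, purely to show the URL shape.
    for url in islice(candidate_urls(["someproject"], ["someuser"], 66), 3):
        print(url)
```

At roughly 877 million candidates, a blind scan is impractical; restricting the project and user lists to known active pairs would shrink the space considerably.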
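
The broken mbox link on inactive lists, noted above, can be repaired mechanically. The sketch below generalizes the single documented case (2014-09) to any link ending in ".partial.mbox.gz"; that generalization is an assumption, and the example URL is hypothetical.

```python
def fix_partial_mbox_url(url: str) -> str:
    """Rewrite a '.partial.mbox.gz' link to the corresponding '.mbox.gz' link.

    Assumption: every broken link follows the pattern seen for 2014-09.
    URLs that do not end in '.partial.mbox.gz' are returned unchanged.
    """
    broken_suffix = ".partial.mbox.gz"
    if url.endswith(broken_suffix):
        return url[: -len(broken_suffix)] + ".mbox.gz"
    return url


if __name__ == "__main__":
    # Hypothetical archive URL, used only to show the rewrite.
    print(fix_partial_mbox_url("https://example.org/list/2014-09.partial.mbox.gz"))
```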

Gna admins have so far not been responsive to requests for help from at least some project members wishing to migrate or rescue their data, presumably because of the same lack of admin effort that is causing the site to shut down. They haven't been approached about Archive Team style bulk backup (or at least JTN has not done so).
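
As a rough illustration of what a third-party bulk grab of the anonymous rsync endpoints listed above might look like, here is a minimal Python sketch. The four rsync URLs come from this page; the -avz flags mirror the member rsync command quoted above; the destination directory names and the "gna-mirror" root are hypothetical. By default it only prints the commands rather than running them.

```python
import subprocess
from pathlib import Path
from typing import Dict, List

# Anonymous rsync endpoints named on this page.
SOURCES: Dict[str, str] = {
    "svn": "rsync://svn.gna.org/svn/",
    "cvs": "rsync://svn.gna.org/cvs/",
    "arch": "rsync://download.gna.org/arch/",
    "download": "rsync://download.gna.org/download/",
}


def rsync_command(source: str, dest: Path) -> List[str]:
    """Build an rsync argv list (archive mode, verbose, compressed)."""
    return ["rsync", "-avz", source, dest.as_posix() + "/"]


def mirror_all(root: Path, dry_run: bool = True) -> List[List[str]]:
    """Return one rsync command per source; run them unless dry_run is set."""
    commands = []
    for name, source in SOURCES.items():
        dest = root / name  # hypothetical local layout: one directory per source
        cmd = rsync_command(source, dest)
        commands.append(cmd)
        if not dry_run:
            dest.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in mirror_all(Path("gna-mirror")):
        print(" ".join(cmd))
```

Since anonymous rsync offers no integrity protection, anyone using such a script may want to record checksums of the result alongside the mirror.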

Shutdown Notice

  • A pending shutdown, together with a request for takeover, was first announced in Nov 2016[2], suggesting a time frame of six months.
  • A news item[3] about shutdown was posted to the front page 2017-01-31 linking to the above. A reply to that on 4 Feb suggests shutdown will happen "within 3 months, or when the hardware dies".
  • This suggests shutdown by around the beginning of May 2017.
  • As of early May 2017, it's still up, although its SSL certificate has been allowed to lapse.

References