Formats

From Archiveteam
Revision as of 18:18, 18 February 2009 by Tom Morris

A good rule of thumb for data formats is to pick those that are no more complex than the data being represented, that are recoverable with simple tools, and that are widely implemented. In general, if you have written a text document and it cannot be viewed and edited in a basic text editor like Notepad (or Emacs, Vim, TextMate, BBEdit, gedit, Kate, pico/nano etc.), you should probably take the time to convert it into a plain-text format, and keep the rich-format original as well. If you are backing up data in a format that's not widely understood, be sure to also keep backups of the software you use to open it and any registration keys, as you may find that a file made with version 2.x of a piece of software won't open in the all-singing, all-dancing version 5.x!
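As a rough illustration of the "opens in a basic text editor" test, the sketch below checks whether a file's bytes decode as UTF-8 and contain no ASCII control characters other than common whitespace. The function names and the choice of UTF-8 are assumptions made for this example, not part of any standard; files in other encodings (Latin-1, UTF-16) would need a different check.

```python
from pathlib import Path

# Whitespace control characters that routinely appear in plain text.
ALLOWED_CONTROLS = {"\t", "\n", "\r", "\f"}


def looks_like_plain_text(data: bytes) -> bool:
    """Heuristic: True if data decodes as UTF-8 and has no ASCII control
    characters apart from tab, newline, carriage return and form feed."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so probably a binary or differently encoded file.
        return False
    return all(
        ch in ALLOWED_CONTROLS or (ord(ch) >= 32 and ord(ch) != 127)
        for ch in text
    )


def file_looks_like_plain_text(path: str) -> bool:
    """Apply the heuristic to a whole file (hypothetical helper).

    Reads the entire file into memory, which is fine for documents but
    not for very large files.
    """
    return looks_like_plain_text(Path(path).read_bytes())
```

A file that fails this check is a candidate for conversion to a plain-text format, with the original kept alongside it. The check is only a heuristic: passing it does not mean the content is meaningful to a human reader.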

Text

Plain text, HTML and non-bloated XML formats (DocBook, TEI etc.) are all good bets. PDF seems to have reached a point where it's open enough (it was published as the ISO 32000-1 standard in 2008) that it should be readable long into the future. For mathematical documents, LaTeX source files are text-based and have open implementations, and the underlying TeX system has been around since 1978.