https://wiki.archiveteam.org/api.php?action=feedcontributions&user=Tom+Morris&feedformat=atomArchiveteam - User contributions [en]2024-03-28T13:47:23ZUser contributionsMediaWiki 1.37.1https://wiki.archiveteam.org/index.php?title=WikiTeam&diff=3915WikiTeam2011-04-18T12:49:20Z<p>Tom Morris: updated citizendium</p>
<hr />
<div><center>'''We save wikis, from Wikipedia to the tiniest wikis'''</center><br />
{{TOCright}}<br />
== Tools and source code ==<br />
* [http://code.google.com/p/wikiteam/ WikiTeam Google Code repo] (a scraper for MediaWiki wikis; it generates an XML dump and can also download images if desired)<br />
* There are many wiki engines; the most famous is MediaWiki. The tools must therefore be ready to read data from almost every wiki engine in order to save them all.<br />
* [http://dl.dropbox.com/u/63233/Wikitravel/Source%20Code%20and%20tools/Source%20Code%20and%20tools.7z Scripts of a guy who saved Wikitravel]<br />
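A minimal sketch of how an XML dump can be requested from a MediaWiki wiki, assuming the wiki exposes the standard Special:Export page and accepts its documented form parameters (<code>pages</code>, <code>history</code>, <code>curonly</code>). The function name and the example URL are hypothetical; this is not the WikiTeam script itself.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_export_request(index_url, titles, full_history=True):
    """Build a POST request for a wiki's Special:Export page.

    index_url is the wiki's index.php URL (an assumption: some wikis
    use a different script path or disable full-history export).
    """
    params = {
        "title": "Special:Export",
        "pages": "\n".join(titles),   # one page title per line
        "action": "submit",
    }
    if full_history:
        params["history"] = "1"       # ask for every revision
    else:
        params["curonly"] = "1"       # ask for the latest revision only
    # Supplying a body makes urllib send the request as a POST.
    return Request(index_url, data=urlencode(params).encode("utf-8"))


# Hypothetical usage (performs a network request when uncommented):
# from urllib.request import urlopen
# request = build_export_request("http://wiki.example.org/index.php", ["Main Page"])
# xml_bytes = urlopen(request).read()
```

Many wikis cap the number of revisions returned per request, so a real scraper would probably need to page through long histories rather than rely on a single call.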
<br />
{{-}}<br />
== Wiki dumps ==<br />
<center>'''[http://code.google.com/p/wikiteam/downloads/list?can=1 Visit the download section] in Google Code'''</center><br />
{{-}}<br />
{| class="wikitable" border=1 width=99% style="text-align: center;"<br />
! Wiki !! Wiki is online? !! Dumps available? (official or home-made) !! Comments/Details !! Saved by us? Who? Where?<br />
|-<br />
| [http://s23.org/wikistats/anarchopedias_html.php Anarchopedias]<br />
|-<br />
| [http://archiveteam.org Archive Team Wiki] || Yes || Official: no. Home-made: [http://code.google.com/p/wikiteam/downloads/list?can=1&q=archiveteam yes] || - || WikiTeam <br />
|-<br />
| Bulbapedia || Yes || Official: no. Home-made: no || - || dr-spangle is working on it with a self-built PHP downloader<br />
|-<br />
| [[Citizendium]] || Yes || Official: [http://en.citizendium.org/wiki/CZ:Downloads daily] (no full history). Home-made: [[Citizendium|yes]], April 2011 || No image dumps available || <br />
|-<br />
| [http://s23.org/wikistats/editthis_html.php EditThis] || Yes || Official: no. Home-made: in progress<br />
|-<br />
| enciclopedia.us.es || Yes || Official: no. Home-made: no || Sysop sent me page text sql tables || emijrp<br />
|-<br />
| [http://s23.org/wikistats/gentoo_html.php Gentoo wikis] || Yes || Official: no. Home-made: [http://code.google.com/p/wikiteam/downloads/list?can=1&q=gentoo yes] || || WikiTeam<br />
|-<br />
| GNUpedia || No || Official: no. Home-made: no || No database. This "wiki encyclopedia" was only HTML pages. Only ~3 articles were sent to the mailing list. After that, the project was closed || -<br />
|-<br />
| [http://s23.org/wikistats/metapedias_html.php Metapedia] || Yes || Official: ?. Home-made: no || - || -<br />
|-<br />
| [http://s23.org/wikistats/scoutwiki_html.php Neoseeker] || Yes || Official: ?. Home-made: no || - || -<br />
|-<br />
| [[Nupedia]] || No || Official: ?. Home-made: Yes, saved from IA || - || - <br />
|-<br />
| OmegaWiki || Yes || Official: [http://www.omegawiki.org/Development daily] || - || - <br />
|-<br />
| OpenStreetMap || Yes || Official: Yes. Home-made: no<br />
|-<br />
| [http://s23.org/wikistats/opensuse_html.php OpenSUSE wikis] || Yes || Official: no. Home-made: [http://code.google.com/p/wikiteam/downloads/list?can=1&q=opensuse in progress] || - || - <br />
|-<br />
| OSDev || Yes || Official: [http://wiki.osdev.org/OSDev_Wiki:About weekly] || - || Not yet<br />
|-<br />
| [http://s23.org/wikistats/scoutwiki_html.php Scout wikis]<br />
|-<br />
| [http://s23.org/wikistats/uncyclomedia_html.php Uncyclomedias]<br />
|-<br />
| Wikanda || Yes || Official: no. Home-made: [http://code.google.com/p/wikiteam/downloads/list?can=1&q=wikanda yes] || - || emijrp<br />
|-<br />
| [[Wikia]] || Yes || Official: [http://wiki-stats.wikia.com/ on demand] || No image dumps available || Not yet <br />
|-<br />
| [http://wikifur.com WikiFur] || Yes || Official: [http://dumps.wikifur.com/ yes] || No image dumps available || Not yet <br />
|-<br />
| WikiHow <br />
|-<br />
| [[Wikimedia Commons]] || Yes || Official: [http://dumps.wikimedia.org/commonswiki/latest/ periodically] || No image dumps available || Not yet <br />
|- <br />
| [[Wikipedia]] || Yes || Official: [http://dumps.wikimedia.org/backup-index.html periodically] || No image dumps available. The English Wikipedia dump is usually very old || Not yet <br />
|-<br />
| [http://s23.org/wikistats/wikisite_html.php Wiki-site.com]<br />
|-<br />
| WikiTravel || Yes || Official: [http://wikitravel.org/en/Wikitravel:Database_dump not yet]. Home-made: [http://code.google.com/p/wikiteam/downloads/list?can=1&q=wikitravel yes], another of [http://dl.dropbox.com/u/63233/Wikitravel/Complete%20zip/WikitravelComplete14-June-2010.7z 2010-06-14] || - || WikiTeam <br />
|}<br />
<br />
=== Tips ===<br />
Some tips:<br />
* When downloading Wikipedia/Wikimedia Commons dumps, pages-meta-history.xml.7z and pages-meta-history.xml.bz2 contain the same data, but the 7z file is usually smaller (better compression ratio), so use 7z.<br />
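Whichever archive you download, a full-history dump is far too large to load into memory, so it is best read as a stream. The sketch below is one possible way to do that. It assumes the .bz2 variant, because Python's standard library can read bz2 but not 7z (a 7z file would first have to be decompressed with an external tool). The function name is hypothetical.

```python
import bz2
import xml.etree.ElementTree as ET


def count_pages_and_revisions(dump_path):
    """Stream a .xml.bz2 dump and count its <page> and <revision> elements."""
    pages = revisions = 0
    with bz2.open(dump_path, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            tag = elem.tag.rsplit("}", 1)[-1]   # drop the XML namespace, if any
            if tag == "revision":
                revisions += 1
            elif tag == "page":
                pages += 1
                elem.clear()                    # discard the finished page's children
    return pages, revisions
```

Comparing these counts between two dumps of the same wiki is a quick sanity check that a home-made dump is complete.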
<br />
== Closing/In danger ==<br />
* Gentoo wikis: Error 503 Service Unavailable as of 2011-04-06 http://s23.org/wikistats/gentoo_html.php<br />
** Again up. [http://code.google.com/p/wikiteam/downloads/list?can=1&q=gentoo Saved]! [[User:Emijrp|Emijrp]] 21:30, 10 April 2011 (UTC)<br />
<br />
== Offline wikis and wikifarms ==<br />
elwiki.com<br />
<br />
* 2011<br />
** wik.is<br />
* 2010<br />
** <br />
* 2009<br />
** <br />
* 2008<br />
** Scribblewiki (wikifarm)<br />
<br />
== External links ==<br />
* http://wikiindex.org - A lot of wikis to save<br />
* http://wiki1001.com/<br />
* http://meta.wikimedia.org/wiki/List_of_largest_wikis<br />
* http://s23.org/wikistats/<br />
* http://en.wikipedia.org/wiki/Comparison_of_wiki_farms<br />
* http://en.wikipedia.org/wiki/User:Emijrp/Wikipedia_Archive<br />
* http://blog.shoutwiki.com/<br />
* http://wikiheaven.blogspot.com/<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Archive Team]]</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Wikipedia&diff=3914Wikipedia2011-04-18T12:44:48Z<p>Tom Morris: wikimedia commons update</p>
<hr />
<div>'''Wikipedia''' is the largest [[wiki]] on the planet, with several million articles available in English and several million more in dozens of available languages.<br />
<br />
[[File:Wikipedia nostalgia.png|thumb|right|[http://nostalgia.wikipedia.org Wikipedia nostalgia], a frozen version of Wikipedia from 2001]] <br />
For once, a site that recognizes the importance of third-party backups! They have a [http://download.wikipedia.org/ main downloads page] from which you can get XML dumps from [http://download.wikipedia.org/backup-index.html individual wikis] (Wikimedia Foundation hosts more than 700 wikis: Wikipedias, Wiktionaries, Wikinews, Wikisources, Wikibooks, Wikiquotes, Wikiversities, Wikispecies, Wikimedia Commons).<br />
<br />
You can download all the articles of the [[English Wikipedia]] (with complete edit history) in a single file (compressed in [[7zip]] format) from [http://download.wikimedia.org/enwiki/20100130/ here] (ATTENTION: 31 GB! Unpacked, it expands to as much as 5.2 TB. Direct link: [http://download.wikimedia.org/enwiki/20100130/enwiki-20100130-pages-meta-history.xml.7z pages-meta-history.xml.7z]).<br />
<br />
There's an old article dump (2008/03/12) [http://thepiratebay.org/torrent/4794236/enwiki-20080312-pages-articles.xml.bz2 up on The Pirate Bay], from the [http://thepiratebay.org/user/archiveteam/ ArchiveTeam TPB account]. Also, a [http://dumps.wikimedia.org/archive/enwiki/20060816/ dump from 2006] is available.<br />
<br />
Some [http://www.archive.org/search.php?query=wikipedia%20dump Wikipedia dumps] in the Internet Archive.<br />
<br />
There is currently no public backup of the images uploaded to [[Wikimedia Commons]], which hosts about 10 million images and other media files.<br />
<br />
<center>'''No more [[Library of Alexandria|Libraries of Alexandria]] destroyed.'''</center><br />
<br />
[[File:Size of English Wikipedia in August 2010 (L).png|thumb|center|700px|English Wikipedia in August 2010, if printed.]]<br />
<br />
== Vital signs ==<br />
<br />
Stable, but they seriously use a lot of tactics to get donations.<br />
<br />
== See also ==<br />
* [[Wikia]]<br />
* [[Wikis]]<br />
* [[Nupedia]]<br />
* [[GNUPedia]]<br />
* [[Citizendium]]<br />
<br />
== External links ==<br />
* http://download.wikipedia.org/<br />
* http://download.wikimedia.org/archive/ some incomplete old dumps, English Wikipedia mainly<br />
* [http://lists.wikimedia.org/pipermail/foundation-l/2010-December/063088.html old wikipedia backups discovered] http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z<br />
* http://en.wikipedia.org/wiki/User:Emijrp/Wikipedia_Archive<br />
<br />
[[Category:Wikis]]</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=User:Tom_Morris&diff=3911User:Tom Morris2011-04-18T12:26:40Z<p>Tom Morris: Created page with 'UK geek who hates seeing good content disappear. * [http://tommorris.org homepage] * [http://enwp.org/User:Tom_Morris wikipedia]'</p>
<hr />
<div>UK geek who hates seeing good content disappear.<br />
<br />
* [http://tommorris.org homepage]<br />
* [http://enwp.org/User:Tom_Morris wikipedia]</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Citizendium&diff=3909Citizendium2011-04-18T12:25:51Z<p>Tom Morris: september 2011</p>
<hr />
<div>{{Infobox project<br />
| title = Citizendium<br />
| image = Welcome to Citizendium - Citizendium 1292887672746.png<br />
| description = Citizendium mainpage in 2010-12-21<br />
| URL = http://citizendium.org<br />
| project_status = {{online}}<br />
| archiving_status = {{saved}} as of 2011-04<br />
}}<br />
<br />
The '''Citizendium''' is a [[wiki]] that constantly pursues "the highest standards of writing, reliability, and comprehensiveness". The wiki is basically Wikipedia with higher standards.<br />
<br />
The site only has funding to cover until September 2011.<br />
<br />
== April 2011 dump ==<br />
Using the Wikiteam script, I made a dump of the full histories up to April 2011. They are available through [http://www.megaupload.com/?d=BO69BM9E Megaupload] and [http://www.archive.org/details/Citizendium2011-04-14Wikidump Internet Archive]. —[[User:Tom Morris|Tom Morris]] 12:07, 18 April 2011 (UTC)<br />
<br />
== External links ==<br />
* http://citizendium.org<br />
* Dumps available (but only the last version of the page, not the whole history): http://en.citizendium.org/wiki/CZ:Downloads<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Wikis]]</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Citizendium&diff=3901Citizendium2011-04-18T12:07:14Z<p>Tom Morris: added april 2011 dump</p>
<hr />
<div>{{Infobox project<br />
| title = Citizendium<br />
| image = Welcome to Citizendium - Citizendium 1292887672746.png<br />
| description = Citizendium mainpage in 2010-12-21<br />
| URL = http://citizendium.org<br />
| project_status = {{online}}<br />
| archiving_status = {{saved}} as of 2011-04<br />
}}<br />
<br />
The '''Citizendium''' is a [[wiki]] that constantly pursues "the highest standards of writing, reliability, and comprehensiveness". The wiki is basically Wikipedia with higher standards.<br />
<br />
== April 2011 dump ==<br />
Using the Wikiteam script, I made a dump of the full histories up to April 2011. They are available through [http://www.megaupload.com/?d=BO69BM9E Megaupload] and [http://www.archive.org/details/Citizendium2011-04-14Wikidump Internet Archive]. —[[User:Tom Morris|Tom Morris]] 12:07, 18 April 2011 (UTC)<br />
<br />
== External links ==<br />
* http://citizendium.org<br />
* Dumps available (but only the last version of the page, not the whole history): http://en.citizendium.org/wiki/CZ:Downloads<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Wikis]]</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Lycos_Europe&diff=404Lycos Europe2009-02-23T19:17:10Z<p>Tom Morris: </p>
<hr />
<div>Lycos Europe have "decided to discontinue all unprofitable activities". This means the closure of a substantial chunk of their user-content sphere, including:<br />
<br />
[http://www.lycos.co.uk/info.html Death-warrant is here]<br />
<br />
* Lycos Europe '''member pages''' (Tripod) - http://members.lycos.co.uk/* - Attempts to register now return the following dismal message: [http://www.lycos.co.uk/stopregistration.html] - Deadline 28th Feb<br />
* [http://iq.lycos.co.uk Lycos iQ] (Question/Answer community) - Deadline 25th Feb<br />
* '''[http://www.lycos.co.uk/emailsms/ Email and SMS services]''' - see [http://mail.lycos.co.uk/] - Dead as of 15th Feb<br />
* '''[http://fotomania.lycos.de Fotomania]''' - Photo community (in some locales) - Dead as of 15th Feb<br />
* [http://love.lycos.co.uk Love @ Lycos] (Dating) - Dead as of 15th Feb<br />
<br />
* Jubii.com holdings (apart from the original Danish portal, which seems to have been bought. [http://translate.google.com/translate?prev=hp&hl=en&u=http://www.business.dk/article/20090107/medier/90107112/&sl=auto&tl=en]) specifically:<br />
::*'''[http://www.jubii.com/ Jubii email]'''<br />
::*'''[http://tv.jubii.co.uk/ Jubii TV]''' - videos<br />
::*'''[http://vdownload.jubii.com/ VDownload]''' servers - file storage<br />
<br />
<br />
".co.uk" used above for convenience, but the same goes for hosted content across seven or eight countries: UK (.co.uk), France (.fr), Germany (.de), Netherlands (.nl), Spain (.es), Italy (.it), Denmark (.dk), Switzerland (.ch) (?) or Austria (.at).<br />
<br />
<br />
This is AOL Hometown all over again, but now we have a few weeks to work with. <br />
<br />
'''Tripod''' pages should be first priority: small files, high uniqueness. Then '''Jubii hosted files''', if we can find them. Then '''Fotomania''' and '''Jubii TV'''.<br />
<br />
Not much we can do for the emails. [[Introduction |But a bit]].<br />
<br />
== Backup Tools ==<br />
<br />
* Email: Lycos are providing a software tool that communicates with their mail servers and downloads your stuff. Requires Outlook. [http://f012.mail.lycos.co.uk/app/settings/webdav/list.jsp]<br />
<br />
* Jubii TV - Cunning use of [[wget]] will probably work.<br />
<br />
* Hosted sites - wget, if you can figure out where to point it.<br />
<br />
== Vital Signs ==<br />
<br />
On death row. Deadline: February 28th 2009.<br />
<br />
== Who's Working On It? ==<br />
<br />
Teaspoon - is harvesting the sites listed in the UK directory. <br />
<br />
Please help: wget -i <pick a file><br />
<br />
* [http://ermsays.net/lycos-de.txt Germany] is huge<br />
* [http://ermsays.net/lycos-nl.txt Netherlands] is huge<br />
* [http://ermsays.net/lycos-es.txt Spain]<br />
* [http://ermsays.net/lycos-it.txt Italy]<br />
* [http://ermsays.net/lycos-fr.txt France] is really huge<br />
* [http://ermsays.net/lycos-uk.txt UK]<br />
* [http://ermsays.net/lycos-dk.txt Denmark] is tiny, and archived (but the more copies the safer)<br />
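For anyone without wget, the same job can be sketched in Python. This assumes the list files above contain one URL per line, and it fetches only the listed pages (no recursion into links or images). The function names and the <code>mirror</code> directory are hypothetical.

```python
import os
from urllib.parse import urlsplit
from urllib.request import urlopen


def read_url_list(text):
    """Return the non-empty lines of a URL list, stripped of whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def local_path(url, root="mirror"):
    """Map a URL to a file path under root, keeping host and path."""
    parts = urlsplit(url)
    path = parts.path
    if path == "" or path.endswith("/"):
        path += "index.html"          # directory-style URLs get an index file
    return os.path.join(root, parts.netloc, *path.strip("/").split("/"))


def mirror(urls, root="mirror"):
    """Fetch each URL once and save it under root, skipping failures."""
    for url in urls:
        target = local_path(url, root)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        try:
            with urlopen(url, timeout=30) as response, open(target, "wb") as out:
                out.write(response.read())
        except OSError as error:      # URLError and timeouts are OSError subclasses
            print("skipped", url, error)


# Hypothetical usage, after saving one of the lists as lycos-dk.txt:
# with open("lycos-dk.txt") as handle:
#     mirror(read_url_list(handle.read()))
```

Keeping the host name in the saved path means several country lists can be mirrored into the same directory without collisions.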
<br />
=== IQ ===<br />
[[User:Tom Morris|Tom Morris]] has completed Lycos iQ UK. Once the site has gone down for good, I'll prepare the archives for public release.</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Lycos_Europe&diff=400Lycos Europe2009-02-23T01:47:36Z<p>Tom Morris: </p>
<hr />
<div>Lycos Europe have "decided to discontinue all unprofitable activities". This means the closure of a substantial chunk of their user-content sphere, including:<br />
<br />
[http://www.lycos.co.uk/info.html Death-warrant is here]<br />
<br />
* Lycos Europe '''member pages''' (Tripod) - http://members.lycos.co.uk/* - Attempts to register now return the following dismal message: [http://www.lycos.co.uk/stopregistration.html] - Deadline 28th Feb<br />
* [http://iq.lycos.co.uk Lycos iQ] (Question/Answer community) - Deadline 25th Feb<br />
* '''[http://www.lycos.co.uk/emailsms/ Email and SMS services]''' - see [http://mail.lycos.co.uk/] - Dead as of 15th Feb<br />
* '''[http://fotomania.lycos.de Fotomania]''' - Photo community (in some locales) - Dead as of 15th Feb<br />
* [http://love.lycos.co.uk Love @ Lycos] (Dating) - Dead as of 15th Feb<br />
<br />
* Jubii.com holdings (apart from the original Danish portal, which seems to have been bought. [http://translate.google.com/translate?prev=hp&hl=en&u=http://www.business.dk/article/20090107/medier/90107112/&sl=auto&tl=en]) specifically:<br />
::*'''[http://www.jubii.com/ Jubii email]'''<br />
::*'''[http://tv.jubii.co.uk/ Jubii TV]''' - videos<br />
::*'''[http://vdownload.jubii.com/ VDownload]''' servers - file storage<br />
<br />
<br />
".co.uk" used above for convenience, but the same goes for hosted content across seven or eight countries: UK (.co.uk), France (.fr), Germany (.de), Netherlands (.nl), Spain (.es), Italy (.it), Denmark (.dk), Switzerland (.ch) (?) or Austria (.at).<br />
<br />
<br />
This is AOL Hometown all over again, but now we have a few weeks to work with. <br />
<br />
'''Tripod''' pages should be first priority: small files, high uniqueness. Then '''Jubii hosted files''', if we can find them. Then '''Fotomania''' and '''Jubii TV'''.<br />
<br />
Not much we can do for the emails. [[Introduction |But a bit]].<br />
<br />
== Backup Tools ==<br />
<br />
* Email: Lycos are providing a software tool that communicates with their mail servers and downloads your stuff. Requires Outlook. [http://f012.mail.lycos.co.uk/app/settings/webdav/list.jsp]<br />
<br />
* Jubii TV - Cunning use of [[wget]] will probably work.<br />
<br />
* Hosted sites - wget, if you can figure out where to point it.<br />
<br />
== Vital Signs ==<br />
<br />
On death row. Deadline: February 28th 2009.<br />
<br />
== Who's Working On It? ==<br />
<br />
Teaspoon - is harvesting the sites listed in the UK directory. <br />
<br />
Please help: wget -i <pick a file><br />
<br />
* [http://ermsays.net/lycos-de.txt Germany] is huge<br />
* [http://ermsays.net/lycos-nl.txt Netherlands] is huge<br />
* [http://ermsays.net/lycos-es.txt Spain]<br />
* [http://ermsays.net/lycos-it.txt Italy]<br />
* [http://ermsays.net/lycos-fr.txt France] is really huge<br />
* [http://ermsays.net/lycos-uk.txt UK]<br />
* [http://ermsays.net/lycos-dk.txt Denmark] is tiny, and archived (but the more copies the safer)<br />
<br />
=== IQ ===<br />
[[User:Tom Morris|Tom Morris]] is in the process of archiving Lycos iQ UK. If you want to help, shout on IRC.</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Yahoo!&diff=399Yahoo!2009-02-22T19:16:01Z<p>Tom Morris: linkified</p>
<hr />
<div>===As of January, 2009, Archive Team no longer considers Yahoo a dependable location for data.===<br />
<br />
This is not based on their engineering, which has shown itself to be consistent and with few outages. Rather, it appears the company is in relative free-fall with regards to which projects they will maintain and what comes under any given knife for cost-cutting measures.<br />
<br />
When a company enters this sort of spiral with regard to one of their core businesses (hosting and providing of information services), and consistently gives little or no indication of their next move, it becomes incumbent upon the users of that service to either demand changes in policy, or find alternatives, even poor ones, and build those up.<br />
<br />
When a company decides (or, more accurately, someone with the company decides) that a website or sub-site is no longer viable, then it's living on borrowed time. Like a store closing, or a very sick pet, it becomes a matter of how to bring things to a close. This is entirely up to the closing party, and from their behavior, we can see how they will consider doing this.<br />
<br />
Previously, Yahoo showed some level of restraint in how they would shut down services. For example, when [[Yahoo! Photos]], a photo sharing site, was closed in favor of the bright and shiny new property [[Flickr]], it was announced, a special site was provided to assist users in transferring their photos to other sites, and there was an opportunity to purchase an archive CD of your content.<ref>http://help.yahoo.com/l/us/yahoo/photos/photos3/closing/closing-02.html</ref> It should be noted, however, that [[Yahoo! Photos]] was closed under much protest and duress of the userbase, who in some cases had no interest in transferring to [[Flickr]] and wished merely to maintain their own interface.<br />
<br />
But now, Yahoo seems to have no issues with very quick shutdown, with little warning, and almost no regard for the quality of the site.<br />
<br />
Some examples of this new behavior:<br />
<br />
* Yahoo closed [http://www.crunchbase.com/product/yahoo-brickhouse Brickhouse], their in-house development and prototype department (think of it as an incubator) in December of 2008. They were swift enough to close down the building within weeks. <ref>http://george08.blogspot.com/2008/12/not-quite-what-i-had-in-mind.html</ref><br />
* In December of 2008, Yahoo began layoffs at [[Flickr]], a site previously untouchable, including George Oates<ref>http://george08.blogspot.com/2008/12/not-quite-what-i-had-in-mind.html</ref>, who designed the interface of [[Flickr]], and championed the site's interaction with the "Commons", including the US Library of Congress, and making Creative Commons licenses the default for [[Flickr]]'s photo uploads. Oates was laid off mid-trip on a fact-finding and information trip for Yahoo, having met and advocated [[Flickr]] to a number of prominent folks. <ref>http://www.guardian.co.uk/technology/blog/2008/dec/11/yahoo-flickr-layoffs</ref><br />
* On or about January 27, 2009, with ''absolutely no notice'', [[Yahoo Pets]] was shut down, all content removed from the web, and completely redirected under another Yahoo property, [[Shine]]. <ref>http://blog.dogster.com/2009/01/28/yahoo-quietly-shutters-yahoo-pets-grin/</ref><br />
<br />
===Please do not use Yahoo or Yahoo-owned sites for any non-retrievable personal data.===<br />
<br />
Non-retrievable data means that there is no export function, or way to pull your personal data off the site. You should continue to use it if you can be assured that the Yahoo function you are using will not dramatically affect your life if it disappears tomorrow. Because it might.<br />
<br />
===Yahoo Services===<br />
* [[Flickr]]<br />
* [[Delicious]]<br />
* [[Upcoming]]<br />
* [[Yahoo! Groups]] <br />
<br />
== References ==<br />
<references/></div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Formats&diff=389Formats2009-02-18T18:18:42Z<p>Tom Morris: New page: A very good rule of thumb with data formats is to pick those that are ''no more complex than the data being represented'', that are ''recoverable with simple tools'' and ''widely implement...</p>
<hr />
<div>A very good rule of thumb with data formats is to pick those that are ''no more complex than the data being represented'', that are ''recoverable with simple tools'' and ''widely implemented''. In general, if you have written a text document and it's not viewable and editable in a low-level text editor like Notepad (or Emacs, Vim, TextMate, BBEdit, gedit, kate, pico/nano etc.), you should probably take the time to convert it into a plain-text format, while also keeping the rich format. If you are backing up data in a format that's not widely understood, be sure to also keep backups of the software you use to open it and any registration keys, as you may find that a file made with version 2.x of a piece of software won't open in the all-new, all-singing, all-dancing version 5.x!<br />
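As a rough way to apply this rule of thumb to an existing pile of files, the sketch below flags files that a plain text editor would probably not display cleanly. It is only a heuristic under stated assumptions: it treats a file as plain text if its first bytes contain no NUL byte and decode as UTF-8, so documents in older encodings such as Latin-1 may be reported as non-text. The function name is hypothetical.

```python
import codecs


def looks_like_plain_text(path, sample_size=65536):
    """Heuristic: the first bytes contain no NUL and are valid UTF-8."""
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    if b"\x00" in sample:
        return False
    # An incremental decoder tolerates a multi-byte character that was
    # cut in half at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

Files that fail the check are candidates for conversion to a plain-text format, or for archiving together with the software that opens them.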
<br />
== Text ==<br />
Plain text, HTML and non-bloated XML formats are all good bets (DocBook, TEI etc.). PDF seems to have reached a point where it's open enough that it should be readable long into the future. For mathematical documents, LaTeX is a good choice: the files are text-based, there are open implementations, and the underlying [http://en.wikipedia.org/wiki/TeX TeX] format has been around since 1978.</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Main_Page&diff=388Main Page2009-02-18T17:36:46Z<p>Tom Morris: </p>
<hr />
<div>[[Image:Archiveteam.jpg|center|300px]]<br />
<br />
<br />
=== HISTORY IS OUR FUTURE ===<br />
''And we've been trashing our history''<br />
<br />
This website is intended to be an offloading point and information depot for a number of archiving projects, all related to saving websites or data that is in danger of being lost. Besides serving as a hub for team-based pulling down and mirroring of data, this site will provide advice on managing your own data and rescuing it from the brink of destruction.<br />
<br />
Feel free to join us on IRC! We're on the EFnet network in a channel called '''#archiveteam''', where we say truly awful things.<br />
<br />
===What's here===<br />
<br />
''Archive Team''<br />
<br />
* [[Who We Are]] and how you can join our cause!<br />
<br />
* [[Deathwatch]] is where we keep track of sites that are sickly, dying or dead.<br />
<br />
* [[Fire Drill]] is where we keep track of sites that seem fine but a lot depends on them.<br />
<br />
* [[Projects]] is to keep track of AT endeavors.<br />
<br />
* [[Philosophy]] describes the ideas underpinning our work.<br />
<br />
''DIY Data Rescue''<br />
<br />
* [[Introduction|The Introduction]] is an overview of basic archiving methods.<br />
<br />
* [[Why Back Up?]] Because they don't care about you.<br />
<br />
* [[Software]] will assist you in regaining control of your data by providing tools for information backup, archiving and distribution. <br />
<br />
* [[Formats]] will familiarise you with the various data formats, and how to ensure your files will be readable in the future.<br />
<br />
* [[Storage Media]] is about where to get it, what to get, and how to use it.<br />
<br />
* [[Recommended Reading]] links to other sites for further information.<br />
<br />
The site is still very new. Please be patient with the missing bits or help us fill them in.</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Google&diff=387Google2009-02-18T17:23:37Z<p>Tom Morris: /* Gmail */ added offlineimap</p>
<hr />
<div>Google probably isn't Evil per se, but they do want you to put all of your data on their servers. Trusting any one company that much is probably a bad idea. If your entire life is on Google, what happens to Google happens to you.<br />
<br />
== Backup Tools ==<br />
<br />
=== Blogger ===<br />
<br />
* [http://google-opensource.blogspot.com/2009/01/google-blog-converters-10-released.html Google Blog Converters 1.0] uses Python to convert between Blogger, LiveJournal, MovableType, and WordPress. <br />
<br />
* Blogger can now export the entire contents of a blog, over at [http://draft.blogger.com/ Blogger in Draft].<br />
<br />
=== Gmail ===<br />
<br />
* [http://www.gmail-backup.com/ Gmail Backup]<br />
* Gmail provides IMAP access, so you can use [http://software.complete.org/software/projects/show/offlineimap OfflineIMAP] to backup and sync your complete archive in standard UNIX maildir format, usable by Mutt, Thunderbird and most sane e-mail clients. See [http://soren.overgaard.org/2007/12/15/backing-up-gmail-using-offlineimap/ this blog post] for more details.<br />
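The IMAP route can also be sketched with Python's standard library alone. This is not how OfflineIMAP works internally, only an illustration of the same idea: read every message from a folder over IMAP and store it in a Maildir. The function names are hypothetical, and the login details (host, folder name, and whatever password or app-specific credential the account requires) are assumptions to adapt.

```python
import imaplib
import mailbox


def fetch_all(host, user, password, folder="INBOX"):
    """Yield every message in one IMAP folder as raw bytes (read-only)."""
    connection = imaplib.IMAP4_SSL(host)
    try:
        connection.login(user, password)
        connection.select(folder, readonly=True)   # never modifies the mailbox
        _status, data = connection.search(None, "ALL")
        for number in data[0].split():
            _status, parts = connection.fetch(number.decode(), "(RFC822)")
            yield parts[0][1]                      # the raw message body
    finally:
        connection.logout()


def save_to_maildir(raw_messages, maildir_path):
    """Store raw messages in a Maildir, creating it if needed; return the count."""
    box = mailbox.Maildir(maildir_path, create=True)
    count = 0
    for raw in raw_messages:
        box.add(raw)
        count += 1
    box.close()
    return count


# Hypothetical usage (needs real credentials and network access):
# save_to_maildir(fetch_all("imap.gmail.com", "you@gmail.com", "secret"), "GmailBackup")
```

Unlike OfflineIMAP, this sketch re-downloads everything on each run and does not track which messages it already has, so it suits a one-off backup rather than ongoing sync.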
<br />
=== Google Docs ===<br />
<br />
* [http://1st-soft.net/gdd/ GM Script by Peter Schafer] downloads Google Docs en masse.<br />
<br />
* [http://code.google.com/p/gdatacopier/ gdatacopier] "Bi-directional copy utility & API for Google docs"<br />
<br />
=== Google Calendar ===<br />
<br />
* [http://www.google.com/support/calendar/bin/answer.py?hl=en&answer=37111 Export your Google Calendar]<br />
<br />
=== Google Reader ===<br />
<br />
* [http://ze-ze.cn/2008/01/how-to-backup-articles-from-google-reader.html How to Back Up Articles from Google Reader]<br />
<br />
=== Other ===<br />
<br />
Does a tool suite exist that backs up all of the Google Apps cloud?<br />
<br />
== Vital Signs ==<br />
<br />
Pump up the NASDAQ.</div>Tom Morrishttps://wiki.archiveteam.org/index.php?title=Alive..._OR_ARE_THEY&diff=386Alive... OR ARE THEY2009-02-18T17:18:04Z<p>Tom Morris: </p>
<hr />
<div>Like many sites before them, these places indicate a sunny outlook, a clean bill of health and a total sense of "all systems go". But as we've found out from those many sites before them, fortunes can change overnight.<br />
<br />
Archive Team considers these sites specifically of interest because they solicit so much content, contain so many works and projects by a wide group of people, or have the internet particularly dependent on them. Consider this a fire drill: know what you can do to get your data off these sites and back it up for later.<br />
<br />
=== Sites ===<br />
<br />
* '''[[Facebook]]''' seems stable at the moment.<br />
* '''[[Friendfeed]]''' is a happy clam.<br />
* '''[[Google]]''' wants you to think they will be here forever.<br />
* '''[[Twitter]]''' is tweaking away.<br />
* '''[http://en.wikipedia.org Wikipedia]''' will surely be here forever and ever!<br />
* '''[[Delicious]]''' loves to change their API, which has a side effect of making it difficult to back up.<br />
* '''[[whitehouse.gov]]''' is up and running for #44, but we've lost all info for #43. (See also: [http://www.kottke.org/09/01/old-whitehousegov-down-the-memory-hole kottke] and [http://www.readwriteweb.com/archives/whitehousegov_president_web_presence.php Read Write Web].) We also want to watch out for site changes and for pages that disappear because they were embarrassing or otherwise inconvenient.<br />
* '''[http://www.infoanarchy.org Infoanarchy]''' The site is functioning again. Might be worth backing up, though. For months, a simple database error that could be fixed with one command KO'd this site unexpectedly with a wealth of P2P information lost. [http://eng.anarchopedia.org/infoAnarchy]</div>Tom Morris