Alive... OR ARE THEY

From Archiveteam

Like many sites before them, these places indicate a sunny outlook, a clean bill of health and a total sense of "all systems go". But as we've found out from those many sites before them, fortunes can change overnight.

Archive Team considers these sites specifically of interest because they solicit so much content, contain so many works and projects by a wide group of people, or have the internet particularly dependent on them. Consider this a fire drill: know what you can do to get your data off these sites and back it up for later.

Still Alive

Not so alive, rather living dead (owned by Yahoo!)

  • Flickr contains billions of files, hundreds of millions of which are under a Creative Commons license or stored there by many museums and other cultural institutions. The site was tumblr-ised in 2013 and has been poorly functional ever since; pro users were removed, so it doesn't yet have a business model. Additionally, it's owned by Yahoo!; need we say more?!

All the others

  • Academic Earth (http://academicearth.org/ [IA] [WebCite]) has been worryingly unloved for a while, and holds a mountain of free education that's invaluable to the world.
  • Encyclopedia Astronautica is the most comprehensive collection of the history of space travel. Period. Seriously, the official NASA history folks will refer you to this website if they can't answer your questions. However, Mark Wade (the sole creator/maintainer) abandoned his blog at the end of 2007, and the Encyclopedia has not been updated since May of 2008, despite much happening in the space exploration world since then.
  • Angelfire has been in constant decline for many years now.
  • AnimeMusicVideos.org (http://www.animemusicvideos.org/ [IA] [WebCite]) is fine right now, but they rely on donations and host vast amounts of user-edited music videos on their server (presumably without mirrors). Hard to download, as you have to be a member to get all the download links, and after downloading a handful you have to vote before you can d/l again (or you can donate, which presumably gives you 1 year of free d/l access). Also, this site might be a grey area, copyright-wise, as the videos are all cut together from copyrighted material.
  • Codecademy (http://www.codecademy.com/ [IA] [WebCite]) has a large amount of valuable coding lessons.
  • cyberpunkreview.com: 80s science fiction fansite and community http://cyberpunkreview.com/ [IA] [WebCite] hasn't seen much staff activity in a long time, although the forums are going strong. UPDATE: Looking active again. Aggroskater 08:26, 19 March 2012 (EDT)
  • Delicious (http://www.delicious.com/ [IA] [WebCite]) loves to change their API, which has a side effect of making it difficult to back up.
  • Facebook (http://www.facebook.com/ [IA] [WebCite]) seems stable at the moment.
  • FanFiction (http://www.fanfiction.net/ [IA] [WebCite]) represents many thousands of user-generated stories, essays and huge amounts of work.
  • Google (http://www.google.com/ [IA] [WebCite]) wants you to think they will be here forever.
  • IFTTT (http://ifttt.com/ [IA] [WebCite]) is still growing.
  • Internet Archive (http://www.archive.org/ [IA] [WebCite]) seems stable at the moment but its 16 petabytes of data aren't mirrored anywhere else, the code for their system isn't open source and generally they're a single point of failure for a large amount of the web's history. Why should there be only 1 internet archive?
  • JSFiddle (http://jsfiddle.net/ [IA] [WebCite]) is referenced in many StackOverflow answers, as well as other forums, etc. It shows no signs of going away, but should we archive it just in case?
  • Know Your Meme (http://knowyourmeme.com/ [IA] [WebCite]) is at this point the de facto central repository for information on internet memes and culture. It is as popular as ever at the moment, but even with this popularity, former owners Rocketboom had trouble financing it. In the spring of 2011 it was sold to Cheezburger Networks, a site which has been known to "reorganize" its properties, sometimes with a detrimental effect on content. Though it was quite a different story, I might remind people what happened to Encyclopedia Dramatica.
  • Last.fm (http://www.last.fm/ [IA] [WebCite]) is being cloned by free software developers in the form of Libre.fm -- they have a tool, Lastscrape, which can get all your listening data out into a tab-delimited text file.
  • Literotica.com (http://literotica.com/ [IA] [WebCite]) contains over 290,000 user-written stories and poems. First pass at a backup: part1.rar, part2.rar, part3.rar, part4.rar -- contains the text of all stories as of the backup date in XML format. (One page of one story is missing because it doesn't exist on the site; embedded images and audio are not included this time; non-English stories aren't labelled with their language.)
  • LiveJournal (http://www.livejournal.com/ [IA] [WebCite]) fired a bunch of US-based developers, but is still serving from its new (presumably cheaper) data center in Montana.
  • Pastebin (http://www.pastebin.com/ [IA] [WebCite]) is still getting filled with text.
  • Pixiv (http://www.pixiv.net/ [IA] [WebCite]) and deviantArt (http://www.deviantart.com/ [IA] [WebCite]) are the largest Japanese and American (respectively) fanart (and valuable art in general) collections on the internet.
  • Pouet (http://www.pouet.net/ [IA] [WebCite]) is an important site of the demoscene. It indexes and ranks demoscene productions ('prods') and also includes a free-for-all BBS-style forum.
  • Reddit (http://www.reddit.com/ [IA] [WebCite]) is where many of the users have now migrated. Stable for now, but the team is small.
  • SourceForge (http://www.sourceforge.net/ [IA] [WebCite]) is a critical repository of open source code, information, and webpages. It is mirrored and maintained, but there are sure to be parts that are neither.
  • The Pirate Bay (http://www.thepiratebay.org/ [IA] [WebCite]) is one of the largest and most popular torrent search engines. It's still having persistent legal problems. The tracker went down in November 2012, but the site still serves torrents and magnet links. If a torrent is lost, it becomes impossible to connect to other computers distributing the shared files. Considering that there are links to TPB all over this wiki, this site is pretty dang important. After they were raided in December 2014, a project known as The Open Bay was launched, which lets anybody host a mirror of TPB with automatic database updates, so even if TPB goes down again, temporarily or not, its database is still available.
  • Tumblr (http://tumblr.com [IA] [WebCite]) is a highly popular blogging platform which was bought by Yahoo! in May 2013.
  • TVTropes (http://www.tvtropes.org/pmwiki/pmwiki.php/Main/HomePage [IA] [WebCite]) is a popular wiki dedicated to finding recurring patterns in fiction, and discussing fiction in general. No word on whether there are backups. The administrators have a tendency to delete things indiscriminately, usually to save on disk space: article edit histories are frequently purged, and old forum threads have been known to get deleted mercilessly. A backup is now available.
  • Twitter (http://www.twitter.com/ [IA] [WebCite]) is tweaking away.
  • WebCite (http://www.webcitation.org/ [IA] [WebCite]) itself seems to be having trouble with funding, and is facing "possible discontinuation." As this site serves as a stable reference for fleeting Web references, it would be pretty disastrous if it went away.
  • whitehouse.gov (http://www.whitehouse.gov/ [IA] [WebCite]) is up and running for #44, but we've lost all info for #43 (see also: kottke and Read Write Web). However, #43 is available at http://georgewbush-whitehouse.archives.gov/ thanks to the Presidential Records Act. We also want to watch out for site changes and disappeared pages that were embarrassing or whatnot.
  • Wikia (http://www.wikia.com/ [IA] [WebCite]), the for-pay arm of Wikipedia (just kidding, it's a different company, but shares a lot of people) is a repository of directed, unsubject-to-wikipolitics wikis, many of them intense and completist. It'd be bad for them to go away.
  • WikiLeaks (http://wikileaks.org/ [IA] [WebCite]) contains several thousand leaked documents from sources such as the Iraq War and the cables famously known under the label 'Cablegate'. Given the content on the website, and the fact that PayPal and Amazon (very) quickly dropped them during Cablegate's opening days, it should be considered a potential target for any number of government committees for quick shutdown.
  • WikiLeaks (http://wikileaks.org/ [IA] [WebCite]) has an uncertain financial situation, and the site was inaccessible for some time in 2010.
  • Wikipedia (http://www.wikipedia.org/ [IA] [WebCite]) will surely be here forever and ever! Fortunately, we don't have to take their word for it, as they offer dumps of the data minus the photos. However, no one has verified that Wikipedia can actually be restored from these dumps. If disaster strikes, we could discover a serious problem.

So Worried

Did someone leave the oven on?

  • FriendFeed (http://friendfeed.com/ [IA] [WebCite]) has been purchased by Facebook, leaving FriendFeed users uncertain as to its future and mostly unsupported. The Twitter bridge, for instance, has not worked for years now.
  • Ning laid off 40% of its staff in 2010 and seems to be running out of money [1]. There are certainly some networks worth archiving among the 2 million networks[2] they host. Grouply[3] and Posterous[4] say they are going to offer migration tools.
  • As of 2014, ScraperWiki Classic is read-only. But don't worry! You can transfer your scrapers to Morph.io if you want to continue editing them.
  • Convozine hasn't been active lately. Their last reply to a support question was in 2012, their last update in the "News" section was December 2011, and their last blog post was in January 2013. (See [5] and [6].)
  • debates.oireachtas.ie: on September 18th, 2012, the Houses of Oireachtas website announced that it would no longer be updating its XML data for Irish parliamentary debates (1919-2012). Access to pre-existing data is still available, but is likely to disappear if the current trend continues. It would be useful to at least capture the XML data that is there, while it is still available.
  • ownlog.com, once one of the oldest and most popular blog platforms in Poland, seems to be dying slowly: no development or updates except the most critical maintenance.
  • The Grid (magazine in Toronto) printed its last issue on July 3rd, 2014 (see here); not sure how long the site will stay up. Saved by ArchiveBot.
  • Nakido (site) claims to be a "time capsule" that will "host your files for decades" - except it's a commercial enterprise selling premium accounts, and uses a proprietary P2P platform for delivery. What could possibly go wrong?
  • Groklaw will no longer be posting new articles, "due to government monitoring of the internet, particularly e-mail." Whether or not its archives will remain online is unclear, although it does seem rather unlikely it will 100% disappear. OTOH, better safe than sorry.
  • Strawpoll.me
  • The Centralstation Community has closed. The site is a UK-based social network for artists and creatives that provides hosting for content and portfolios. Users are being advised to back up their work, as the new version of the platform will rely on existing media hosting sites like Flickr, Vimeo, and Soundcloud.

Fire Alarm Sounds Like Whoop Whoop Whoop

I smell smoke.

  • Ovi Store's infrastructure is slowly rotting away.
  • Blip.tv will be removing accounts/videos on September 1st, 2014.


References

  1. http://gamemakerblog.com/2014/10/04/its-official-digital-store-will-replace-gamemaker-sandbox/
