My Opera

From Archiveteam
Jump to: navigation, search
My Opera
My Opera logo
My Opera 2013-10-31 20-11-38.png
URL my.opera.com [IA] [WebCite] [archive.is], files.myopera.com [IA] [WebCite] [archive.is]
Project status Offline
Archiving status Saved!
Project source https://github.com/ArchiveTeam/myopera-grab
Project tracker http://tracker.archiveteam.org/myopera
IRC channel #fatlady

My Opera is a social media website for the Opera browser. Originally started as just a support forum, it later expanded to include blogging, image/file hosting, and email.

On October 31, 2013, Opera announced they would shut My Opera down on March 1, 2014. On February 19, 2014, this was changed to March 3, 2014.

Shutdown notice

... The explosion of these sites and the amount of resources we need to maintain our own service has changed our outlook on My Opera. We had a good run for many years, but we believe your content could have a better home elsewhere, so we have made the decision to shut down My Opera as of March 1, 2014. [1]

The shutdown affects blogs/comments, files, email, and their forums ('The most important existing threads will be moved...').

On February 24, 2014, this message was placed on each forum/topic:

The My Opera forums are being replaced by our new forums. Starting February 26th, the My Opera forums will be in read-only mode. On March 3rd, they will be removed along with the rest of My Opera.

Archives

There are two ways to browse the archives:

  • If you don't find anything, you can try extracting the files from the WARC files with some WARC tools.
    • This method requires power user skills. In essence, scan each CDX index file and then extract it from the appropriate WARC files. Ask us in IRC for help.

Links to WARC files:

Archiving and contributing

Phase 0: Initial crawl

  • Grab a seed list of users from the location pages. Done! 1,621,618 usernames (location pages + files.myopera.com initial grab + attempted forum grab + wallpaper grab)
  • Grab a list of links to all forum topics and all pages. (In progress by User:Mithrandir) Had some issues with this one (my fault), probably better to:
  • Crawl all forums topics and pages (Not in progress)
  • Grab user-uploaded Opera-themed wallpapers Done!

Phase 1: Username crawl

https://github.com/garyrh/myopera-username-grab

Fortunately, Opera was kind enough to provide us with a complete list of all non-banned users (thanks to Atluxity for setting this up!)

Full list in 7z archive

(Note: The original list had some encoding issues as well as some invalid usernames, but this was relatively easy to fix with the FTFY library. You can download the original, non-fixed list here and here.)

Phase 2: Content crawl

Please run your Warrior and select the My Opera project. Wait a while for it to install extra things and it should start downloading.

Alternatively, the scripts can be run manually and instructions are located here: https://github.com/ArchiveTeam/myopera-grab

Site structure notes

  • As of 2009, there were around 16 million users.
  • LOTS of old data, abandoned accounts, etc.
  • Forum topics go all the way back to Post #1 dated 7 September, 2001.
    • It looks like there are nearly 2 million topics.
  • There's a sub-domain that houses user uploaded data.
    • Initial Bing-crawled list here (5572 urls)
    • Tons of webpages, pdfs, images, archives, etc.
    • Some of this data is linked to in blog posts, so we should crawl blogs for this as well.
  • Pretty much any UTF-8 (?) character can be used in a username (e.g. §|-|€€PJU|)g³ or ္္္္္္္floppyeye)

Username discovery

  • This is a little tricky, as the only pages with a large list of usernames are the location pages.
    • Each country list is paginated, showing at most 72 users per page.
    • For large countries, going past page ~325 returns many 503 errors.
    • Assuming an average of 100 retrievable pages per country and 246 countries, that's 72*100*246 = 1,771,200 users.
    • Excludes users who haven't set their location, which seems to be a lot.
    • This should be good for a seed list.
  • Forum posts, blog comments, friend lists, and groups seem to be the best ways to get the most usernames. (Maybe throw in the recent visitors list too.)

References

  1. Important announcement about your My Opera account

External Links


[view]  [edit]                   Archive Team                  
Current events Alive... OR ARE THEY  · Deathwatch  · Projects
Archiveteam.jpg
Archiving projects APKMirror  · Archive.is  · BetaArchive  · Government Backup  · Gmane  · Internet Archive  · It Died  · Megalodon.jp  · OldApps.com  · OldVersion.com  · OSBetaArchive  · TEXTFILES.COM  · The Dead, the Dying & The Damned  · The Mail Archive  · UK Web Archive  · WebCite  · Vaporwave.me
Blogging Blog.pl  · Blogger  · Blogster  · Blogter.hu  · Freeblog.hu  · Fuelmyblog  · Jux  · LiveJournal  · My Opera  · Nolblog.hu  · Open Diary  · ownlog.com  · Posterous  · Powerblogs  · Proust  · Roon  · Splinder  · Tumblr  · Vox  · Weblog.nl  · Windows Live Spaces  · Wordpress.com  · Xanga  · Yahoo! Blog  · Zapd
Cloud hosting/file sharing aDrive  · AnyHub  · Box  · Dropbox  · Docstoc  · Google Drive  · Google Groups Files  · iCloud  · Fileplanet  · LayerVault  · MediaCrush  · MediaFire  · Mega  · MegaUpload  · MobileMe  · OneDrive  · Pomf.se  · RapidShare  · Ubuntu One  · Yahoo! Briefcase
Corporations Apple  · IBM  · Google  · Lycos Europe  · Microsoft  · Yahoo!
Events Arab Spring  · Great Ape-Snake War  · Spanish Revolution
Font Repos Google Web Fonts  · GNU FreeFont  · Fontspace
Forums/Message boards 4chan  · Captain Luffy Forums  · College Confidential  · DSLReports  · ESPN Forums  · forums.starwars.com  · HeavenGames  · Invisionfree  · The Classic Horror Film Board  · Yahoo! Messages  · Yahoo! Neighbors  · Yuku.com
Gaming Atomicgamer  · City of Heroes  · Club Nintendo  · CS:GO Lounge  · Desura  · Dota 2 Lounge  · Emulation Zone  · GameMaker Sandbox  · GameTrailers  · Halo  · HLTV.org  · Infinite Crisis  · Minecraft.net  · Player.me  · Playfire  · Steam  · SteamDB  · TF2 Outpost  · Warhammer  · Xfire
Image hosting 500px  · AOL Pictures  · Blipfoto  · Blingee  · Canv.as  · Camera+  · Cameroid  · DailyBooth  · Degree Confluence Project  · deviantART  · Demotivalo.net  · Flickr  · Fotoalbum.hu  · Fotolog.com  · Fotopedia  · Frontback  · Geograph Britain and Ireland  · GTF Képhost  · ImageShack  · Imgur  · Inkblazers  · Instagr.am  · Kepfeltoltes.hu  · Kephost.com  · Kephost.hu  · Kepkezelo.com  · Keptarad.hu  · Madden GIFERATOR  · MLKSHK  · Microsoft Clip Art  · Microsoft Photosynth  · Nokia Memories  · noob.hu  · Odysee  · Panoramio  · Photobucket  · Picasa  · Picplz  · Pixiv  · PSharing  · Ptch  · puu.sh  · Rawporter  · Relay.im  · ScreenshotsDatabase.com  · Snapjoy  · Streetfiles  · Tabblo  · Trovebox  · TwitPic  · Wallbase  · Wallhaven  · Webshots  · Wikimedia Commons
Knowledge/Wikis arXiv  · Citizendium  · Clipboard.com  · Deletionpedia  · EditThis  · Encyclopedia Dramatica  · Etherpad  · Everything2  · infoAnarchy  · GeoNames  · GNUPedia  · Google Books (Google Books Ngram)  · Horror Movie Database  · Insurgency Wiki  · Knol  · Library Genesis  · Lost Media Wiki  · Neoseeker.com  · Notepad.cc  · Nupedia  · OpenCourseWare  · OpenStreetMap  · Orain  · Pastebin  · Patch.com  · Project Gutenberg  · Puella Magi  · Referata  · Resedagboken  · SongMeanings  · ShoutWiki  · The Internet Movie Database  · TropicalWikis  · Uncyclopedia  · Urban Dictionary  · Webmonkey  · Wikia  · Wikidot  · WikiHow  · Wikkii  · WikiLeaks  · Wikipedia (Simple English Wikipedia)  · Wikispaces  · Wikispot  · Wik.is  · Wiki-Site  · WikiTravel  · Word Count Journal
Magazines/Blogs/News Cyberpunkreview.com  · Game Developer Magazine  · Gigaom  · Helium  · JPG Magazine  · Polygamia.pl  · San Fransisco Bay Guardian  · Scoop  · Regretsy  · Yahoo! Voices
Microblogging Heello  · Identi.ca  · Jaiku  · Mommo.hu  · Plurk  · Sina Weibo  · Twitter  · TwitLonger
Music/Audio AOL Music  · Audimated.com  · Cinch  · digCCmixter  · Dogmazic.net  · Earbits  · exfm  · Free Music Archive  · Gogoyoko  · Indaba Music  · Instacast  · Jamendo  · Last.fm  · Music Unlimited  · MOG  · PureVolume  · Reverbnation  · ShareTheMusic  · SoundCloud  · Soundpedia  · This Is My Jam  · TuneWiki  · Twaud.io  · WinAmp
People Aaron Swartz  · Michael S. Hart  · Steve Jobs  · Mark Pilgrim  · Dennis Ritchie  · Len Sassaman Project
Protocols/Infrastructure FTP  · Gopher  · IRC  · Usenet  · World Wide Web
Q&A Askville  · Answerbag  · Answers.com  · Ask.com  · Askalo  · Baidu Knows  · Blurtit  · ChaCha  · Experts Exchange  · Formspring  · GirlsAskGuys  · Google Answers  · Google Baraza  · JustAnswer  · MetaFilter  · Quora  · Retrospring  · StackExchange  · The AnswerBank  · The Internet Oracle  · Uclue  · WikiAnswers  · Yahoo! Answers
Recipes/Food Allrecipes  · Epicurious  · Food.com  · Foodily  · Food Network  · Punchfork  · ZipList
Social bookmarking Addinto  · Backflip  · Balatarin  · BibSonomy  · Bkmrx  · Blinklist  · BlogMarks  · BookmarkSync  · CiteULike  · Connotea  · Delicious  · Designer News  · Digg  · Diigo  · Dir.eccion.es  · Evernote  · Excite Bookmark  · Faves  · Favilous  · folkd  · Freelish  · Getboo  · GiveALink.org  · Gnolia  · Google Bookmarks  · Hacker News  · HeyStaks  · IndianPad  · Kippt  · Knowledge Plaza  · Licorize  · Linkwad  · Menéame  · Microsoft Developer Network  · myVIP  · Mister Wong  · My Web  · Mylink Vault  · Newsvine  · Oneview  · Pearltrees  · Pinboard  · Pocket  · Propeller.com  · Reddit  · sabros.us  · Scloog  · Scuttle  · Simpy  · SiteBar  · Slashdot  · Squidoo  · StumbleUpon  · Twine  · Vizited  · Yummymarks  · Xmarks  · Yahoo! Buzz  · Zootool  · Zotero
Social networks Bebo  · BlackPlanet  · Classmates.com  · Cyworld  · Dogster  · Dopplr  · douban  · Ello  · Facebook  · Flixster  · FriendFeed  · Friendster  · Friends Reunited  · Gaia Online  · Google+  · Habbo  · hi5  · Hyves  · iWiW  · LinkedIn  · Miiverse  · mixi  · MyHeritage  · MyLife  · Myspace  · myVIP  · Netlog  · Odnoklassniki  · Orkut  · Plaxo  · Qzone  · Renren  · Skyrock  · Sonico.com  · Storylane  · Tagged  · tvtag  · Upcoming  · Viadeo  · Vine  · Vkontakte  · WeeWorld  · Weibo  · Wretch  · Yahoo! Groups  · Yahoo! Stars India  · Yahoo! Upcoming  · more sites...
Shopping/Retail Alibaba  · AliExpress  · Amazon  · Apple Store  · eBay  · Printfection  · RadioShack  · Sears  · Target  · The Book Depository  · ThinkGeek  · Walmart
Software/code hosting Android Development  · Alioth  · Assembla  · BerliOS  · Betavine  · Bitbucket  · BountySource  · Codecademy  · CodePlex  · Freepository  · Free Software Foundation  · GNU Savannah  · GitHost  · GitHub  · GitHub Downloads  · Gitorious  · Gna!  · Google Code  · ibiblio  · java.net  · JavaForge  · KnowledgeForge  · Launchpad  · LuaForge  · Maemo  · mozdev  · OSOR.eu  · OW2 Consortium  · Openmoko  · OpenSolaris  · Ourproject.org  · Ovi Store  · Project Kenai  · RubyForge  · SEUL.org  · SourceForge  · Stypi  · TestFlight  · tigris.org  · Transifex  · TuxFamily  · Yahoo! Downloads
Torrenting/Piracy ExtraTorrent  · EZTV  · isoHunt  · KickassTorrents  · The Pirate Bay  · Torrentz
Video hosting Academic Earth  · Blip.tv  · Epic  · Google Video  · Justin.tv  · Niconico  · Nokia Trailers  · Qwiki  · Skillfeed  · Stickam  · TED Talks  · Ticker.tv  · Twitch.tv  · Ustream  · Videoplayer.hu  · Viddler  · Viddy  · Vimeo  · Vine  · Vstreamers  · Yahoo! Video  · YouTube  · Famous Internet videos (Me at the zoo)
Web hosting Angelfire  · Brace.io  · BT Internet  · CableAmerica Personal Web Space  · Claranet Netherlands Personal Web Pages  · Comcast Personal Web Pages  · Extra.hu  · FortuneCity  · Free ProHosting  · GeoCities (patch)  · Google Business Sitebuilder  · Google Sites  · Internet Centrum  · MBinternet  · MSN TV  · Nwnyet  · Parodius Networking  · Prodigy.net  · Saunalahti Iso G  · Swipnet  · Telenor  · Tripod  · University of Michigan personal webpages  · Verizon Mysite  · Verizon Personal Web Space  · Webzdarma  · Virgin Media
Web applications Mailman  · MediaWiki  · phpBB  · Simple Machines Forum  · vBulletin
Other 800notes  · AOL  · Akoha  · Ancestry.com  · April Fools' Day  · Amplicate  · AutoAdmit  · Bre.ad  · Circavie  · Cobook  · Co.mments  · Countdown  · Distill  · Dmoz  · Easel  · Eircode  · Electronic Frontier Foundation  · FanFiction.Net  · Feedly  · Ficlets  · Forrst  · FunnyExam.com  · FurAffinity  · Google Helpouts  · Google Moderator  · Google Reader  · ICQmail  · IFTTT  · Jajah  · JuniorNet  · Lulu Poetry  · Mobile Phone Applications  · Mochi Media  · Mozilla Firefox  · MyBlogLog  · NBII  · Neopets  · Quantcast  · Quizilla  · Salon Table Talk  · Shutdownify  · Slidecast  · SOPA blackout pages  · starwars.yahoo.com  · TechNet  · Toshiba Support  · USA-Gov  · Volán  · Widgetbox  · Windows Technical Preview  · Wunderlist  · Zoocasa
Information A Million Ways to Die on the Web  · Backup Tips  · Cheap storage  · Collecting items randomly  · Data compression algorithms and tools  · Dev  · Discovery Data  · DOS Floppies  · Fortress of Solitude  · Keywords  · Naughty List  · Nightmare Projects  · Rescuing floppy disks  · Rescuing optical media  · Site exploration  · The WARC Ecosystem  · Working with ARCHIVE.ORG
Projects ArchiveCorps  · Audit2014  · Emularity  · Faceoff  · FlickrFckr  · Froogle  · ftp-gov  · INTERNETARCHIVE.BAK (Internet Archive Census)  · IRC Quotes  · JSMESS  · JSVLC  · Just Solve the Problem  · NewsGrabber  · Project Newsletter  · Valhalla  · Web Roasting (ISP Hosting  · University Web Hosting)  · Woohoo
Tools ArchiveBot  · ArchiveTeam Warrior (Tracker)  · Google Takeout  · HTTrack  · Video downloaders  · Wget (Lua  · WARC)
Teams Bibliotheca Anonoma  · LibreTeam  · URLTeam  · Yahoo Video Warroom  · WikiTeam
About Archive Team Introduction  · Philosophy  · Who We Are  · Our stance on robots.txt  · Why Back Up?  · Software  · Formats  · Storage Media  · Recommended Reading  · Films and documentaries about archiving  · Talks  · In The Media  · FAQ