Difference between revisions of "Bitbucket"

From Archiveteam
Jump to: navigation, search
(Separate boxes for the two projects + minor cleanup)
(Update and cleanup/slight restructuring, move actual repositories to separate Mercurial page)
 
(2 intermediate revisions by 2 users not shown)
Line 4: Line 4:
 
| logo = bitbucket-atlassian-logo.png
 
| logo = bitbucket-atlassian-logo.png
 
| image = bitbucket-screenshot.png
 
| image = bitbucket-screenshot.png
| project_status = {{specialcase}}
+
| project_status = {{online}}
| archiving_status = {{inprogress}}
+
| archiving_status = {{notsavedyet}}
 
| irc = kickthebucket
 
| irc = kickthebucket
 
| irc_network = hackint
 
| irc_network = hackint
Line 13: Line 13:
  
 
== Mercurial repositories ==
 
== Mercurial repositories ==
It announced on 20 August 2019 that it would be ending Mercurial support to focus exclusively on Git.<ref>{{URL|https://bitbucket.org/blog/sunsetting-mercurial-support-in-bitbucket}}</ref> Creating new Mercurial repositories was disabled on 1 February 2020, and all Mercurial repositories and API will be removed on 1 July 2020.<ref>The original sunset date was 1 June, but on 21 April this was pushed back due to [[Coronavirus]].</ref>
+
It announced on 20 August 2019 that it would be ending Mercurial support to focus exclusively on Git.<ref>{{URL|https://bitbucket.org/blog/sunsetting-mercurial-support-in-bitbucket}}</ref> Creating new Mercurial repositories was disabled on 1 February 2020, and all Mercurial repositories and API were to be removed on 1 July 2020.<ref>The original sunset date was 1 June, but on 21 April this was pushed back due to [[Coronavirus]].</ref> The actual removal happened in mid-August 2020.
  
== Archival ==
+
=== Archival ===
 
{{Infobox project
 
{{Infobox project
 
| title = Bitbucket Mercurial web content
 
| title = Bitbucket Mercurial web content
| project_status = {{closing}}
+
| project_status = {{offline}}
| archiving_status = {{inprogress}}
+
| archiving_status = {{saved}}
 
| source = [https://github.com/ArchiveTeam/bitbucket-grab bitbucket-grab]
 
| source = [https://github.com/ArchiveTeam/bitbucket-grab bitbucket-grab]
 
| tracker = [https://tracker.archiveteam.org/bitbucket/ bitbucket]
 
| tracker = [https://tracker.archiveteam.org/bitbucket/ bitbucket]
Line 25: Line 25:
 
| irc_network = hackint
 
| irc_network = hackint
 
}}
 
}}
{{Infobox project
 
| title = Bitbucket Mercurial repositories
 
| project_status = {{closing}}
 
| archiving_status = {{inprogress}}
 
| source = [https://github.com/ArchiveTeam/mercurial-grab mercurial-grab]
 
| tracker = [https://tracker.archiveteam.org/mercurial/ mercurial]
 
| irc = kickthebucket
 
| irc_network = hackint
 
}}
 
 
There is Mercurial tooling to deal with the repositories themselves, but these do not include issues, wikis, and other metadata. (What is the status of PRs?) This will have to be separately scraped from Bitbucket's website and/or API.
 
  
We have an enumeration of existing Mercurial repositories (scraped from Bitbucket's search API after the February lockdown). These repositories will be writable up to the June deletion. The discovery has an <code>updated-on</code> field, but it's not clear whether this is the repository or its metadata.
+
Our archival was based on an enumeration of Mercurial repositories from Bitbucket's search API after the February lockdown. Repositories were still writable until they were made read-only in early July. Although the API returns an <code>updated-on</code> field, it is not clear whether this is the repository or its metadata.
  
Our archival is split into two projects: [https://tracker.archiveteam.org/mercurial/ mercurial] retrieves the actual hg repositories, and [https://tracker.archiveteam.org/bitbucket/ bitbucket] covers the web interface (issues, pull requests, wikis, etc.).
+
The project was split into two parts: the actual hg repositories were retrieved through the [[Mercurial]] project (developed for this but reusable for hg repositories in general), and [https://tracker.archiveteam.org/bitbucket/ bitbucket] covered the web interface (issues, pull requests, wikis, etc.). Apart from around 200 odd repositories, we managed to archive everything successfully.
  
==== Statistics ====
+
=== Statistics ===
 
* Total repos online: 245,068
 
* Total repos online: 245,068
 
* Total reported size (fairly accurate): 5.23 TiB (does this include hg compression?)
 
* Total reported size (fairly accurate): 5.23 TiB (does this include hg compression?)
Line 48: Line 37:
 
* Maximum reported size: 14.4 GiB
 
* Maximum reported size: 14.4 GiB
  
==== Repo cloning ====
+
=== Existing discussion and tooling ===
From the <code>hg</code> docs:
 
 
 
<blockquote>In normal clone mode, the remote normalizes repository data into a common exchange format and the receiving end translates this data into its local storage format. --stream activates a different clone mode that essentially copies repository files from the remote with minimal data processing. This significantly reduces the CPU cost of a clone both remotely and locally. However, it often increases the transferred data size by 30-40%. This can result in substantially faster clones where I/O throughput is plentiful, especially for larger repositories. A side-effect of --stream clones is that storage settings and requirements on the remote are applied locally: a modern client may inherit legacy or inefficient storage used by the remote or a legacy Mercurial client may not be able to clone from a modern Mercurial remote.</blockquote>
 
 
 
<code>hg clone</code> produces a directory with a working copy, plus the .hg directory containing version control data. However, this is internally sent as a bundle, which if captured can be unbundled normally.
 
 
 
hg network protocols:
 
* [https://www.mercurial-scm.org/repo/hg-stable/file/tip/mercurial/streamclone.py streamclone.py]
 
* [https://www.mercurial-scm.org/repo/hg-stable/file/tip/mercurial/statichttprepo.py statichttprepo.py]
 
* [https://www.mercurial-scm.org/repo/hg-stable/file/tip/mercurial/helptext/internals/wireprotocol.txt wireprotocol.txt]
 
Requests over HTTP can be WARCed.
 
 
 
==== Existing discussion and tooling ====
 
 
* [https://community.atlassian.com/t5/Bitbucket-articles/What-to-do-with-your-Mercurial-repos-when-Bitbucket-sunsets/ba-p/1155380 Forum thread]
 
* [https://community.atlassian.com/t5/Bitbucket-articles/What-to-do-with-your-Mercurial-repos-when-Bitbucket-sunsets/ba-p/1155380 Forum thread]
 
** [https://community.atlassian.com/t5/Bitbucket-articles/What-to-do-with-your-Mercurial-repos-when-Bitbucket-sunsets/ba-p/1155380/page/7#M321 Some user questions]
 
** [https://community.atlassian.com/t5/Bitbucket-articles/What-to-do-with-your-Mercurial-repos-when-Bitbucket-sunsets/ba-p/1155380/page/7#M321 Some user questions]
Line 73: Line 49:
 
== References ==
 
== References ==
 
<references/>
 
<references/>
 +
 +
{{Navigation box}}

Latest revision as of 16:23, 3 September 2020

Bitbucket
Bitbucket logo
Bitbucket-screenshot.png
URL https://bitbucket.org/
Project status Online!
Archiving status Not saved yet
Project source Unknown
Project tracker Unknown
IRC channel #kickthebucket (on hackint)
Project lead Unknown

Bitbucket is a version control repository hosting service, marketed mostly towards proprietary and enterprise software but with a substantial FLOSS presence.

Mercurial repositories

It announced on 20 August 2019 that it would be ending Mercurial support to focus exclusively on Git.[1] Creating new Mercurial repositories was disabled on 1 February 2020, and all Mercurial repositories and API were to be removed on 1 July 2020.[2] The actual removal happened in mid-August 2020.

Archival

Bitbucket Mercurial web content
Bitbucket logo
Employee captured tearing page.png
URL None
Project status Offline
Archiving status Saved!
Project source bitbucket-grab
Project tracker bitbucket
IRC channel #kickthebucket (on hackint)
Project lead Unknown

Our archival was based on an enumeration of Mercurial repositories from Bitbucket's search API after the February lockdown. Repositories were still writable until they were made read-only in early July. Although the API returns an updated-on field, it is not clear whether this is the repository or its metadata.

The project was split into two parts: the actual hg repositories were retrieved through the Mercurial project (developed for this but reusable for hg repositories in general), and bitbucket covered the web interface (issues, pull requests, wikis, etc.). Apart from around 200 odd repositories, we managed to archive everything successfully.

Statistics

  • Total repos online: 245,068
  • Total reported size (fairly accurate): 5.23 TiB (does this include hg compression?)
  • Mean reported size: 22.4 MiB
  • Median reported size: 205 KiB
  • Maximum reported size: 14.4 GiB

Existing discussion and tooling

Site structure

Some API requires auth, some does not. Rate limits are documented here.

References

  1. https://bitbucket.org/blog/sunsetting-mercurial-support-in-bitbucket[IAWcite.todayMemWeb]
  2. The original sunset date was 1 June, but on 21 April this was pushed back due to Coronavirus.


v · t · e         Archive Team
Current events

Alive... OR ARE THEY · Deathwatch · Projects

Archiveteam.jpg
Archiving projects

APKMirror · Archive.is · BetaArchive · Government Backup (#datarefuge · ftp-gov· Gmane · Internet Archive · It Died · Megalodon.jp · OldApps.com · OldVersion.com · OSBetaArchive · TEXTFILES.COM · The Dead, the Dying & The Damned · The Mail Archive · UK Web Archive · WebCite · Vaporwave.me

Blogging

Blog.pl · Blogger · Blogster · Blogter.hu · Freeblog.hu · Fuelmyblog · Jux · LiveJournal · My Opera · Nolblog.hu · Open Diary · ownlog.com · Posterous · Powerblogs · Proust · Roon · Splinder · Tumblr · Vox · Weblog.nl · Windows Live Spaces · Wordpress.com · Xanga · Yahoo! Blog · Zapd

Cloud hosting/file sharing

aDrive · AnyHub · Box · Dropbox · Docstoc · Google Drive · Google Groups Files · iCloud · Fileplanet · LayerVault · MediaCrush · MediaFire · Mega · MegaUpload · MobileMe · OneDrive · Pomf.se · RapidShare · Ubuntu One · Yahoo! Briefcase

Corporations

Apple · IBM · Google · Loblaw · Lycos Europe · Microsoft · Yahoo!

Events

Arab Spring · Great Ape-Snake War · Spanish Revolution

Font Repos

DaFont · Google Web Fonts · GNU FreeFont · Fontspace

Forums/Message boards

4chan · Captain Luffy Forums · College Confidential · DSLReports · ESPN Forums · forums.starwars.com · HeavenGames · Invisionfree · NeoGAF · The Classic Horror Film Board · Yahoo! Messages · Yahoo! Neighbors · Yuku.com

Gaming

Atomicgamer · Bazaar.tf · City of Heroes · Club Nintendo · Counter-Strike: Global Offensive · CS:GO Lounge · Desura · Dota 2 · Dota 2 Lounge · Emulation Zone · ESEA · GameBanana · GameMaker Sandbox · GameTrailers · Halo · HLTV.org · HQ Trivia · Infinite Crisis · joinDOTA · League of Legends · Liquipedia · Minecraft.net · Player.me · Playfire · Raptr · Steam · SteamDB · SteamGridDB · Team Fortress 2 · TF2 Outpost · Warhammer · Xfire

Image hosting

500px · AOL Pictures · Blipfoto · Blingee · Canv.as · Camera+ · Cameroid · DailyBooth · Degree Confluence Project · deviantART · Demotivalo.net · Flickr · Fotoalbum.hu · Fotolog.com · Fotopedia · Frontback · Geograph Britain and Ireland · Giphy · GTF Képhost · ImageShack · Imgh.us · Imgur · Inkblazers · Instagram · Kepfeltoltes.hu · Kephost.com · Kephost.hu · Kepkezelo.com · Keptarad.hu · Madden GIFERATOR · MLKSHK · Microsoft Clip Art · Microsoft Photosynth · Nokia Memories · noob.hu · Odysee · Panoramio · Photobucket · Picasa · Picplz · Pixiv · Portalgraphics.net · PSharing · Ptch · puu.sh · Rawporter · Relay.im · ScreenshotsDatabase.com · Snapjoy · Streetfiles · Tabblo · Tinypic · Trovebox · TwitPic · Wallbase · Wallhaven · Webshots · Wikimedia Commons

Knowledge/Wikis

arXiv · Citizendium · Clipboard.com · Deletionpedia · EditThis · Encyclopedia Dramatica · Etherpad · Everything2 · infoAnarchy · GeoNames · GNUPedia · Google Books (Google Books Ngram· Horror Movie Database · Insurgency Wiki · Knol · Lost Media Wiki · Neoseeker.com · Notepad.cc · Nupedia · OpenCourseWare · OpenStreetMap · Orain · Pastebin · Patch.com · Project Gutenberg · Puella Magi · Referata · Resedagboken · SongMeanings · ShoutWiki · The Internet Movie Database · TropicalWikis · Uncyclopedia · Urban Dictionary · Urban Exploration Resource · Webmonkey · Wikia · Wikidot · WikiHow · Wikkii · WikiLeaks · Wikipedia (Simple English Wikipedia· Wikispaces · Wikispot · Wik.is · Wiki-Site · WikiTravel · Word Count Journal

Magazines/Blogs/News

Cyberpunkreview.com · Game Developer Magazine · Gigaom · Hardware Canucks · Helium · JPG Magazine · Make Magazine · Polygamia.pl · San Fransisco Bay Guardian · Scoop · Regretsy · Yahoo! Voices

Microblogging

Heello · Identi.ca · Jaiku · Mommo.hu · Plurk · Sina Weibo · Twitter · TwitLonger

Music/Audio

AOL Music · Audimated.com · Cinch · digCCmixter · Dogmazic.net · Earbits · exfm · Free Music Archive · Gogoyoko · Indaba Music · Instacast · Jamendo · Last.fm · Music Unlimited · MOG · PureVolume · Reverbnation · ShareTheMusic · SoundCloud · Soundpedia · This Is My Jam · TuneWiki · Twaud.io · WinAmp

People

Aaron Swartz · Michael S. Hart · Steve Jobs · Mark Pilgrim · Dennis Ritchie · Len Sassaman Project

Protocols/Infrastructure

FTP · Gopher · IRC · Usenet · World Wide Web
BitTorrent DHT

Q&A

Askville · Answerbag · Answers.com · Ask.com · Askalo · Baidu Knows · Blurtit · ChaCha · Experts Exchange · Formspring · GirlsAskGuys · Google Answers · Google Baraza · JustAnswer · MetaFilter · Quora · Retrospring · StackExchange · The AnswerBank · The Internet Oracle · Uclue · WikiAnswers · Yahoo! Answers

Recipes/Food

Allrecipes · Epicurious · Food.com · Foodily · Food Network · Punchfork · ZipList

Social bookmarking

Addinto · Backflip · Balatarin · BibSonomy · Bkmrx · Blinklist · BlogMarks · BookmarkSync · CiteULike · Connotea · Delicious · Designer News · Digg · Diigo · Dir.eccion.es · Evernote · Excite Bookmark · Faves · Favilous · folkd · Freelish · Getboo · GiveALink.org · Gnolia · Google Bookmarks · Hacker News · HeyStaks · IndianPad · Kippt · Knowledge Plaza · Licorize · Linkwad · Menéame · Microsoft Developer Network · myVIP · Mister Wong · My Web · Mylink Vault · Newsvine · Oneview · Pearltrees · Pinboard · Pocket · Propeller.com · Reddit · sabros.us · Scloog · Scuttle · Simpy · SiteBar · Slashdot · Squidoo · StumbleUpon · Twine · Vizited · Yummymarks · Xmarks · Yahoo! Buzz · Zootool · Zotero

Social networks

Bebo · BlackPlanet · Classmates.com · Cyworld · Dogster · Dopplr · douban · Ello · Facebook · Flixster · FriendFeed · Friendster · Friends Reunited · Gaia Online · Google+ · Habbo · hi5 · Hyves · iWiW · LinkedIn · Miiverse · mixi · MyHeritage · MyLife · Myspace · myVIP · Netlog · Odnoklassniki · Orkut · Plaxo · Qzone · Renren · Skyrock · Sonico.com · Storylane · Tagged · tvtag · Upcoming · Viadeo · Vine · Vkontakte · WeeWorld · Weibo · Wretch · Yahoo! Groups · Yahoo! Stars India · Yahoo! Upcoming · more sites...

Shopping/Retail

Alibaba · AliExpress · Amazon · Apple Store · Barnes & Noble · DirectCanada · eBay · Kmart · NCIX · Printfection · RadioShack · Sears · Sears Canada · Target · The Book Depository · ThinkGeek · Toys "R" Us · Walmart

Software/code hosting

Android Development · Alioth · Assembla · BerliOS · Betavine · Bitbucket · BountySource · Codecademy · CodePlex · Freepository · Free Software Foundation · GNU Savannah · GitHost  · GitHub · GitHub Downloads · Gitorious · Gna! · Google Code · ibiblio · java.net · JavaForge · KnowledgeForge · Launchpad · LuaForge · Maemo · mozdev · OSOR.eu · OW2 Consortium · Openmoko · OpenSolaris · Ourproject.org · Ovi Store · Project Kenai · RubyForge · SEUL.org · SourceForge · Stypi · TestFlight · tigris.org · Transifex · TuxFamily · Yahoo! Downloads

Television/Radio

ABC · Austin City Limits · BBC · CBC · CBS · Computer Chronicles · CTV · Fox · G4 · Global TV · Jeopardy! · NBC · NHK · PBS · Penn & Teller: Bullshit! · The Howard Stern Show · TV News Archive (Understanding 9/11)

Torrenting/Piracy

ExtraTorrent · EZTV · isoHunt · KickassTorrents · The Pirate Bay · Torrentz · Library Genesis

Video hosting

Academic Earth · Bambuser · Blip.tv · Epic · Google Video · Justin.tv · Niconico · Nokia Trailers · Oddshot.tv · Plays.tv · Qwiki · Skillfeed · Stickam · TED Talks · Ticker.tv · Twitch.tv · Ustream · Videoplayer.hu · Viddler · Viddy · Vidme · Vimeo · Vine · Vstreamers · Yahoo! Video · YouTube · Famous Internet videos (Me at the zoo)

Web hosting

Angelfire · Brace.io · BT Internet · CableAmerica Personal Web Space · Claranet Netherlands Personal Web Pages · Comcast Personal Web Pages · Extra.hu · FortuneCity · Free ProHosting · GeoCities (patch· Google Business Sitebuilder · Google Sites · Internet Centrum · MBinternet · MSN TV · Nifty · Nwnyet · Parodius Networking · Prodigy.net · Saunalahti Iso G · Swipnet · Telenor · Tripod · University of Michigan personal webpages · Verizon Mysite · Verizon Personal Web Space · Webzdarma · Virgin Media

Web applications

Mailman · MediaWiki · phpBB · Simple Machines Forum · vBulletin

Information

A Million Ways to Die on the Web · Backup Tips · Cheap storage · Collecting items randomly · Data compression algorithms and tools · Dev · Discovery Data · DOS Floppies · Fortress of Solitude · Keywords · Naughty List · Nightmare Projects · Rescuing floppy disks · Rescuing optical media · Site exploration · The WARC Ecosystem · Working with ARCHIVE.ORG

Projects

ArchiveCorps · Audit2014 · Emularity · Faceoff · FlickrFckr · Froogle · INTERNETARCHIVE.BAK (Internet Archive Census· IRC Quotes · JSMESS · JSVLC · Just Solve the Problem · NewsGrabber · Project Newsletter · Valhalla · Web Roasting (ISP Hosting · University Web Hosting· Woohoo

Tools

ArchiveBot · ArchiveTeam Warrior (Tracker· Google Takeout · HTTrack · Video downloaders · Wget (Lua · WARC)

Teams

Bibliotheca Anonoma · LibreTeam · URLTeam · Yahoo Video Warroom · WikiTeam

Other

800notes · AOL · Akoha · Ancestry.com · April Fools' Day · Amplicate · AutoAdmit · Bre.ad · Circavie · Cobook · Co.mments · Countdown · Discourse · Distill · Dmoz · Easel · Eircode · Electronic Frontier Foundation · FanFiction.Net · Feedly · Ficlets · Forrst · FunnyExam.com · FurAffinity · Google Helpouts · Google Moderator · Google Reader · ICQmail · IFTTT · Jajah · JuniorNet · Lulu Poetry · Mobile Phone Applications · Mochi Media · Mozilla Firefox · MyBlogLog · NBII · Neopets · Quantcast · Quizilla · Salon Table Talk · Shutdownify · Slidecast · Stack Overflow · SOPA blackout pages · starwars.yahoo.com · TechNet · Toshiba Support · USA-Gov · Volán · Widgetbox · Windows Technical Preview · Wunderlist · YTMND · Zoocasa

About Archive Team

Introduction · Philosophy · Who We Are · Our stance on robots.txt · Why Back Up? · Software · Formats · Storage Media · Recommended Reading · Films and documentaries about archiving · Talks · In The Media · FAQ