Puu.sh

From Archiveteam
puu.sh
URL: http://puush.me/
Project status: Special case
Archiving status: Saved! ~17 TB of data
Project source: https://github.com/ArchiveTeam/puush-grab
Project tracker: http://chfoo-d1.mooo.com:8031/puush/
IRC channel: #pushharder

puu.sh is a file sharing service that was created in 2010.

Image expiry

Early on June 7th, 2013, the following email was sent out to users:

Hey guys,

We're making some important changes to puush and want to inform you of how it will affect our service.

When we first conceived puush in 2010, we wanted to create a straightforward way to help us quickly share what was on our screens. Soon after, we extended puush to allow us to throw small files around too. Since then, we’ve seen a massive uptake and tremendous support from our users. The problem is that a tremendous majority of puushes aren’t being accessed again after 24 hours - in fact, only 10% of puushes are accessed after a month.

puush to us is a quick way to share things. puush is not a data warehouse.

We do not wish to become a file locker, file storage or backup service. There are plenty of other solutions out there that do a much better job of this (e.g. Dropbox), so what we want to do is this:

  • Remove the 200mb storage limit for free users
  • Stop offering permanent storage, and files will expire after not being accessed for:
    • Free users: 1 month
    • Pro users: up to 6 months
  • Offer an optional Dropbox “sync” for pro users (i.e. automatically save a copy to dropbox)

How this will affect you after the 1st of August 2013:

  • You will no longer have an account storage limits. Feel free to puush as much as you want!
  • We are going to start expiring files. At this point, any files which haven't been recently viewed by anyone will be automatically deleted after 1 month, or up to 6 months for pro users.
  • If you wish to grab a copy of your files before this begins, you can download an archive from your My Account page (Account -> Settings -> Pools -> Export).

As an example, if you have puush'd images which are being used on a forum, as long as that thread is visited at least once a month (or up to 6 months as a pro user) your files will *always be accessible*.

This notice is also visible on the puu.sh site, where it was announced even earlier.

How to Help

If you are comfortable running scripts manually (i.e., outside the Warrior), go to the GitHub repo for information on how to run the scripts.

Where can I find a file?

If you know the item ID, go to the Wayback Machine and enter the URL as http://puu.sh/XXXXX without any filename extension. The Wayback Machine treats the URL as case-insensitive, so several items whose IDs differ only in letter case may be listed together and you may need to check which URL is the one you are looking for.

If the Puush is private, it is unlikely to have been archived, as we do not guess the access code (the bunch of characters after the item ID). You can, however, use wildcards as a way of browsing the Wayback Machine. Here's an example.
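
The lookup described above can be sketched in Python. This is a minimal illustration, not part of the project scripts: the function names are hypothetical, and the `https://web.archive.org/web/*/` prefix with a trailing `*` wildcard is assumed to be the Wayback Machine's capture-listing URL form.

```python
# Hypothetical helpers for building Wayback Machine lookup URLs.
# Assumption: "https://web.archive.org/web/*/<url>" lists captures of <url>,
# and a trailing "*" on <url> lists every captured URL with that prefix.
WAYBACK_PREFIX = "https://web.archive.org/web/*/"


def wayback_lookup_url(item_id: str) -> str:
    """URL listing captures of one item, with no filename extension."""
    return f"{WAYBACK_PREFIX}http://puu.sh/{item_id}"


def wayback_wildcard_url(item_id: str) -> str:
    """URL listing every captured URL that starts with the item's URL."""
    return f"{WAYBACK_PREFIX}http://puu.sh/{item_id}*"


if __name__ == "__main__":
    print(wayback_lookup_url("XXXXX"))
    print(wayback_wildcard_url("XXXXX"))
```

Replace `XXXXX` with the item ID you are looking for.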

Archives

Archives are uploaded to the Archive Team Puush collection. These are the original WARC files. Each is about 10 GB in size instead of the typical 50 GB because the project is staged on cloud hosting with limited disk space.

Tracker information

  • The tracker and rsync target were run by User:Chfoo.
  • On 2013-08-22, Redis was unable to save in the background because fork() failed.
  • On 2013-08-27, an attempt was made to clear out the tracker log. Redis crashed.
  • On 2014-01-01, an old, vulnerable auto-queue script attempted to load 36,000,000 items. Redis was killed by the OOM killer. (Offending Tweet).
  • On 2014-04-27, the IP addresses of the tracker and of regular clients were banned. Puush also switched the default pool to private and added a robots.txt file.
  • On 2014-05-28, the tracker was officially decommissioned.

Logs

Ranges

Date Loaded | Start (Base 10) | End (Base 10) | Alphabet | Notes
2013-08-06 | 0 (0) | 3UXX3 (51607749) | Legacy | At most 10 URLs per item
2013-08-27 | 10 (62) | 3UXX3 (51607749) | Legacy | At most 13 URLs per item (unlucky 13)
2013-09-08 | 3UXX4 (51607750) | 49999 (61285459) | Legacy | At most 13 URLs per item
2013-09-13 | 4999a (61285460) | 4mPOO (64547754) | Puush | At most 13 URLs per item
2013-09-15 | 4mPOP (64547755) | 4rrrr (65645689) | Puush | At most 13 URLs per item
2013-09-16 | 4rrrs (65645690) | 4sQ00 (65978416) | Puush | At most 13 URLs per item
 | 4sQ01 (65978417) | | Puush | At most 13 URLs per item. Auto-queues using a script that checks Twitter.
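
The shortcode and base-10 pairs in the table can be reproduced with a base-62 conversion. The sketch below is not taken from the project scripts: the character orderings are an assumption inferred from the table values, with "Legacy" taken as digits, then uppercase, then lowercase, and "Puush" as digits, then lowercase, then uppercase. The function names are hypothetical.

```python
import string

# Assumed orderings, inferred from the table values rather than from the
# puush-grab source: both alphabets start with 0-9 and differ in letter order.
LEGACY = string.digits + string.ascii_uppercase + string.ascii_lowercase
PUUSH = string.digits + string.ascii_lowercase + string.ascii_uppercase


def decode(shortcode: str, alphabet: str) -> int:
    """Convert a shortcode to its base-10 value using the given alphabet."""
    value = 0
    for ch in shortcode:
        value = value * len(alphabet) + alphabet.index(ch)
    return value


def encode(number: int, alphabet: str) -> str:
    """Convert a non-negative base-10 value to a shortcode."""
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


if __name__ == "__main__":
    print(decode("3UXX3", LEGACY))   # end of the first Legacy range
    print(decode("4mPOO", PUUSH))    # end of the first Puush range
    print(encode(65978417, PUUSH))   # start of the last range
```

Under these assumed orderings the three calls give 51607749, 64547754 and 4sQ01, matching the table.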

Statistics are occasionally updated on a Puush ID Increment Stats spreadsheet.

Ideas

  • Keep accessing each and every file - likely unsustainable in the long run, especially if expiry times are shortened
  • Grab everything - the site appears to use incremental image IDs

Shortcode Stats

Number of shortcodes:	 526
Number of string lengths:	 3
3 	 5 	   0.951%
4 	 125 	  23.764%
5 	 396 	  75.285%
Number of unique characters:	 62
Number of characters used:	 2495
0 	 24 	   0.962%
1 	 155 	   6.212%
2 	 234 	   9.379%
3 	 121 	   4.850%
4 	 24 	   0.962%
5 	 45 	   1.804%
6 	 26 	   1.042%
7 	 37 	   1.483%
8 	 25 	   1.002%
9 	 34 	   1.363%
A 	 46 	   1.844%
B 	 37 	   1.483%
C 	 46 	   1.844%
D 	 38 	   1.523%
E 	 36 	   1.443%
F 	 42 	   1.683%
G 	 33 	   1.323%
H 	 31 	   1.242%
I 	 37 	   1.483%
J 	 32 	   1.283%
K 	 38 	   1.523%
L 	 35 	   1.403%
M 	 28 	   1.122%
N 	 39 	   1.563%
O 	 31 	   1.242%
P 	 44 	   1.764%
Q 	 28 	   1.122%
R 	 36 	   1.443%
S 	 31 	   1.242%
T 	 26 	   1.042%
U 	 29 	   1.162%
V 	 32 	   1.283%
W 	 45 	   1.804%
X 	 30 	   1.202%
Y 	 29 	   1.162%
Z 	 30 	   1.202%
a 	 34 	   1.363%
b 	 39 	   1.563%
c 	 32 	   1.283%
d 	 46 	   1.844%
e 	 27 	   1.082%
f 	 30 	   1.202%
g 	 39 	   1.563%
h 	 38 	   1.523%
i 	 30 	   1.202%
j 	 34 	   1.363%
k 	 24 	   0.962%
l 	 29 	   1.162%
m 	 40 	   1.603%
n 	 40 	   1.603%
o 	 38 	   1.523%
p 	 25 	   1.002%
q 	 26 	   1.042%
r 	 34 	   1.363%
s 	 23 	   0.922%
t 	 45 	   1.804%
u 	 36 	   1.443%
v 	 27 	   1.082%
w 	 32 	   1.283%
x 	 45 	   1.804%
y 	 26 	   1.042%
z 	 22 	   0.882%
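
Statistics of this shape (length distribution and per-character frequency) can be computed with a short script. This is only a sketch: the 526 sampled shortcodes are not reproduced here, so the input list in the usage example is made up, and the function names are hypothetical.

```python
from collections import Counter


def shortcode_stats(shortcodes):
    """Count shortcodes by length and count how often each character occurs."""
    length_counts = Counter(len(code) for code in shortcodes)
    char_counts = Counter(ch for code in shortcodes for ch in code)
    return {
        "shortcodes": len(shortcodes),
        "lengths": dict(length_counts),
        "unique_chars": len(char_counts),
        "total_chars": sum(char_counts.values()),
        "chars": dict(char_counts),
    }


def percent(count, total):
    """Format a share of the total as a percentage with three decimals."""
    return f"{100 * count / total:.3f}%"


if __name__ == "__main__":
    # Made-up sample input, not the shortcodes behind the figures above.
    stats = shortcode_stats(["abc", "4sQ01", "10"])
    for length, count in sorted(stats["lengths"].items()):
        print(length, count, percent(count, stats["shortcodes"]))
```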

How many items are there?

<chfoo> [...] using the decentralized script i wrote, i've grabbed [randomly] 3824 items (totalling 785M) out of 6409 requests (a 60% hit rate at a max id of "40000" or 59,105,344). so, in theory, there's 35,463,206 items based on this sample and max id.
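
The arithmetic behind this estimate can be checked with a few lines of Python. The variable names are illustrative. The quoted figure of 35,463,206 corresponds to the rounded 60% hit rate; the unrounded rate (3824 / 6409) is slightly lower and gives a slightly smaller estimate.

```python
BASE = 62

# The maximum ID "40000" is 4 * 62^4 in base 62.
max_id = 4 * BASE**4

hits, requests = 3824, 6409
hit_rate = hits / requests

# Estimate using the rounded 60% hit rate, as in the quote (integer math).
estimate_rounded = max_id * 60 // 100

# Estimate using the unrounded sample hit rate.
estimate_sample = round(max_id * hits / requests)

if __name__ == "__main__":
    print(f"max id:            {max_id:,}")
    print(f"hit rate:          {hit_rate:.1%}")
    print(f"estimate (60%):    {estimate_rounded:,}")
    print(f"estimate (sample): {estimate_sample:,}")
```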


