Difference between revisions of "User:Start"

Latest revision as of 06:52, 29 November 2015

I like preserving the web.

I also go by Start+Select and Pressstart.

Website Crawls

cache.lego.com

Easel

Public HTTP/FTP Server List

Searching intitle:"index of /" inurl:"ftp" on Google gives millions of results.

ftp://ftp.3drealms.com/ - 3D Realms
ftp://ftp.adobe.com/ - Adobe
ftp://ftp.amanda.org/ - Amanda Network Backup
http://staticky.com/mirrors/ftp.apple.com/developer/ - Apple's former developer FTP (mirror)
ftp://ftp.atari.com/ - Atari
http://ftp.blizzard.com/pub/ - Blizzard (only works through HTTP)
ftp://ftp.mrunix.net/ - Borg: The Collective
http://media.codeweavers.com/ - CodeWeavers
ftp://ftp.debian.org/ - Debian
ftp://ftp.eggheads.org/ - EggDrop
ftp://ftp.ea.com/ - Electronic Arts
- http://largedownloads.ea.com - Electronic Arts (large downloads)
ftp://ftp.gnu.org/ - GNU
ftp://ftp.gnus.org/ - GNUS
ftp://ftp.software.ibm.com/ - IBM
ftp://ftp.idsoftware.com/ - iD Software
ftp://ftp.isc.org/ - Internet Systems Consortium
ftp://ftp.kochmedia.com/ - Koch Media
ftp://ftp.kernel.org/ - Linux Kernel Archives
ftp://ftp.lyx.org/ - LyX
ftp://ftp.microsoft.com/ - Microsoft (sometimes up, sometimes down)
- ftp://ftp.research.microsoft.com/ - Microsoft Research
  - ftp://ftp.research.microsoft.com/downloads - hidden directory
http://assets.minecraft.net/ - Minecraft (no longer used)
ftp://ftp.mozilla.org/ - Mozilla
- http://releases.mozilla.org/pub/mozilla.org/
- http://download.cdn.mozilla.net/pub/ - Mozilla (older software)
ftp://ftp.ncftp.com/ - NcFTP
ftp://ftp.netscape.com/ - Netscape
ftp://ftp.oldskool.org/ - Oldskool PC Network
ftp://ftp.opera.com/pub/ - Opera
- http://get.geo.opera.com/ - Opera (alt)
ftp://pingus.seul.org - Pingus
ftp://ftp.pgpi.com/ - PGP
ftp://ftp.iso.pld-linux.org/ - PLD Linux
ftp://ftp.povray.org/ - POV-Ray
ftp://ftp.sangoma.com/ - Sangoma
ftp://ftp.scriptics.com/ - Scriptics
ftp://ftp.slackware.com/ - Slackware Linux
http://download.sonymediasoftware.com/ - Sony Creative Software
ftp://ftp.sunet.se/ - Sunet
ftp://ftp.suse.com/ - SUSE Linux
ftp://ftp.ubisoft.com/ - Ubisoft
- ftp://ftp.bluebyte.com/ - Ubisoft Blue Byte
http://releases.ubuntu.com/ - Ubuntu
- http://cdimage.ubuntu.com/ - "Unsupported Ubuntu Images"
ftp://ftp.snt.utwente.nl/ - University of Twente
ftp://ftp.westwood.com/ - Westwood
http://wdl2.winworldpc.com - WinWorld
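
The Google query quoted at the top of this list can be turned into a search URL, and the entries themselves can be grouped by protocol before crawling. The following is a minimal sketch, not a tool used for this list: the function names `google_search_url` and `group_by_scheme` are hypothetical, the `https://www.google.com/search?q=` layout is assumed, and the sample URLs are copied from the list above.

```python
from urllib.parse import urlencode, urlsplit

# The search query quoted in the list's introduction.
QUERY = 'intitle:"index of /" inurl:"ftp"'


def google_search_url(query):
    """Build a search URL for the query (the URL layout is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def group_by_scheme(urls):
    """Group URLs by scheme, e.g. 'ftp' or 'http', keeping input order."""
    groups = {}
    for url in urls:
        groups.setdefault(urlsplit(url).scheme, []).append(url)
    return groups


if __name__ == "__main__":
    # A few entries taken from the list above.
    sample = [
        "ftp://ftp.gnu.org/",
        "http://ftp.blizzard.com/pub/",
        "ftp://ftp.debian.org/",
    ]
    print(google_search_url(QUERY))
    print(group_by_scheme(sample))
```

Grouping by scheme is useful here because some hosts in the list, such as Blizzard's, are only reachable over HTTP even though the hostname starts with `ftp`.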

blah blah blah ignore

Items

TODO: Scrape Google
TODO: Scrape Bing
TODO: Scrape DuckDuckGo
TODO: Scrape Twitter
TODO: Scrape Reddit
TODO: Scrape links from MediaWiki wikis
TODO: Scrape the Open Directory Project
TODO: Scrape the Common Crawl Index
TODO: Scrape the Wayback Machine
TODO: Scrape URLTeam dumps
TODO: Scrape a list of subdomains from DNSdumpster.com (if applicable)

@@ Line 1: / Line 1: @@
 I like preserving the web.
+I also go by Start+Select and Pressstart.
 ==Archives==
@@ Line 5: / Line 7: @@
 *[https://archive.org/details/safeway.ca-panicgrab-20140707 safeway.ca]
 *[https://archive.org/details/emulation-zone-archive Emulation Zone]
-*[https://archive.org/details/www.battleforthenet.com-panicgrab-20140718 Battle for the Net]
+*Battle for the Net ([https://archive.org/details/www.battleforthenet.com-panicgrab-20140718 July 18, 2014], [https://archive.org/details/www.battleforthenet.com-panicgrab-20140912 September 12, 2014])
 *[https://archive.org/details/theopeninter.net-panicgrab-20140718 The Open Internet]
 *[https://archive.org/details/startupsfornetneutrality.org-panicgrab-20140718 Startups for Net Neutrality]
@@ Line 20: / Line 22: @@
 *<nowiki>https://archive.org/details/bmf.*rustedmagick.com-cr-panicgrab-20140808</nowiki> (remove asterisk, spam filter doesn't like this link) - The Original Cutting Room Floor
 *[https://archive.org/details/tppx.herokuapp.com-panicgrab-20140808 TPPX logs]
+*[https://archive.org/details/nintendo-warcs Misc. Nintendo sites]
+*[https://archive.org/details/mojang.com-notch-panicgrab-20140912 mojang.com/notch]
+*[https://archive.org/details/legowracers.4t2portfolio.co.uk-panicgrab-20141007 legowracers.4t2portfolio.co.uk]
 ==Website Crawls==
@@ Line 26: / Line 31: @@
 **[http://paste.archivingyoursh.it/vosoqudavo.avrasm google crawl]
 **[http://paste.archivingyoursh.it/dagacapovu.avrasm combined crawl]
+*[[Easel]]
+**[http://paste.archivingyoursh.it/lojasegeke.avrasm bing crawl]
+**[http://paste.archivingyoursh.it/warisukoka.avrasm google crawl]
+**[http://paste.archivingyoursh.it/xitoxufuki.avrasm combined crawl]
 ==Public HTTP/FTP Server List==
@@ Line 31: / Line 41: @@
 Searching <code>intitle:"index of /" inurl:"ftp"</code> on Google gives millions of results.
+*[ftp://ftp.3drealms.com/ ftp://ftp.3drealms.com/] - 3D Realms
 *[ftp://ftp.adobe.com/ ftp://ftp.adobe.com/] - Adobe
 *[ftp://ftp.amanda.org/ ftp://ftp.amanda.org/] - Amanda Network Backup
 *[http://staticky.com/mirrors/ftp.apple.com/developer/ http://staticky.com/mirrors/ftp.apple.com/developer/] - Apple's former developer FTP (mirror)
+*[ftp://ftp.atari.com/ ftp://ftp.atari.com/] - Atari
 *[http://ftp.blizzard.com/pub/ http://ftp.blizzard.com/pub/] - Blizzard (only works through HTTP)
 *[ftp://ftp.mrunix.net/ ftp://ftp.mrunix.net/] - Borg: The Collective
@@ Line 53: / Line 65: @@
 ***[ftp://ftp.research.microsoft.com/downloads ftp://ftp.research.microsoft.com/downloads] - hidden directory
 *[http://assets.minecraft.net/ http://assets.minecraft.net/] - Minecraft (no longer used)
-*[http://releases.mozilla.org/pub/mozilla.org/ http://releases.mozilla.org/pub/mozilla.org/] - Mozilla
+*[ftp://ftp.mozilla.org/] - Mozilla
+**[http://releases.mozilla.org/pub/mozilla.org/ http://releases.mozilla.org/pub/mozilla.org/]
 **[http://download.cdn.mozilla.net/pub/ http://download.cdn.mozilla.net/pub/] - Mozilla (older software)
 *[ftp://ftp.ncftp.com/ ftp://ftp.ncftp.com/] - NcFTP
@@ Line 77: / Line 90: @@
 *[ftp://ftp.westwood.com/ ftp://ftp.westwood.com/] - Westwood
 *[http://wdl2.winworldpc.com http://wdl2.winworldpc.com] - WinWorld
+== blah blah blah ignore ==
+=== Items ===
+* TODO: Scrape Google
+* TODO: Scrape Bing
+* TODO: Scrape DuckDuckGo
+* TODO: Scrape Twitter
+* TODO: Scrape Reddit
+* TODO: Scrape links from MediaWiki wikis
+* TODO: Scrape the Open Directory Project
+* TODO: Scrape the Common Crawl Index
+* TODO: Scrape the Wayback Machine
+* TODO: Scrape URLTeam dumps
+* TODO: Scrape a list of subdomains from DNSdumpster.com (if applicable)
