Difference between revisions of "GeoCities URL Lists"

From Archiveteam
(17 intermediate revisions by 9 users not shown)
* swebb's current URL list: http://badcheese.com/~steve/only_geocities.txt.bz2 (updated twice daily; generated by a crawl that finds all linked pages)
* swebb's current URL list: http://badcheese.com/~steve/ALL-GEO-SEEDS-20090730.txt.bz2 (259M), the same URL list that archive.org is using
* sods list : [http://blog.odonnell.nu/static/sites.tar.bz2] - over 500,000 unique GeoCities sites (not pages). I don't have the ability to download them; hopefully some of the downloaders can make use of this.
* sods list : [http://blog.odonnell.nu/static/sites.tar.bz2] - over 700,000 unique GeoCities sites (not pages). I don't have the ability to download them; hopefully some of the downloaders can make use of this.
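
A rough sketch, in Python, of how a page-level URL list could be boiled down to unique sites (not pages). The function names, the one-URL-per-line layout of the <code>.txt.bz2</code> files, and the rule for spotting neighborhood-style addresses are all assumptions made for illustration; they are not taken from the lists themselves.

```python
import bz2
from urllib.parse import urlsplit


def site_key(url):
    """Return a rough site identifier for a GeoCities page URL, or None."""
    url = url.strip()
    if not url:
        return None
    if "://" not in url:
        # Assumption: bare entries such as "www.geocities.com/foo" mean http.
        url = "http://" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host != "geocities.com" and not host.endswith(".geocities.com"):
        return None
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        return None
    # Assumption: neighborhood-style addresses look like
    # /Neighborhood/1234/ or /Neighborhood/Suburb/1234/, so a numeric
    # segment among the first three closes the site address.
    for i, seg in enumerate(segments[:3]):
        if seg.isdigit():
            return "/".join(segments[: i + 1])
    # Otherwise treat the first path segment as the member name.
    return segments[0]


def unique_sites(lines):
    """Collect the distinct site identifiers found in an iterable of URLs."""
    return {key for key in map(site_key, lines) if key is not None}


def read_url_list(path):
    """Yield lines from a bz2-compressed text file, one URL per line (assumed)."""
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from handle
```

Something like <code>len(unique_sites(read_url_list("only_geocities.txt.bz2")))</code> would then give a site count for a downloaded list, provided the file really is plain text with one URL per line.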


== URLs drawn from specific sources ==
(Probably no real use for these anymore, but they'll be good historical references for anyone interested.)
It is especially important to back up URLs linked from news sites and other projects that cared about the quality of the sites they link to. The following URL lists were all extracted from dumps or crawls of these sites:


* [http://soultcer.net/geocities-urls/digg.bz2 Links to GeoCities from digg.com] (thanks berticus)
* [[Geocities biglist]]
* [http://soultcer.net/geocities-urls/dmoz.bz2 Links to GeoCities from the Open Directory Project (dmoz.org)]
 
* [http://soultcer.net/geocities-urls/slashdot.bz2 Links to GeoCities from slashdot.org] (thanks berticus)
{{Navigation box}}
* [http://soultcer.net/geocities-urls/wikipedia_de.bz2 Links to GeoCities from the German-language Wikipedia]
[[Category:GeoCities]]

Latest revision as of 23:47, 25 June 2015
