Nifty

Japanese ISP with web hosting
URL: homepage.nifty.com
Project status: Offline
Archiving status: Saved!
Project source: https://github.com/ArchiveTeam/nifty-discovery
Project tracker: Unknown
IRC channel: #niftyjanai

Nifty is a Japanese ISP that provides web hosting. It will close about 140,000 unclaimed homepages by 2016-11-10 15:00. See the termination notice (in Japanese; mirrored at [IA], [WebCite], and [archive.is]).

Homepage URLs take the following forms:

http://homepage1.nifty.com/USERNAME/
http://homepage2.nifty.com/USERNAME/
http://homepage3.nifty.com/USERNAME/
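
The URL forms listed above can be used to reduce any harvested page URL to the root of the site it belongs to. The following is a minimal sketch, not the project's actual tooling: the function names `candidate_urls` and `site_root` and the username `example` are hypothetical, and the sketch assumes the first path segment is always the username.

```python
import re

# The three hosts listed on this page.
HOSTS = ("homepage1.nifty.com", "homepage2.nifty.com", "homepage3.nifty.com")

# Assumption: the first path segment after the host is the username.
SITE_RE = re.compile(r"^https?://(homepage[123]\.nifty\.com)/([^/?#]+)", re.IGNORECASE)


def candidate_urls(username):
    """Return the possible site roots for a username, one per host."""
    return [f"http://{host}/{username}/" for host in HOSTS]


def site_root(url):
    """Reduce a page URL to its site root, or return None if it does not match."""
    match = SITE_RE.match(url)
    if match is None:
        return None
    host, username = match.groups()
    return f"http://{host.lower()}/{username}/"


if __name__ == "__main__":
    print(candidate_urls("example"))
    print(site_root("http://homepage2.nifty.com/example/diary/index.html"))
```

A URL that points directly at a file in the host root would be misread as a username by this sketch, so results should be spot-checked.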

URL harvesting

Follow the techniques described on the Site exploration page.

<polm> One thing I would recommend is searching Hatena Bookmarks, which is like a Japanese free Pinboard
<polm> Like so: http://b.hatena.ne.jp/entrylist?url=homepage2.nifty.com
<polm> the "of" query parameter paginates like so: http://b.hatena.ne.jp/entrylist?url=homepage2.nifty.com&of=20
<zout> there's some here. https://archive.is/homepage2.nifty.com
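
The Hatena Bookmarks pagination that polm describes can be sketched as a list of entry-list URLs to fetch. This sketch only builds the URLs and does not fetch or parse anything. The function name `hatena_entrylist_urls` is hypothetical, and the step of 20 is an assumption inferred from the single `of=20` example above; the real page size may differ.

```python
from urllib.parse import urlencode

BASE = "http://b.hatena.ne.jp/entrylist"


def hatena_entrylist_urls(host, pages, page_size=20):
    """Build entry-list URLs for `host`, stepping the `of` parameter per page.

    Assumption: `of` is an item offset that advances by `page_size`.
    """
    urls = []
    for page in range(pages):
        params = {"url": host}
        if page > 0:
            # The first page carries no `of` parameter, as in the example above.
            params["of"] = page * page_size
        urls.append(f"{BASE}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    for url in hatena_entrylist_urls("homepage2.nifty.com", pages=3):
        print(url)
```

Running the same function for `homepage1.nifty.com` and `homepage3.nifty.com` would cover the other two hosts.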

Progress

Next steps

  • GoogleScraper is no good. Try scraping Bing and Twitter using the hints on Site exploration.
  • Put chunks of up to 100k URLs onto high-speed (20160911.01) ArchiveBot pipelines.
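
The chunking step in the list above could look something like the following sketch. It only splits a harvested URL list into chunks of at most 100,000 entries; how the chunks are submitted to ArchiveBot is not covered. The function name `chunk_urls` is hypothetical, and removing duplicates before chunking is an added assumption, not something this page specifies.

```python
def chunk_urls(urls, max_per_chunk=100_000):
    """Split URLs into lists of at most `max_per_chunk`, keeping first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        url = url.strip()
        # Skip blank lines and duplicates (deduplication is an assumption here).
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    return [
        unique[start:start + max_per_chunk]
        for start in range(0, len(unique), max_per_chunk)
    ]


if __name__ == "__main__":
    # Hypothetical input: five made-up site roots, split into chunks of two.
    sample = [f"http://homepage1.nifty.com/user{i}/" for i in range(5)]
    for number, chunk in enumerate(chunk_urls(sample, max_per_chunk=2), start=1):
        print(number, chunk)
```

Each resulting chunk can then be written to its own file, one URL per line.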