Nifty

From Archiveteam
Revision as of 14:21, 16 September 2016 by Sanqui (talk | contribs) (archive.is)
Japanese ISP with web hosting
URL: homepage.nifty.com
Status: Closing
Archiving status: Not saved yet
Archiving type: Unknown
IRC channel: #archiveteam-bs (on hackint)

Nifty is a Japanese ISP that provides web hosting. It will close about 140,000 unclaimed homepages by 2016-09-29. Termination notice (Japanese)

http://homepage1.nifty.com/USERNAME/
http://homepage2.nifty.com/USERNAME/
http://homepage3.nifty.com/USERNAME/
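
The three host patterns above can be expanded mechanically once usernames are known. The following is a minimal sketch, assuming the hosts are exactly homepage1 through homepage3 as listed; `candidate_urls` is a hypothetical helper name, not part of any existing tool.

```python
# Hosts taken from the URL patterns listed above. Other numbered hosts may
# exist, but none are assumed here.
HOSTS = ("homepage1.nifty.com", "homepage2.nifty.com", "homepage3.nifty.com")


def candidate_urls(username):
    """Return the candidate homepage URLs for one username (hypothetical helper)."""
    return ["http://{}/{}/".format(host, username) for host in HOSTS]


if __name__ == "__main__":
    # "USERNAME" is the placeholder used in the patterns above.
    for url in candidate_urls("USERNAME"):
        print(url)
```

A username normally lives on only one of the hosts, so most candidates are expected to fail; the list is meant as input for a checker, not as a list of confirmed pages.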

URL harvesting

Follow the methods described on the Site exploration page.

<polm> One thing I would recommend is searching Hatena Bookmarks, which is like a Japanese free Pinboard
<polm> Like so: http://b.hatena.ne.jp/entrylist?url=homepage2.nifty.com
<polm> the "of" query parameter paginates like so: http://b.hatena.ne.jp/entrylist?url=homepage2.nifty.com&of=20
<zout> there's some here. https://archive.is/homepage2.nifty.com
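
The Hatena Bookmarks pagination polm describes could be scripted roughly as follows. This is a sketch under assumptions: the page size of 20 is inferred from the `of=20` example, the layout of the result pages is not assumed (URLs are pulled out with a plain regular expression), and `entrylist_url` and `extract_nifty_urls` are hypothetical helper names. Fetching is left out on purpose.

```python
import re
from urllib.parse import urlencode

ENTRYLIST = "http://b.hatena.ne.jp/entrylist"
PAGE_SIZE = 20  # assumption: inferred from the of=20 example above


def entrylist_url(host, page):
    """Build the entry list URL for a zero-based page number."""
    params = {"url": host}
    if page > 0:
        params["of"] = page * PAGE_SIZE
    return ENTRYLIST + "?" + urlencode(params)


# Matches homepageN.nifty.com URLs and keeps only the first path segment,
# which is the USERNAME directory in the patterns listed earlier.
NIFTY_RE = re.compile(r"https?://homepage\d*\.nifty\.com/[^/\s\"'<>]+/?")


def extract_nifty_urls(html):
    """Return the unique Nifty homepage URLs found in a page, in first-seen order."""
    seen = []
    for match in NIFTY_RE.findall(html):
        if match not in seen:
            seen.append(match)
    return seen


if __name__ == "__main__":
    for page in range(3):
        print(entrylist_url("homepage2.nifty.com", page))
```

The same extraction function could be pointed at the archive.is listing zout mentions, since it only looks for Nifty URLs in whatever text it is given.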

Progress

Next steps

  • GoogleScraper is no good. Try scraping Bing and Twitter using the hints on Site exploration
  • Scrape http://e-shuushuu.net/ (DoomTay)
  • Put chunks of up to 100k URLs onto high speed (20160911.01) ArchiveBot pipelines
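
Splitting a harvested URL list into chunks of up to 100k, as in the last step above, might look like this. It is a minimal sketch: `dedupe` and `chunk_urls` are hypothetical helper names, and how each chunk is then handed to an ArchiveBot pipeline is not covered here.

```python
def dedupe(urls):
    """Drop repeated URLs while keeping first-seen order."""
    return list(dict.fromkeys(urls))


def chunk_urls(urls, max_size=100_000):
    """Yield consecutive lists of at most max_size URLs."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    for start in range(0, len(urls), max_size):
        yield urls[start:start + max_size]


if __name__ == "__main__":
    # Tiny stand-in list; a real run would read the harvested URLs from a file.
    harvested = [
        "http://homepage1.nifty.com/a/",
        "http://homepage2.nifty.com/b/",
        "http://homepage1.nifty.com/a/",
    ]
    for number, chunk in enumerate(chunk_urls(dedupe(harvested), max_size=2)):
        print(number, len(chunk))
```

Deduplicating before chunking keeps each chunk at full size and avoids queueing the same homepage twice.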