Difference between revisions of "Saunalahti Iso G"
m (Add link to wayback cdx scrape results)
Line 33:
  * [http://paste.nerds.io/raw/mevevoripi Open Directory Project scrape]
  * [http://paste.archivingyoursh.it/raw/wiqalafima Common Crawl scrape]
- *
+ * Scrape the Wayback Machine [https://github.com/chpwssn/saunalahti-iso-g/tree/master/discovery/wayback Wayback cdx scrape results]
  * [http://paste.archivingyoursh.it/raw/soxixugaja URLTeam scrape]
  * [http://paste.archivingyoursh.it/raw/ridubeqeto Start's list of sequential sites]
Revision as of 16:27, 9 April 2015
Saunalahti Iso G | |
URL | pp.fi, saunalahti.fi |
Status | Closing |
Archiving status | Upcoming... |
Archiving type | Unknown |
IRC channel | #isohno (on hackint) |
Web hosting run by a Finnish ISP; it is shutting down on an unspecified date.
Discovery
Sites follow several patterns:
- http://www.saunalahti.fi/*****/ (sequential number, no zero-padding)
- http://www.saunalahti.fi/~*****/ (sequential number, no zero-padding)
- http://www.saunalahti.fi/voas****/ (sequential number, zero-padded to 4 digits)
- http://www.saunalahti.fi/~voas****/ (sequential number, zero-padded to 4 digits)
- http://www.saunalahti.fi/nl*****/ (sequential number, zero-padded to 5 digits)
- http://www.saunalahti.fi/~nl*****/ (sequential number, zero-padded to 5 digits)
- http://www.saunalahti.fi/USERNAME/
- http://www.saunalahti.fi/~USERNAME/
- http://USERNAME.pp.fi
- http://www.USERNAME.pp.fi
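The sequential patterns above can be enumerated mechanically. The following is a minimal sketch, not the project's actual discovery code: the function name candidate_urls, the SEQUENTIAL_PATTERNS table, and the choice of number range are all assumptions made for illustration. The real range of assigned numbers is not stated on this page.

```python
from typing import Iterator

BASE = "http://www.saunalahti.fi"

# (prefix, zero-padded width) for each sequential pattern listed above.
# A width of 0 means the number is written without padding.
SEQUENTIAL_PATTERNS = [("", 0), ("voas", 4), ("nl", 5)]


def candidate_urls(start: int, stop: int) -> Iterator[str]:
    """Yield candidate site URLs for sequence numbers in [start, stop).

    Hypothetical helper: the range to scan is an assumption, since the
    page does not say which numbers were actually assigned.
    """
    for n in range(start, stop):
        for prefix, width in SEQUENTIAL_PATTERNS:
            if width and n >= 10 ** width:
                continue  # number does not fit in the padded field
            name = prefix + str(n).zfill(width)
            # Each pattern exists both with and without a leading tilde.
            for tilde in ("", "~"):
                yield f"{BASE}/{tilde}{name}/"


if __name__ == "__main__":
    for url in candidate_urls(7, 8):
        print(url)
```

The USERNAME patterns cannot be generated this way, which is presumably why the scrapes listed under Items are needed.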
Items
- Google scrape (pp.fi, saunalahti.fi)
- Bing scrape
- DuckDuckGo scrape
- TODO: Scrape Twitter
- Reddit scrape
- MediaWiki scrape
- Open Directory Project scrape
- Common Crawl scrape
- Wayback Machine scrape: Wayback CDX scrape results
- URLTeam scrape
- Start's list of sequential sites
Combined list of results from Chip's scrapes here.
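Merging scrape results like these usually means reducing every discovered URL to the root of the site it belongs to and removing duplicates. The sketch below shows one way to do that under the URL patterns from the Discovery section; it is not how the combined list was actually produced. The names site_root and unique_roots are hypothetical, and treating the first path segment on saunalahti.fi as the site name is an assumption.

```python
from typing import Iterable, List, Optional
from urllib.parse import urlsplit


def site_root(url: str) -> Optional[str]:
    """Map a scraped URL to its site root, or None if it matches no pattern."""
    # Accept bare hostnames such as "example.pp.fi" as well as full URLs.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()

    if host.endswith(".pp.fi"):
        # USERNAME.pp.fi and www.USERNAME.pp.fi are treated as the same site.
        if host.startswith("www."):
            host = host[4:]
        if host == "pp.fi":
            return None  # www.pp.fi itself is not a user site
        return f"http://{host}/"

    if host in ("saunalahti.fi", "www.saunalahti.fi"):
        # Assumption: the first path segment is the site name.
        # "~name" and "name" are kept distinct, as both forms are listed.
        first = parts.path.lstrip("/").split("/", 1)[0]
        if first:
            return f"http://www.saunalahti.fi/{first}/"

    return None


def unique_roots(urls: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated site roots found in urls."""
    roots = {site_root(u) for u in urls}
    roots.discard(None)
    return sorted(roots)
```

A file sitting directly under http://www.saunalahti.fi/ would be mistaken for a site name by this sketch, so the output would still need a manual check.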