Difference between revisions of "Web Roasting"
(placeholder page, will add more later) |
|||
(One intermediate revision by one other user not shown) | |||
Line 1: | Line 1: | ||
{{Infobox project | {{Infobox project | ||
| title = Web Roasting - Save all the web hosting sites! | | title = Web Roasting - Save all the web hosting sites! | ||
| project_status = {{online}} | | project_status = {{online}} | ||
| archiving_status = scraping {{inprogress}}, downloading {{upcoming}} | | archiving_status = scraping {{inprogress}}, downloading {{upcoming}} | ||
Line 12: | Line 11: | ||
There are two ways you can help right now: | There are two ways you can help right now: | ||
* Add more web hosting sites to the [[ISP Hosting]] or [[University Web Hosting]] pages. | * Add more web hosting sites to the [[ISP Hosting]] or [[University Web Hosting]] pages. | ||
* Scrape the following | * Scrape the following for hosted web sites: | ||
**[[Site exploration#Google|Google]] (site:webhost.com) | **[[Site exploration#Google|Google]] (site:webhost.com) | ||
**[[Site exploration#Bing API|Bing]] (site:webhost.com) | **[[Site exploration#Bing API|Bing]] (site:webhost.com) | ||
Line 27: | Line 26: | ||
**DNSdumpster.com (only for hosts that use subdomains) | **DNSdumpster.com (only for hosts that use subdomains) | ||
**pentest-tools.com (only for hosts that use subdomains) | **pentest-tools.com (only for hosts that use subdomains) | ||
**Sitemaps or other types of indexes, if the web host provides any. | |||
== See also == | == See also == |
Latest revision as of 23:06, 29 March 2018
Web Roasting - Save all the web hosting sites! | |
Status | Online! |
Archiving status | scraping in progress, downloading upcoming |
Archiving type | Unknown |
IRC channel | #webroasting (on hackint) |
Web Roasting is a project to save old web hosting sites before they shut down. The project is currently scraping for hosted websites; the grab itself is coming soon.
How can I help?
There are two ways you can help right now:
- Add more web hosting sites to the ISP Hosting or University Web Hosting pages.
- Scrape the following for hosted web sites:
- Google (site:webhost.com)
- Bing (site:webhost.com)
- DuckDuckGo (site:webhost.com)
- Yandex (site:webhost.com)
- Baidu (site:webhost.com)
- Twitter (litterapi preferred)
- Reddit (http://www.reddit.com/domain/webhost.com/)
- Links from MediaWiki wikis
- The Open Directory Project
- The Common Crawl Index
- The Wayback Machine
- URLTeam crawls
- DNSdumpster.com (only for hosts that use subdomains)
- pentest-tools.com (only for hosts that use subdomains)
- Sitemaps or other types of indexes, if the web host provides any.
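The scraping sources above mostly come down to building a query for one web host and then reducing the discovered URLs to the individual hosted sites. Here is a minimal sketch of that, using only the Python standard library. The function names are hypothetical, `webhost.com` is the placeholder host from the list above, and the rule that path-based sites start with `~` is an assumption that will not hold for every host.

```python
from typing import List, Optional
from urllib.parse import urlsplit
import xml.etree.ElementTree as ET


def site_query(host: str) -> str:
    """Build the site: query used with the search engines listed above."""
    return f"site:{host}"


def reddit_domain_url(host: str) -> str:
    """Build the Reddit domain listing URL in the form shown in the list."""
    return f"http://www.reddit.com/domain/{host}/"


def sitemap_urls(xml_text: str) -> List[str]:
    """Return the text of every <loc> element, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]


def hosted_site_root(url: str, host: str) -> Optional[str]:
    """Reduce a discovered URL to the root of the hosted site, if any.

    Assumption: hosted sites live either on a subdomain of the host
    (user.webhost.com) or under a ~user path on the host itself.
    """
    parts = urlsplit(url)
    name = (parts.hostname or "").lower()
    host = host.lower()
    if name.endswith("." + host):
        sub = name[: -len(host) - 1]
        if sub != "www":
            return f"{parts.scheme}://{name}/"
    if name in (host, "www." + host):
        segments = [s for s in parts.path.split("/") if s]
        if segments and segments[0].startswith("~"):
            return f"{parts.scheme}://{name}/{segments[0]}/"
    return None
```

Deduplicating the non-`None` results of `hosted_site_root` over all scraped URLs gives a candidate list of hosted sites for a single web host. For hosts that use some other layout, the path rule would need to be adjusted.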
See also