Difference between revisions of "IRC Quotes"
(Scraped I-Rox.com, improved the page layout)
Revision as of 13:36, 7 April 2011
What's this, then?
Auguste, BlueMax and Dr-Spangle are currently scraping IRC quote databases (e.g. Bash.org). If you can help out or suggest other quote databases to scrape, please join them in #bashup.
Project Hosting
Auguste is currently hosting the scrapes at http://www.deaddyingdamned.com/archives/ and everybody is encouraged to help mirror them.
Helping Out
Scraping doesn't take a lot of work; the QDBs are all more or less the same. You only need to write one script, then make a few changes to adapt it to any other QDB you want to scrape. The actual scraping process should easily take under 10 minutes.
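The "one script, a few changes per site" idea could be sketched roughly as follows. This is only an illustration: the `QdbConfig` class, the URL template and the regular expression are hypothetical placeholders, not the real markup of any QDB listed on this page, and would have to be adjusted after looking at each site's pages.

```python
import html
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class QdbConfig:
    """Per-site settings: the 'few changes' needed for each QDB."""
    name: str           # site name, e.g. for naming the archive later
    url_template: str   # hypothetical: how a quote ID maps to a page URL
    quote_pattern: str  # hypothetical: regex whose group 1 is the quote body


def quote_url(config: QdbConfig, quote_id: int) -> str:
    """Build the page URL for one quote ID."""
    return config.url_template.format(id=quote_id)


def extract_quote(config: QdbConfig, page_html: str) -> Optional[str]:
    """Return the quote text from one page, or None if no quote is found."""
    match = re.search(config.quote_pattern, page_html, re.DOTALL)
    if match is None:
        return None
    body = re.sub(r"<br\s*/?>", "\n", match.group(1))  # <br> tags become newlines
    body = re.sub(r"<[^>]+>", "", body)                 # drop any remaining tags
    return html.unescape(body).strip()                  # decode &lt; &gt; &amp; ...


# Placeholder configuration; every value here is an assumption.
EXAMPLE = QdbConfig(
    name="EXAMPLE.TLD",
    url_template="http://example.tld/?{id}",
    quote_pattern=r'<p class="quote">(.*?)</p>',
)
```

Adapting the script to another QDB would then mean writing a new `QdbConfig`; the page download itself (not shown) can use whatever HTTP client you prefer.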
If you do want to help with the scraping, please follow the existing scrape format:
- Each quote has its own file
- Each file is named 'n.txt', where 'n' is the quote's ID number
- All quotes should be compressed into an archive
- The archive name should identify the original location and date of scraping (e.g. 'QuoteIRC.com Quote Collection 2011-04-04.7z', or 'DOMAIN.TLD Quote Collection YYYY-MM-DD.EXT')
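The format above can be sketched in Python as below. This is a minimal example under stated assumptions: the function names are made up for illustration, and it writes a `.zip` archive only because that is available in the Python standard library; the example name above uses `.7z`, which would need an external tool.

```python
import datetime
import zipfile
from pathlib import Path
from typing import Dict, Optional


def archive_name(domain: str, scraped_on: datetime.date, ext: str) -> str:
    """Build 'DOMAIN.TLD Quote Collection YYYY-MM-DD.EXT'."""
    return f"{domain} Quote Collection {scraped_on.isoformat()}.{ext}"


def save_scrape(quotes: Dict[int, str], domain: str, out_dir: Path,
                scraped_on: Optional[datetime.date] = None) -> Path:
    """Store each quote as 'n.txt' (n = quote ID) inside one archive."""
    scraped_on = scraped_on or datetime.date.today()
    out_dir.mkdir(parents=True, exist_ok=True)
    archive_path = out_dir / archive_name(domain, scraped_on, "zip")
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for quote_id, text in sorted(quotes.items()):
            # One file per quote, named after its ID; text is stored as UTF-8.
            archive.writestr(f"{quote_id}.txt", text)
    return archive_path
```

For example, `save_scrape({1: "first quote", 2: "second quote"}, "EXAMPLE.TLD", Path("scrapes"))` would create one archive in the `scrapes` directory containing `1.txt` and `2.txt`.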
Project Status
Database | Has been scraped | Scraper | Notes |
---|---|---|---|
[Bash.org](http://www.bash.org) | Yes | Dr-Spangle | The quote database that pretty much created all others. |
[DeadDyingDamned.com/QDB/](http://www.deaddyingdamned.com/qdb/) | No | | The unofficial ArchiveTeam QDB. I'll have the server automatically save these somewhere. --Auguste 13:36, 7 April 2011 (UTC) |
[I-Rox.com](http://www.i-rox.com) | Yes | Auguste | |
[Mandaliet.com/furcqdb/](http://www.mandaliet.com/furcqdb/) | Yes | Auguste | The Furcadia quote database |
[QDB.MIT.edu](http://qdb.mit.edu) | Yes | Auguste | The MIT quote database |
[QDB.us](http://www.qdb.us) | Yes | Auguste | |
[QuoteIRC.com](http://www.quoteirc.com) | Yes | Auguste | |
[Quotes.BurntElectrons.org](http://quotes.burntelectrons.org) | Yes | Auguste | The IRC.Mozilla.org quote database |
[WarpDrive.se](http://www.warpdrive.se) | Yes | Auguste | Quotes are in Swedish |
[WQDB.org](http://www.wqdb.org) | Yes | Auguste | The Worms quote database |
[xkcdb.com](http://www.xkcdb.com) | Yes | Auguste | The xkcd quote database |