Difference between revisions of "Patch.com"

From Archiveteam
Jump to navigation Jump to search
(5 intermediate revisions by 4 users not shown)
Line 4: Line 4:
| description = Your neighborhood. Your news.
| description = Your neighborhood. Your news.
| URL = <nowiki>http://www.patch.com/</nowiki>
| URL = <nowiki>http://www.patch.com/</nowiki>
| project_status = {{closing}}
| project_status = {{specialcase}}
| source = https://github.com/ArchiveTeam/patch-grab
| source = https://github.com/ArchiveTeam/patch-grab
| archiving_status = {{inprogress}}
| archiving_status = {{saved}} - [https://archive.org/details/archiveteam_patch archives]
| irc = cabbagepatch
| irc = cabbagepatch
| tracker = [http://quilt.at.ninjawedding.org/patchy here]
| tracker = [http://quilt.at.ninjawedding.org/patchy here]
Line 13: Line 13:
'''Patch.com''' is a "hyperlocal" news community which is [http://www.webcitation.org/6IrUArBiV being downsized] from its current ~900 sites to ~500.
'''Patch.com''' is a "hyperlocal" news community which is [http://www.webcitation.org/6IrUArBiV being downsized] from its current ~900 sites to ~500.


=== Current status ===
== Current status ==


A combo archive/spider script now exists, and is being tested.  If it all goes well, we'll put this out on the WarriorPop in the IRC channel if you'd like to be notified when we're ready to go.
In progress.  Warrior integration coming soon.
 
== Patch.com will rate-limit you across all sites ==
 
Patch.com institutes a rate-limit (some unknown hundreds of requests/hour) across all sites.  If you exceed this, all of your requests will be met with HTTP 420s.
 
If the patch-grab script detects these, it hard-abortsA kinder solution would be to sleep for some period of time (an hour?) and try again; suggestions appreciated.
 
{{Navigation box}}

Revision as of 02:13, 30 December 2014

Patch.com
Your neighborhood. Your news.
Your neighborhood. Your news.
URL http://www.patch.com/
Status Special case
Archiving status Saved! - archives
Archiving type Unknown
Project source https://github.com/ArchiveTeam/patch-grab
Project tracker here
IRC channel #cabbagepatch (on hackint)

Patch.com is a "hyperlocal" news community which is being downsized from its current ~900 sites to ~500.

Current status

In progress. Warrior integration coming soon.

Patch.com will rate-limit you across all sites

Patch.com institutes a rate-limit (some unknown hundreds of requests/hour) across all sites. If you exceed this, all of your requests will be met with HTTP 420s.

If the patch-grab script detects these, it hard-aborts. A kinder solution would be to sleep for some period of time (an hour?) and try again; suggestions appreciated.