Difference between revisions of "Chromebot"
(Providing some answers on most common questions)
(Information enrichened.)
By default the bot only grabs a single URL. However, it supports recursion, which is rather slow, since every single page needs to be loaded and rendered by a browser. A [https://6xq.net/chromebot/ dashboard] is available for watching the progress of such jobs.
== Usage<ref name=usage>[https://github.com/PromyLOPh/crocoite/blob/184189f0a535996edca01a68182ed07d32e26e9c/README.rst#IRC-bot ChromeBot usage documentation on GitHub]</ref> ==
You can call ''chromebot'' on the {{IRC|archivebot}} IRC channel, which chromebot shares with its parent [[ArchiveBot]]. The bot responds to both “<code>chromebot</code>” and “<code>chromebot:</code>”; the trailing colon is optional. The username can be autocompleted using the “Tab” key in the EFNet web chat interface or in an IRC client.
{| class="wikitable"
|-
! Command !! Description
|-
| <code>chromebot: a <url></code><br /><code>chromebot a <url></code> || Archive <url> with <concurrency> processes according to recursion <policy> (see the usage documentation for how to set these).
|-
| <code>chromebot: s <uuid></code><br /><code>chromebot s <uuid></code> || Get job status for <uuid>.
|-
| <code>chromebot: r <uuid></code><br /><code>chromebot r <uuid></code> || Revoke or abort running job with <uuid>.
|}
Please note that the commands are case-sensitive.
== Restrictions ==
ChromeBot has been blacklisted by [[Instagram]], a website infamous for being an archival loophole.
== References ==
<references />
Revision as of 21:46, 26 April 2019
chromebot is an IRC bot parallel to ArchiveBot that uses Google Chrome and is thus able to archive JavaScript-heavy websites. Both the software and the bot are maintained by User:PurpleSymphony. WARCs are uploaded daily to the chromebot collection on archive.org.
By default the bot only grabs a single URL. However, it supports recursion, which is rather slow, since every single page needs to be loaded and rendered by a browser. A dashboard is available for watching the progress of such jobs.
Usage[1]
You can call chromebot on the #archivebot (on hackint) IRC channel, which chromebot shares with its parent ArchiveBot. The bot responds to both “chromebot” and “chromebot:”; the trailing colon is optional. The username can be autocompleted using the “Tab” key in the EFNet web chat interface or in an IRC client.
Command | Description |
---|---|
chromebot: a <url> or chromebot a <url> | Archive <url> with <concurrency> processes according to recursion <policy> (see the usage documentation for how to set these). |
chromebot: s <uuid> or chromebot s <uuid> | Get job status for <uuid>. |
chromebot: r <uuid> or chromebot r <uuid> | Revoke or abort running job with <uuid>. |
Please note that the commands are case-sensitive.
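The addressing rules above (optional colon, case-sensitive single-letter commands) can be illustrated with a small Python sketch. This is not chromebot's actual code: the function name parse_message and the constant KNOWN_COMMANDS are hypothetical, and the sketch only assumes the three commands listed in the table.

```python
import re

# Hypothetical illustration only; this is not how chromebot itself is implemented.
# The three commands listed in the table above.
KNOWN_COMMANDS = {"a", "s", "r"}

# "chromebot" or "chromebot:", then a command word, then optional arguments.
_ADDRESSED = re.compile(r"^chromebot:?\s+(\S+)(?:\s+(.*))?$")


def parse_message(message):
    """Return (command, arguments) for a message addressed to chromebot, else None."""
    match = _ADDRESSED.match(message.strip())
    if match is None:
        return None
    command = match.group(1)
    arguments = (match.group(2) or "").strip()
    # Commands are case-sensitive, so "S" is not accepted in place of "s".
    if command not in KNOWN_COMMANDS:
        return None
    return command, arguments
```

With this sketch, "chromebot: s <uuid>" and "chromebot s <uuid>" parse identically, while "chromebot: S <uuid>" is rejected because of the case-sensitivity rule.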
Restrictions
ChromeBot has been blacklisted by Instagram, a website infamous for being an archival loophole.