Difference between revisions of "Talk:ArchiveBot"

Revision as of 20:40, 28 April 2019

ArchiveBot has star quality ☺ : ArchiveBot/ArchiveBot/commit/566aa53 --Chfoo 23:06, 23 July 2014 (EDT)

Store the full original job ID inside the saved .json files for each job. (At the moment, the file name contains only the first five characters of the original job ID.)
If a pipeline is running low on, or has run out of, disk space, send an alert to the #ArchiveBot IRC channel (on hackint).
- …and move the job to a different pipeline, if technically possible.
Maybe operate the ArchiveBot Twitter account again.
- Ability to intercept URLs tweeted to it.
Make PhantomJS work again (non-urgent because chromebot already exists for this purpose).
!a scans sitemaps (i.e. the sitemap.xml file) and Robots.txt for more URLs.
!ao < http://example.com/URL-List.txt saves not only the URLs within the list but also the URL list itself.
!a < http://example.com/Website-List.txt for big archivals (if not already possible).

@@ Line 1: / Line 1: @@
 ArchiveBot has star quality ☺ : [https://web.archive.org/web/20140211000949/https://github.com/ArchiveBot/ArchiveBot/commit/566aa53 ArchiveBot/ArchiveBot/commit/566aa53] --[[User:Chfoo|Chfoo]] 23:06, 23 July 2014 (EDT)
+== Feature Suggestions: ==
+* Store the full original job ID inside the saved [https://ia801506.us.archive.org/10/items/archiveteam_archivebot_go_20190409170001/circumcisionmovie.com-inf-20190409-071623-dysua.json .json file]s for each job. (At the moment, the file name contains only the first five characters of the original job ID.)
+* If a pipeline is running low on, or has run out of, disk space, send an alert to the {{IRC|ArchiveBot}} IRC channel.
+** …and move the job to a different pipeline, if technically possible.
+* Maybe operate the [https://twitter.com/ArchiveBot ArchiveBot Twitter account] again.
+** Ability to intercept URLs tweeted to it.
+* Make [https://archivebot.readthedocs.io/en/latest/search.html?q=phantomjs&check_keywords=yes&area=default PhantomJS] work again (non-urgent because <code>[[chromebot]]</code> already exists for this purpose).
+* <code>!a</code> scans [[Wikipedia:sitemap|sitemaps]] (i.e. the ''sitemap.xml'' file) and [[Robots.txt]] for more URLs.
+* <code>!ao < http://example.com/URL-List.txt</code> saves not only the URLs within the list but also the URL list '''itself'''.
+* <code>!a < http://example.com/Website-List.txt</code> for big archivals (if not already possible).