Talk:Internet Archive Census

From Archiveteam

Tools

The jq command line for parsing the census JSON was not obvious to me, so here are two examples to get you started. To get the id and total_size for each item on the same row, separated by a space:

jq -r '[.id, " ", .total_size | tostring] | add'
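For anyone who would rather not use jq, here is a rough Python sketch of the same idea. It assumes the census data is one JSON object per line with "id" and "total_size" fields, as the jq filter above implies; the helper names to_text and item_rows and the sample record are made up for illustration.

```python
import json


def to_text(value):
    # Roughly mirror jq's tostring: strings pass through unchanged,
    # anything else is rendered as JSON text (e.g. 123 -> "123").
    return value if isinstance(value, str) else json.dumps(value)


def item_rows(lines):
    """Yield one 'id total_size' string per JSON item (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        item = json.loads(line)
        yield to_text(item.get("id")) + " " + to_text(item.get("total_size"))


# Made-up sample record, not real census data.
sample = ['{"id": "example_item", "total_size": 12345}']
for row in item_rows(sample):
    print(row)  # prints: example_item 12345
```

To run it over a real file, pass an open file object instead of the sample list, since iterating over a file yields its lines.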

To get the hash and name for each file, you have to split up the "files" array and get the info from each element:

jq -r '.files | .[] | [.md5, " ", .name | tostring] | add'
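The per-file version can be sketched in Python the same way. This again assumes one JSON object per line, each with a "files" array whose elements carry "md5" and "name" fields, as the jq filter above implies; the function names and the sample record are hypothetical.

```python
import json


def to_text(value):
    # Roughly mirror jq's tostring: strings pass through unchanged,
    # anything else is rendered as JSON text (e.g. None -> "null").
    return value if isinstance(value, str) else json.dumps(value)


def file_rows(lines):
    """Yield one 'md5 name' string per file of every item (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        item = json.loads(line)
        # Unlike the jq filter, an item without a "files" array is skipped
        # here instead of raising an error.
        for entry in item.get("files") or []:
            yield to_text(entry.get("md5")) + " " + to_text(entry.get("name"))


# Made-up sample record, not real census data.
sample = [
    '{"id": "example_item", "files": ['
    '{"name": "a.txt", "md5": "d41d8cd98f00b204e9800998ecf8427e"}, '
    '{"name": "b.txt", "md5": "0cc175b9c0f1b6a831c399e269772661"}]}'
]
for row in file_rows(sample):
    print(row)
```

Each item can produce several output rows, one per element of its "files" array, which is the same fan-out that `.files | .[]` does in jq.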

--Sep332 10:01, 12 March 2015 (EDT)

2012 census

In August 2012 I did a "census" using the search engine's export capabilities. The Internet Archive had 4.9 million items at that time. Emijrp (talk) 06:42, 20 November 2016 (EST)