Quizlet

From Archiveteam
Revision as of 02:46, 21 June 2018 by Adinbied (talk | contribs) (Formatting)
URL: https://quizlet.com/
Status: Online!
Archiving status: Not saved yet
Archiving type: Unknown
IRC channel: #archiveteam-bs (on hackint)

Quizlet is a mobile and web-based study application that lets students study information through learning tools and games. It is currently used by one in two high school students and one in three college students in the United States. Quizlet trains students with flashcards and various games and tests. As of April 30, 2018, Quizlet had over 200 million user-generated flashcard sets and more than 30 million active users. It ranks among the top 50 websites in the U.S.

Archival

Quizlet set IDs are assigned sequentially: the earliest public set has the ID 173, and one of the more recent sets has an ID above 300000000. Quizlet has an open API (see https://quizlet.com/api/2.0/docs) that returns a JSON copy of each set. An example API result can be seen here. Back-of-the-napkin math suggests that 300,000,000 public sets would take about 400 GB to store uncompressed.
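The storage estimate above can be turned around to see what average set size it implies. The following is only a sketch of that arithmetic; the function name is made up for illustration, and it assumes decimal gigabytes (1 GB = 10^9 bytes).

```python
def implied_bytes_per_set(total_gb, num_sets):
    """Average size per set implied by a total storage estimate.

    Assumes decimal gigabytes (1 GB = 10**9 bytes).
    """
    return total_gb * 10**9 / num_sets


if __name__ == "__main__":
    # Figures from the estimate above: 400 GB for 300,000,000 sets.
    avg = implied_bytes_per_set(400, 300_000_000)
    print(f"about {avg:.0f} bytes per set")
```

In other words, the 400 GB figure corresponds to an average of roughly 1.3 KB of JSON per set.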

Grabbing the Data

As of now, I have not found a reliable way to download everything. The initial Python script I wrote, which walks through the set IDs one at a time, fetches each set via the API, and saves it as a .txt file, works but is painfully slow: after a week of running it on three machines, I had only about 3 million sets downloaded. I have tried multithreading and multiprocessing, but have been unable to match even that amount with those methods. Maybe someone else will have more luck.
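For anyone who wants to experiment, here is a minimal sketch of an incremental grab using a thread pool. It is not the original script, and since threading reportedly did not help, treat it as a starting point rather than a known fix. The endpoint shape in `API_URL`, the `client_id` parameter, and the names `fetch_set` and `grab_range` are assumptions for illustration; check them against the API docs before use.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# ASSUMPTION: endpoint shape for the Quizlet API 2.0; verify against the docs.
API_URL = "https://api.quizlet.com/2.0/sets/{set_id}?client_id={client_id}"


def fetch_set(set_id, client_id="YOUR_CLIENT_ID"):
    """Return the raw response bytes for one set, or None if the request fails."""
    url = API_URL.format(set_id=set_id, client_id=client_id)
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.read()
    except OSError:  # URLError, HTTPError and socket timeouts all derive from OSError
        return None


def grab_range(start, stop, out_dir, fetch=fetch_set, workers=16):
    """Fetch set IDs in [start, stop) concurrently and save each hit as <id>.json.

    IDs that already have a file in out_dir are skipped, so a run can be resumed.
    Returns the number of sets newly saved.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    ids = [i for i in range(start, stop) if not (out / f"{i}.json").exists()]
    saved = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as ids.
        for set_id, body in zip(ids, pool.map(fetch, ids)):
            if body is None:
                continue  # private, deleted, or failed request
            (out / f"{set_id}.json").write_bytes(body)
            saved += 1
    return saved


if __name__ == "__main__":
    # Small batch starting at the earliest public set ID mentioned above.
    print(grab_range(173, 273, "quizlet_sets"))
```

`pool.map` submits the whole range up front, so call `grab_range` on modest batches rather than on all 300 million IDs at once. For scale: 3 million sets in a week across three machines is roughly 5 sets per second in total, and at that rate 300 million IDs would take on the order of 100 weeks.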