500px

From Archiveteam
Revision as of 06:25, 29 June 2018 by Adinbied (talk | contribs) (Added lots of info about Archival, API, IRC, etc.)

{{Infobox project | title = 500px | image = 500pxdotcom screenshot.png | description = High-quality photo sharing & selling site | URL = http://www.500px.com | project_status = {{endangered}} | archiving_status = Not saved yet | irc = 500pieces }}

500px is a photo-sharing site that caters to high-quality photos. It gives photographers ways to sell their images and offers a large collection of images to view. On June 30, 2018, they are removing all Creative Commons images from their site (see https://support.500px.com/hc/en-us/articles/360005097533).


Archival

My method of getting API info: using the Burp Suite Pro network security tools, I set up a MITM attack between the server and a VM with a custom SSL CA certificate installed. After intercepting a request to api.500px.com, I cloned the request and sent it to the "Intruder" tool, where I set the page string in the API GET request as the 'payload', then had it auto-increment the number while sending the requests and saving the responses. I set the limit to 1000, although I ended up stopping it at around 900 because I noticed the responses were coming back empty (and there's a total pages number in the API info).

I 7-zipped all of the responses and put them up on the IA for someone to have a go at if they want, because after writing this I'm heading to bed.

Attribution License 3.0, all API info: https://archive.org/details/AttributionLicense3APISeverResponses.7z

Example of one of the responses: https://pastebin.com/TygNSTSu
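The page enumeration described above (increment the page number, save each response, stop once the responses come back empty or the total page count is reached) can also be sketched without Burp Suite. This is only a rough sketch under assumptions: the endpoint path in `API_URL` is a placeholder, the JSON field names `photos` and `total_pages` are guesses based on the "total pages number" mentioned above, and any headers or keys the real API needs are not shown.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, Tuple

# Placeholder only: the real endpoint path and required parameters are not
# given on this page and would have to be copied from an intercepted request.
API_URL = "https://api.500px.com/EXAMPLE_ENDPOINT"


def http_fetch(page: int, base_url: str = API_URL) -> str:
    """Fetch one page of results as text (hypothetical 'page' parameter)."""
    query = urllib.parse.urlencode({"page": page})
    with urllib.request.urlopen(f"{base_url}?{query}") as resp:
        return resp.read().decode("utf-8")


def crawl_pages(
    fetch: Callable[[int], str], limit: int = 1000
) -> Iterator[Tuple[int, Dict]]:
    """Yield (page_number, parsed_response) until the pages run out."""
    total_pages = None
    for page in range(1, limit + 1):
        data = json.loads(fetch(page))
        # Assumed field name: "total_pages".
        if total_pages is None:
            total_pages = data.get("total_pages")
        # Assumed field name: "photos". An empty list means we are past the end.
        if not data.get("photos"):
            break
        yield page, data
        if total_pages is not None and page >= total_pages:
            break
```

Passing the fetch function in as an argument keeps the stopping logic separate from the network call, so the loop can be tried against saved responses before pointing it at the live server. Each yielded response could then be written out with `json.dump` to one file per page.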

I also had a go at writing a Python script that, given a list of URLs, parses and downloads all of the metadata and photos from those URLs: https://github.com/adinbied/500pxBU
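For anyone who wants the general shape of a list-driven downloader without reading the repository, here is a minimal sketch. It is not the code from 500pxBU; the function names, the `500px_backup` output directory, and the filename scheme are all made up for illustration, and it saves each URL as-is rather than parsing the metadata.

```python
import os
import urllib.parse
import urllib.request
from typing import Iterable, List


def read_url_list(text: str) -> List[str]:
    """Return the non-empty, non-comment lines of a URL list."""
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def filename_for(url: str) -> str:
    """Derive a safe local filename from the last path segment of a URL."""
    path = urllib.parse.urlparse(url).path
    name = os.path.basename(path.rstrip("/")) or "index"
    return "".join(c if c.isalnum() or c in "._-" else "_" for c in name)


def download_all(urls: Iterable[str], out_dir: str = "500px_backup") -> None:
    """Save every URL into out_dir, skipping files that already exist."""
    os.makedirs(out_dir, exist_ok=True)
    for url in urls:
        target = os.path.join(out_dir, filename_for(url))
        if os.path.exists(target):
            continue  # already fetched on an earlier run
        with urllib.request.urlopen(url) as resp, open(target, "wb") as out:
            out.write(resp.read())
```

Skipping files that already exist means an interrupted run can simply be restarted with the same list.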

Hopefully someone can pick up where I left off using what I've posted - I should be back around 3PM UTC on 6/29/18.

~adinbied