Difference between revisions of "500px"
m (Reverted edits by Megalanya0 (talk) to last revision by Start)
(Added lots of info about Archival, API, IRC, etc.)
{{Infobox project
| title = 500px
| image = 500pxdotcom screenshot.png
| description = High-quality photo sharing & selling site
| URL = {{url|1=http://www.500px.com}}
| project_status = {{endangered}}
| archiving_status = {{notsavedyet}}
| irc = 500pieces
}}
'''500px''' is a photo-sharing site that caters to high-quality photos. It gives photographers ways to sell their images, as well as providing a large collection of images to view. On June 30th, 2018, they are removing all Creative Commons images from the site (see https://support.500px.com/hc/en-us/articles/360005097533).
{{Navigation box}}
== Archival ==
My method of getting the API info: using Burp Suite Pro, I set up a man-in-the-middle (MITM) proxy between the server and a VM with a custom SSL CA certificate installed. After intercepting a request to api.500px.com, I cloned the request and sent it to the "Intruder" tool, where I set the page parameter in the GET request as a 'payload' and had it auto-increment while processing the requests and saving the responses. I set the limit to 1000 pages, although I ended up stopping at around 900 because the responses were turning up empty (there's also a total-pages number in the API info). I 7-Zipped all of the responses and uploaded them to the Internet Archive for someone to have a go at if they want, because after writing this I'm heading to bed.
Attribution License 3.0 API info (all responses): https://archive.org/details/AttributionLicense3APISeverResponses.7z
Example of one of the responses: https://pastebin.com/TygNSTSu
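The page-enumeration step above (auto-increment the page number, save each response, stop once pages come back empty) can be sketched in Python. The endpoint and query parameters (`page`, `rpp`) below are assumptions for illustration, not the exact request used in the Intruder run; the fetch function is injected (in practice it would wrap `urllib.request.urlopen`) so the sketch stays self-contained:

```python
import json
from typing import Callable, Dict, List

# Hypothetical endpoint; the exact URL/query string used is not recorded here.
API_URL = "https://api.500px.com/v1/photos"

def build_page_url(base: str, page: int) -> str:
    """Build the URL for one page of results (parameter names are assumed)."""
    return f"{base}?page={page}&rpp=100"

def scrape_pages(fetch: Callable[[str], str], base: str, max_pages: int) -> List[Dict]:
    """Request pages 1..max_pages, saving each JSON response, and stop early
    once the API starts returning pages with no photos (mirroring the
    'responses were turning empty' observation)."""
    saved = []
    for page in range(1, max_pages + 1):
        data = json.loads(fetch(build_page_url(base, page)))
        if not data.get("photos"):  # empty page: we've run off the end
            break
        saved.append(data)
    return saved
```

Injecting `fetch` also makes it easy to add rate limiting or retries without touching the pagination logic.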
I also had a go at writing a Python script that, given a list of URLs, parses and downloads all of the metadata and photos from those URLs: https://github.com/adinbied/500pxBU
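The linked script's approach (for each record, pull out the metadata and fetch the photo itself) can be sketched as follows. The field names used here (`id`, `name`, `image_url`) are assumptions about the response shape, not confirmed API fields, and the download function is injected to keep the sketch self-contained:

```python
from typing import Callable, Dict, List, Tuple

def parse_record(record: Dict) -> Tuple[str, Dict]:
    """Split one photo record into (image URL, metadata).
    Field names ("id", "name", "image_url") are assumed, not confirmed."""
    meta = {k: record.get(k) for k in ("id", "name")}
    return record.get("image_url", ""), meta

def backup_photos(records: List[Dict],
                  download: Callable[[str], bytes]) -> List[Dict]:
    """For each record, keep its metadata plus the size of the fetched photo
    (zero when the record carries no image URL)."""
    saved = []
    for rec in records:
        url, meta = parse_record(rec)
        meta["bytes"] = len(download(url)) if url else 0
        saved.append(meta)
    return saved
```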
Hopefully someone can pick up where I left off using what I've posted - I should be back around 3PM UTC on 6/29/18.
~adinbied
Revision as of 06:25, 29 June 2018