Difference between revisions of "Posterous"

From Archiveteam

Revision as of 22:38, 10 March 2013

Posterous
URL http://posterous.com
Project status Closing
Archiving status In progress...
Project source Unknown
Project tracker here
IRC channel #preposterus (on EFnet)
Project lead Unknown

Posterous is a blogging platform started in May 2008. It was acquired by Twitter on March 12, 2012 and will shut down April 30, 2013. Announcement

Warrior

You can help by installing and running the ArchiveTeam Warrior and selecting the "posterous" project.

Seesaw script (for advanced users)

Download:

git clone https://github.com/ArchiveTeam/posterous-grab.git

Follow the instructions below to install seesaw and edit the script to set your IP address.

For wget: run ./get-wget-lua.sh

Commands:

Make sure you place an IP address after --bind-address= on line 175. Example: "--bind-address=192.168.1.1",

git clone https://github.com/ArchiveTeam/posterous-grab.git
cd posterous-grab
git clone https://github.com/ArchiveTeam/seesaw-kit
cd seesaw-kit
sudo pip install -r requirements.txt
sudo pip install seesaw
cd ../
chmod +x get-wget-lua.sh && ./get-wget-lua.sh
run-pipeline --concurrent 1 --address <your_ip_address> pipeline.py <your_username>

Site List Grab

We have assembled a list of Posterous sites that need grabbing. Total found: 9,898,986

http://archive.org/details/2013-02-22-posterous-hostname-list

Tools: git

Goal

We found about 9.9 million possible Posterous accounts. After filtering out the banned/spam accounts, we have 6,677,720 left.

Posterous closes on April 30th, 2013. We have 50 days left and 1,200,000 accounts downloaded.

60 sec * 60 min * 24 hours = 86,400 seconds a day

(6,677,720 - 1,200,000)/ 86,400 = 63.4 days at 1 account a second.

63.4 days (at 1 fetch a second) / 50 days left = 1.268, which rounds up to 2 accounts per second actually needed.

Taking into account that not all accounts are the same size, and the outages we have already had, a safe number is 3x the above answer. So we need to download 6 full accounts per second to be confident of getting all of Posterous before it shuts down. This also assumes that we will not have to redownload any accounts at the end.
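
The arithmetic above can be re-run as the numbers change. This is only a rough sketch: the function name and the safety_factor parameter are made up for illustration, and the 3x margin is the rule of thumb from the text, not a measured value.

```python
import math

SECONDS_PER_DAY = 60 * 60 * 24  # 86,400 seconds a day


def required_rate(total_accounts, downloaded, days_left, safety_factor=3):
    """Return (days at 1 account/s, minimum accounts/s, safe accounts/s).

    safety_factor is the 3x rule of thumb from the text (an assumption).
    """
    remaining = total_accounts - downloaded
    days_at_one_per_second = remaining / SECONDS_PER_DAY
    minimum_rate = days_at_one_per_second / days_left
    # Round up to whole accounts per second before applying the margin.
    safe_rate = math.ceil(minimum_rate) * safety_factor
    return days_at_one_per_second, minimum_rate, safe_rate


days, rate, safe = required_rate(6_677_720, 1_200_000, 50)
print(f"{days:.1f} days at 1 account a second")
print(f"{rate:.3f} accounts per second minimum")
print(f"{safe} accounts per second with the safety margin")
```

Update downloaded and days_left as the grab progresses to see whether the current rate is still enough.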