https://wiki.archiveteam.org/api.php?action=feedcontributions&user=DukeNukem&feedformat=atom
Archiveteam - User contributions [en]
2024-03-28T11:52:31Z
User contributions
MediaWiki 1.37.1
https://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17821
ArchiveTeam Warrior
2013-10-17T20:03:04Z
<p>DukeNukem: /* Warrior FAQ */</p>
<hr />
<div>==What is the Archive Team Warrior?==<br />
<br />
[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The Archive Team Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
==Basic usage==<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware workstation/player, or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
==Warrior FAQ==<br />
__TOC__<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun are the primary goals of all Archive Team activities, serious stress on the target's server may occur. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox as long as you are running a relatively recent version. The option is not offered in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
<br />
For more info, consult the [http://www.virtualbox.org/manual/ch06.html#network_bandwidth_limit VirtualBox manual (Chapter 6, Section 9)].<br />
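<br />
The two steps can be sketched as a dry run that only prints the commands, so nothing is changed until you copy them yourself. The VM name and the limit are taken from the text above; the group name <code>Limit</code> and the <code>modifyvm --nicbandwidthgroup1</code> step (attaching the group to the first network adapter, as the VirtualBox manual describes) are assumptions that are not part of the original instructions, so check them against your VirtualBox version.<br />

```shell
#!/bin/sh
# Dry-run sketch: prints VBoxManage commands instead of executing them.
# Assumptions: the bandwidth group is called "Limit" and the warrior uses
# network adapter 1. Neither is stated in the instructions above.

print_limit_commands() {
    vm="$1"      # VM name, e.g. archiveteam-warrior-2
    limit="$2"   # limit value, passed through unchanged

    # Create the bandwidth group (newer syntax quoted above).
    printf 'VBoxManage bandwidthctl %s add Limit --type network --limit %s\n' "$vm" "$limit"

    # Assumption: attach the group to the first network adapter.
    printf 'VBoxManage modifyvm %s --nicbandwidthgroup1 Limit\n' "$vm"
}

print_limit_commands archiveteam-warrior-2 3
```

Review the printed commands before running them by hand.<br />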
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can tell the admins on IRC what happened, and they can clear any claims your username may have made, although this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may download anything from a single small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Unused virtual machine disk space does not take up space on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration that sets up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not work, you may need to double-check your machine's network settings.<br />
<br />
=== The warrior can't connect to the internet? ===<br />
<br />
The virtual machine may have picked up the address of the local DNS cache on your computer, which the virtual machine cannot reach.<br />
<br />
If you experience this on VirtualBox, see [http://askubuntu.com/questions/204953/virtualbox-dns-stopped-working-on-upgrade-to-12-10 this question and answer].<br />
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== I'm looking at the leaderboard. What's that icon beside the username? ===<br />
<br />
That's just the warrior logo: [[File:Archive_team.png|42px]] (click on the image for a larger version). It means that person is using the warrior. Those without the icon are running the scripts manually.<br />
<br />
=== What's that guy doing in the logo? ===<br />
<br />
The place is on fire! But don't worry, he safely escaped with the rescued data in his arms.<br />
<br />
=== I want to log in to the virtual machine. How do I do this? ===<br />
<br />
Unless you know what you are doing, you should not need to do this. But if you want to, the username is <code>root</code> and the password is <code>archiveteam</code>.<br />
<br />
Press ALT+F3 to switch to virtual console number 3. Use ALT+Left or ALT+Right to switch between virtual consoles. There are 6 virtual consoles in total. Number 1 and 2 are reserved for the warrior.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I just imported the ova image and the warrior is stuck on "Preparing the data partition" ===<br />
<br />
This issue has cropped up before and we do not know what causes it. It is recommended to just delete the warrior image and import the ova again. Testing shows the import works the majority of the time.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || August 8, 2012 || Success || [http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/ all releases]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|-<br />
| [[Blip.tv]] || Active || October 11, 2013 || || || <br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>master</code> branch to the <code>development</code> branch, or by creating another branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in as <code>root</code>, password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code2</code></li><br />
<li><code>sudo -u warrior git checkout development</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>master</code> branch.<br />
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (Linux 2.6.32 was released 2009-12-03)<br />
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1, cpan 1.9402 (still needs config)<br />
* gcc 4.4.5, make 3.81, bash 4.1.5<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in <code>/home/warrior/projects/<PROJECTNAME>/</code><br />
<br />
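The boot sequence listed in this section hinges on a flag file: <code>boot-part-2.sh</code> touches <code>/dev/shm/ready-for-warrior</code>, and the process set up by <code>warrior-runner.sh</code> reacts to it. How that process watches the file is not documented here. The sketch below assumes simple polling, uses a temporary directory in place of <code>/dev/shm</code>, and substitutes a hypothetical <code>launch_warrior</code> function for <code>run-warrior</code>.<br />

```shell
#!/bin/sh
# Sketch of a flag-file trigger, assuming one-second polling.
# This is an illustration, not the code in warrior-runner.sh.

wait_for_flag() {
    flag="$1"    # path of the flag file to watch
    tries="$2"   # give up after this many one-second polls
    while [ "$tries" -gt 0 ]; do
        if [ -e "$flag" ]; then
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Hypothetical placeholder for /usr/local/bin/run-warrior.
launch_warrior() {
    echo "launching run-warrior"
}

workdir="$(mktemp -d)"
flag_path="$workdir/ready-for-warrior"

# Plays the role of boot-part-2.sh: create the flag after setup is done.
( sleep 1; touch "$flag_path" ) &

if wait_for_flag "$flag_path" 5; then
    launch_warrior
fi

wait
rm -rf "$workdir"
```

Keeping the flag in <code>/dev/shm</code> means it lives in memory and disappears on reboot, so each boot starts from the "not ready" state.<br />
<br />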
# Start the virtual machine<br />
# Linux boots<br />
# The user <code>warrior</code> is automatically logged in.<br />
# <code>/etc/inittab</code> kicks off <code>/home/warrior/warrior-code2/boot.sh</code>.<br />
## This will <code>git pull https://github.com/ArchiveTeam/warrior-code2</code> into <code>/home/warrior/warrior-code2/</code>.<br />
## <code>/home/warrior/warrior-code2/warrior-runner.sh</code> sets up a process which monitors <code>/dev/shm/ready-for-warrior</code> and launches <code>run-warrior</code> when the state changes.<br />
# <code>boot.sh</code> launches <code>/home/warrior/warrior-code/boot-part-2.sh</code><br />
# <code>boot-part-2.sh</code> is a short script that does the following:<br />
## <code>./warrior-install.sh</code><br />
### install/update seesaw, check branch, version<br />
### install framebuffer support, DNS caching<br />
### sets up <code>/data</code><br />
## <code>sudo ./make-data-disk.sh</code><br />
### cleans up<br />
### creates and prepares the partition<br />
### <code>mkdir -p /home/warrior/projects</code><br />
## <code>touch /dev/shm/ready-for-warrior</code><br />
### triggers the launch of <code>/usr/local/bin/run-warrior</code> which launches <code>/home/warrior/warrior-code2/src/seesaw/run-warrior</code><br />
### contacts warriorhq.archiveteam.org and requests the <code>projects.json</code> file. This file contains the projects you see in the Available Projects page.<br />
## <code>./say-hello.sh</code><br />
### setup vmware port forwarding<br />
### show splash screen<br />
# Point your web browser to http://localhost:8001 and go.</div>
DukeNukem
https://wiki.archiveteam.org/index.php?title=Nwnyet&diff=17820
Nwnyet
2013-10-17T19:19:55Z
<p>DukeNukem: </p>
<hr />
<div>{{Infobox project<br />
| title = nwnyet<br />
| URL = {{url|1=http://www.telinco.co.uk/}}<br />
| project_status = {{closed}}<br />
| archiving_status = {{saved}}<br />
| irc = nwnyet<br />
}}<br />
<br />
{{update_me}}<br />
<br />
{{expand}}<br />
<br />
== The story ==<br />
Over the years UK ISPs have been bought and sold so many times that trying to keep track is a mess. Someone decided it was time to delete pages from the 1990s, so http://www.telinco.co.uk and http://www.nwnet.co.uk/ and the user pages they host are being deleted.<br />
<br />
== The Plan ==<br />
We are searching for links to these sites by going through the Common Crawl corpus, the Wayback Machine, Google, and Bing. In addition, brute-force methods are being employed for URL discovery.<br />
<br />
* The discovered URLs are on our [http://pad.archivingyoursh.it/p/nwnet.co.uk etherpad]<br />
* telinco.co.uk [http://www.privatepaste.com/6a4e3ea6f1 subdomain list]<br />
<br />
The collected URLs were downloaded using wget and uploaded to 3 items on the Internet Archive.</div>
DukeNukem
https://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17725
Blip.tv
2013-10-11T14:17:05Z
<p>DukeNukem: </p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| logo = Blip_web_logo.png<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}} ('''special case''')<br />
| archiving_status = {{inprogress}}<br />
| source = [https://github.com/ArchiveTeam/blip.tv-grab-video-only blip.tv-grab-video-only]<br />
| tracker = [http://tracker.archiveteam.org/bloopertv/ bloopertv]<br />
| irc = blooper.tv<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Blip.tv 2.0 ==<br />
<br />
[[File:Blip.jpg]]<br />
<br />
<blockquote><br />
<p>Blip.tv Acquired By Video-Blog Killers Maker Studios</p><br />
<p>''Posted on Oct 8 2013 - 3:12pm by Zennie Abraham''</p><br />
<p>Blip.tv, the video sharing site that was created by a team lead by Mike Hudack and Dina Kaplan, is dead. It’s now called “the old Blip.tv” and has been replaced by something owned by that horrible video-channel eating company network Maker Studios. (Mike and Dina left Blip in 2012.)</p><br />
<br />
<p>And if you’re asking “Is that the same Maker Studios that took YouTube Partner Ray William Johnson’s Google AdSense account and never gave it back to him? The same Maker Studios that pushed Pew Die Pie at us on YouTube? The same Maker Studios that was founded by Danny Zappin, Lisa Donovan (LisaNova on YouTube), Scott Katz, Derek Jones and Will Watkin? The same Maker Studios that’s involved in a nasty lawsuit between Mr. Zappin and the others, including his ex-girlfriend Lisa Donovan?</p><br />
<br />
<p>The answer is yes.<ref>http://www.zennie62blog.com/2013/10/08/blip-tv-acquired-by-video-blog-killers-maker-studios-94233/</ref></p><br />
</blockquote><br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination, but the page links are only '#' and need JavaScript to work.<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 pagination links are shown at a time. To find out how many pages a category has, you must click the "double arrow right" at the bottom of the page, which requires JavaScript.<br />
* Each page of results in a category lists only 8 shows at a time.<br />
* Each show has a varying number of episodes.<br />
* To view an episode list for a show you must have JavaScript enabled. This is also true for pagination on these pages.<br />
* Some shows have RSS feeds. Example: http://blip.tv/ylse/rss<br />
* RSS feeds only show a partial list of episodes. http://blip.tv/schlomo/ has more episodes than are listed in its RSS feed.<br />
* If you have the URL of a video's page (for example, http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646), get-flash-video can download the video as an MP4 file.<br />
* The robots.txt crawl delay is 1 second.<br />
* Here are some 20,000 blip.tv URLs from [[URLTeam]]: http://paste.archivingyoursh.it/raw/gidaqotimo.sm<br />
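<br />
Because the pagination targets sit in the <code>data-results_page</code> attribute, they can be read without JavaScript. A minimal sketch, assuming one anchor per line as in the example markup above; the function names are made up for illustration, and whether the extracted paths can be fetched directly from blip.tv has not been verified here.<br />

```shell
#!/bin/sh
# Extract the data-results_page value from each pagination anchor.
# Assumes one anchor per line, as in the sample markup from the list above.

extract_results_pages() {
    # Reads HTML on stdin and prints one attribute value per line.
    sed -n 's/.*data-results_page="\([^"]*\)".*/\1/p'
}

# Sample markup copied from the Site Structure notes.
sample_html() {
cat <<'EOF'
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a>
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a>
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a>
EOF
}

sample_html | extract_results_pages
```

Each printed line is a site-relative path; anything fetching them should honour the 1 second crawl delay noted above.<br />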
<br />
=== sitemap ===<br />
<br />
* The sitemap is http://blip.tv/sitemap/xml/bliptv-sitemap-index.xml which links to more sitemaps.<br />
* Contains 3,397 shows that consist of 1 or more episodes.<br />
* Here is a pretty-printed example of one of the sitemap files: http://paste.archivingyoursh.it/vomilosedu.xml<br />
<br />
== URL Discovery ==<br />
<br />
Here are all the video URLs for blip.tv from the sitemap files, sorted and de-duplicated. Total count: 228,133. https://archive.org/details/2013_10_09_bliptv_urls<br />
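<br />
A list like this can be produced by pulling the <code>&lt;loc&gt;</code> entries out of each sitemap file and passing them through <code>sort -u</code>. The sketch below is not the script that built the list. It assumes each <code>&lt;loc&gt;</code> element sits on its own line, and the sample data is illustrative: the <code>example-show</code> URL is invented, and the other is the video page mentioned under Site Structure.<br />

```shell
#!/bin/sh
# Extract <loc> URLs from sitemap XML, then sort and de-duplicate them.
# Assumes one <loc> element per line (true for pretty-printed sitemaps).

extract_locs() {
    # Reads sitemap XML on stdin and prints one URL per line.
    sed -n 's/.*<loc>\([^<]*\)<\/loc>.*/\1/p'
}

# Illustrative sample with one duplicated entry.
sample_sitemap() {
cat <<'EOF'
<urlset>
  <url><loc>http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646</loc></url>
  <url><loc>http://blip.tv/example-show/example-episode-1</loc></url>
  <url><loc>http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646</loc></url>
</urlset>
EOF
}

sample_sitemap | extract_locs | sort -u
```

With real data, the downloaded sitemap files would be concatenated into <code>extract_locs</code> in place of <code>sample_sitemap</code>.<br />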
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
* [http://support.blip.tv/entries/23277196-An-Important-Update-from-Blip-Regarding-Account-Removals An Important Update from Blip Regarding Account Removals]<br />
<br />
== References ==<br />
<br />
<references/><br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>
DukeNukem
https://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17721
ArchiveTeam Warrior
2013-10-10T19:58:59Z
<p>DukeNukem: /* Testing pre-production code */</p>
<hr />
<div>----<br />
==What is the Archive Team Warrior?==<br />
<br />
[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The Archive Team Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
==Basic usage==<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware workstation/player, or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun are the primary goals of all Archive Team activities, serious stress on the target's server may occur. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox as long as you are running a relatively recent version. The option is not offered in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can tell the admins on IRC what happened, and they can clear any claims your username may have made, although this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may download anything from a single small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Unused virtual machine disk space does not take up space on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration that sets up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not work, you may need to double-check your machine's network settings.<br />
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || August 8, 2012 || Success || [http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>master</code> branch to the <code>development</code> branch, or by creating another branch and switching to that.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li>
<li><code>cd /home/warrior/warrior-code2</code></li><br />
<li><code>sudo -u warrior git checkout development</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>master</code> branch.<br />
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (upstream 2.6.32 was released 2009-12-03)
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1, cpan 1.9402 (still needs config)<br />
* gcc 4.4.5, make 3.81, bash 4.1.5<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in /home/warrior/projects/<PROJECTNAME>/<br />
<br />
1. Start the virtual machine.
2. Linux boots.
3. The user warrior is automatically logged in.
4. /etc/inittab kicks off /home/warrior/warrior-code2/boot.sh, which runs git pull from https://github.com/ArchiveTeam/warrior-code2 into /home/warrior/warrior-code2/. /home/warrior/warrior-code2/warrior-runner.sh sets up a process that monitors /dev/shm/ready-for-warrior and launches run-warrior when the state changes.
5. boot.sh launches /home/warrior/warrior-code2/boot-part-2.sh.
6. boot-part-2.sh is a short script that runs the following, in order:
* ./warrior-install.sh
** installs or updates seesaw, and checks the branch and version
** installs framebuffer support and DNS caching
** sets up /data
* sudo ./make-data-disk.sh
** cleans up
** creates and prepares the partition
* mkdir -p /home/warrior/projects
* touch /dev/shm/ready-for-warrior
** triggers the launch of /usr/local/bin/run-warrior, which launches /home/warrior/warrior-code2/src/seesaw/run-warrior
* ./say-hello.sh
** sets up VMware port forwarding
** shows the splash screen
7. Point your web browser to http://localhost:8001 and go.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17720ArchiveTeam Warrior2013-10-10T08:14:38Z<p>DukeNukem: </p>
<hr />
<div>----<br />
==What is the Archive Team Warrior?==<br />
<br />
[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The Archive Team Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
==Basic usage==<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware workstation/player, or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun are the primary goals of all Archive Team activities, serious stress on the target's server may occur. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox, as long as you are running a relatively recent version. The option is not offered in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
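<br />
As a rough illustration only, the two command forms above can be assembled by a small Python helper. The function and parameter names are made up for this sketch; the <code>bandwidthctl</code> arguments are copied from the commands above, and the extra <code>modifyvm --nicbandwidthgroup1</code> step, which attaches the group to the VM's first network adapter, is an assumption taken from the VirtualBox manual rather than from this page.<br />

```python
import shlex


def bandwidth_commands(vm_name, group, limit, new_syntax=True):
    """Return two VBoxManage argument lists: create a group, then attach it.

    new_syntax=True uses the newer "add <name> --type network" form quoted
    above; new_syntax=False uses the older "--name <name> --add network" form.
    """
    if new_syntax:
        create = ["VBoxManage", "bandwidthctl", vm_name,
                  "add", group, "--type", "network", "--limit", str(limit)]
    else:
        create = ["VBoxManage", "bandwidthctl", vm_name,
                  "--name", group, "--add", "network", "--limit", str(limit)]
    # Assumption (VirtualBox manual, not this page): the group has no effect
    # until it is assigned to a network adapter of the VM.
    attach = ["VBoxManage", "modifyvm", vm_name,
              "--nicbandwidthgroup1", group]
    return [create, attach]


if __name__ == "__main__":
    # Only prints the commands; nothing is executed.
    for cmd in bandwidth_commands("archiveteam-warrior-2", "netlimit", 3):
        print(" ".join(shlex.quote(part) for part in cmd))
```

The helper only builds and prints the commands. Check them against the <code>VBoxManage bandwidthctl</code> help output of your VirtualBox version before running them.<br />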
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can alert the admins via IRC of what has happened, and they can clear the claims your username may have made, although this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may be downloading anything from a single small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Virtual machine disk space that has not been used takes up no space on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration that sets up port forwarding to the guest machine on port 8001, so you can access the interface through your web browser. If this does not happen, you may need to double-check your machine's network settings.<br />
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || Aug 8, 2012 || Success || <br />
[http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.<br />
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (upstream 2.6.32 was released 2009-12-03)
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1, cpan 1.9402 (still needs config)<br />
* gcc 4.4.5, make 3.81, bash 4.1.5<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in /home/warrior/projects/<PROJECTNAME>/<br />
<br />
1. Start the virtual machine.
2. Linux boots.
3. The user warrior is automatically logged in.
4. /etc/inittab kicks off /home/warrior/warrior-code2/boot.sh, which runs git pull from https://github.com/ArchiveTeam/warrior-code2 into /home/warrior/warrior-code2/. /home/warrior/warrior-code2/warrior-runner.sh sets up a process that monitors /dev/shm/ready-for-warrior and launches run-warrior when the state changes.
5. boot.sh launches /home/warrior/warrior-code2/boot-part-2.sh.
6. boot-part-2.sh is a short script that runs the following, in order:
* ./warrior-install.sh
** installs or updates seesaw, and checks the branch and version
** installs framebuffer support and DNS caching
** sets up /data
* sudo ./make-data-disk.sh
** cleans up
** creates and prepares the partition
* mkdir -p /home/warrior/projects
* touch /dev/shm/ready-for-warrior
** triggers the launch of /usr/local/bin/run-warrior, which launches /home/warrior/warrior-code2/src/seesaw/run-warrior
* ./say-hello.sh
** sets up VMware port forwarding
** shows the splash screen
7. Point your web browser to http://localhost:8001 and go.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17719Blip.tv2013-10-09T22:46:17Z<p>DukeNukem: </p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| logo = Blip_web_logo.png<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = blooper.tv<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Blip.tv 2.0 ==<br />
<br />
[[File:Blip.jpg]]<br />
<br />
<blockquote><br />
<p>Blip.tv Acquired By Video-Blog Killers Maker Studios</p><br />
<p>''Posted on Oct 8 2013 - 3:12pm by Zennie Abraham''</p><br />
<p>Blip.tv, the video sharing site that was created by a team lead by Mike Hudack and Dina Kaplan, is dead. It’s now called “the old Blip.tv” and has been replaced by something owned by that horrible video-channel eating company network Maker Studios. (Mike and Dina left Blip in 2012.)</p><br />
<br />
<p>And if you’re asking “Is that the same Maker Studios that took YouTube Partner Ray William Johnson’s Google AdSense account and never gave it back to him? The same Maker Studios that pushed Pew Die Pie at us on YouTube? The same Maker Studios that was founded by Danny Zappin, Lisa Donovan (LisaNova on YouTube), Scott Katz, Derek Jones and Will Watkin? The same Maker Studios that’s involved in a nasty lawsuit between Mr. Zappin and the others, including his ex-girlfriend Lisa Donovan?</p><br />
<br />
<p>The answer is yes.<ref>http://www.zennie62blog.com/2013/10/08/blip-tv-acquired-by-video-blog-killers-maker-studios-94233/</ref></p><br />
</blockquote><br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination, but the page links are only '#' and need JavaScript to work.<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page, which requires JavaScript.<br />
* Each page of results in a category shows only 8 shows at a time.<br />
* Each show has some number of episodes.<br />
* To view an episode list for a show you must have JavaScript enabled. This is also true for pagination on these pages.<br />
* Some shows have rss feeds. Example http://blip.tv/ylse/rss<br />
* RSS feeds only show a partial list of episodes. http://blip.tv/schlomo/ has more episodes than listed in the RSS feed.<br />
* If you have the URL of a video's page, for example http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646, get-flash-video can download the video as an mp4 file.<br />
* robots.txt crawl delay is 1 second.<br />
* Here are some 20,000 blip.tv URLs from [[URLTeam]]: http://paste.archivingyoursh.it/raw/gidaqotimo.sm<br />
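<br />
Since the pagination links only work with JavaScript, one way to get at the listing URLs without a browser is to read the <code>data-results_page</code> attribute directly. The following is a minimal sketch using only the Python standard library; the class and function names are invented for illustration, and it assumes the attribute really does hold a path relative to http://blip.tv/, which the example anchors above suggest but do not prove.<br />

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResultsPageParser(HTMLParser):
    """Collect the data-results_page value of every <a> tag seen."""

    def __init__(self):
        super().__init__()
        self.pages = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "data-results_page" and value:
                self.pages.append(value)


def extract_results_pages(html_text):
    """Return the data-results_page paths in document order."""
    parser = ResultsPageParser()
    parser.feed(html_text)
    parser.close()
    return parser.pages


def absolute_urls(paths, base="http://blip.tv/"):
    """Resolve the relative paths against the (assumed) site base URL."""
    return [urljoin(base, path) for path in paths]


if __name__ == "__main__":
    # Sample anchor copied from the example in the list above.
    sample = ('<a class="advanceResults" href="#" data-results_page='
              '"/channel/get_directory_listing?channels_id=46&page=2">2</a>')
    for url in absolute_urls(extract_results_pages(sample)):
        print(url)
```

Any real crawl built on this should still respect the 1 second crawl delay from robots.txt noted above.<br />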
<br />
=== sitemap ===<br />
<br />
* The sitemap is http://blip.tv/sitemap/xml/bliptv-sitemap-index.xml which links to more sitemaps.<br />
* Contains 3,397 shows that consist of 1 or more episodes.<br />
* Here is a pretty printed example of one of the sitemap files. http://paste.archivingyoursh.it/vomilosedu.xml<br />
<br />
== URL Discovery ==<br />
<br />
Here are all the video urls for blip.tv from the sitemap files, sorted and de-duplicated. Total count 228,133. https://archive.org/details/2013_10_09_bliptv_urls<br />
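<br />
A list like this can be rebuilt from the sitemap files with a few lines of Python. The sketch below is not the script that produced the archive.org item; it assumes the blip.tv sitemaps follow the standard sitemaps.org schema and namespace, and the function names are hypothetical. It works on XML text that has already been downloaded, so fetching (with the 1 second crawl delay) is left out.<br />

```python
import xml.etree.ElementTree as ET

# Assumption: the sitemap files use the standard sitemaps.org namespace.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locs(xml_text):
    """Return every <loc> value from a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text]


def unique_sorted(urls):
    """Sort the URLs and drop duplicates."""
    return sorted(set(urls))


if __name__ == "__main__":
    # Tiny made-up sitemap; the URLs are placeholders, not real video pages.
    example = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://blip.tv/example-show/episode-b</loc></url>"
        "<url><loc>http://blip.tv/example-show/episode-a</loc></url>"
        "<url><loc>http://blip.tv/example-show/episode-b</loc></url>"
        "</urlset>"
    )
    for url in unique_sorted(sitemap_locs(example)):
        print(url)
```

The same <code>sitemap_locs</code> call works on the sitemap index, where each <code>loc</code> points at another sitemap file rather than at a video page.<br />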
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
* [http://support.blip.tv/entries/23277196-An-Important-Update-from-Blip-Regarding-Account-Removals An Important Update from Blip Regarding Account Removals]<br />
<br />
== References ==<br />
<br />
<references/><br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17718Blip.tv2013-10-09T21:43:53Z<p>DukeNukem: /* sitemap */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| logo = Blip_web_logo.png<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = blooper.tv<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Blip.tv 2.0 ==<br />
<br />
[[File:Blip.jpg]]<br />
<br />
<blockquote><br />
<p>Blip.tv Acquired By Video-Blog Killers Maker Studios</p><br />
<p>''Posted on Oct 8 2013 - 3:12pm by Zennie Abraham''</p><br />
<p>Blip.tv, the video sharing site that was created by a team lead by Mike Hudack and Dina Kaplan, is dead. It’s now called “the old Blip.tv” and has been replaced by something owned by that horrible video-channel eating company network Maker Studios. (Mike and Dina left Blip in 2012.)</p><br />
<br />
<p>And if you’re asking “Is that the same Maker Studios that took YouTube Partner Ray William Johnson’s Google AdSense account and never gave it back to him? The same Maker Studios that pushed Pew Die Pie at us on YouTube? The same Maker Studios that was founded by Danny Zappin, Lisa Donovan (LisaNova on YouTube), Scott Katz, Derek Jones and Will Watkin? The same Maker Studios that’s involved in a nasty lawsuit between Mr. Zappin and the others, including his ex-girlfriend Lisa Donovan?</p><br />
<br />
<p>The answer is yes.<ref>http://www.zennie62blog.com/2013/10/08/blip-tv-acquired-by-video-blog-killers-maker-studios-94233/</ref></p><br />
</blockquote><br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination, but the page links are only '#' and need JavaScript to work.<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page, which requires JavaScript.<br />
* Each page of results in a category shows only 8 shows at a time.<br />
* Each show has some number of episodes.<br />
* To view an episode list for a show you must have JavaScript enabled. This is also true for pagination on these pages.<br />
* Some shows have rss feeds. Example http://blip.tv/ylse/rss<br />
* RSS feeds only show a partial list of episodes. http://blip.tv/schlomo/ has more episodes than listed in the RSS feed.<br />
* If you have the URL of a video's page, for example http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646, get-flash-video can download the video as an mp4 file.<br />
* robots.txt crawl delay is 1 second.<br />
* Here are some 20,000 blip.tv URLs from [[URLTeam]]: http://paste.archivingyoursh.it/raw/gidaqotimo.sm<br />
<br />
<br />
=== sitemap ===<br />
<br />
* The sitemap is http://blip.tv/sitemap/xml/bliptv-sitemap-index.xml which links to more sitemaps.<br />
* Contains 3,397 shows that consist of 1 or more episodes.<br />
* Here is a pretty printed example of one of the sitemap files. http://paste.archivingyoursh.it/vomilosedu.xml<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
* [http://support.blip.tv/entries/23277196-An-Important-Update-from-Blip-Regarding-Account-Removals An Important Update from Blip Regarding Account Removals]<br />
<br />
== References ==<br />
<br />
<references/><br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17715ArchiveTeam Warrior2013-10-09T15:04:15Z<p>DukeNukem: /* How the warrior works */</p>
<hr />
<div>[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The ArchiveTeam Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
<br />
<br />
----<br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for a few reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun is the primary goal of all Archive Team activities, it can put serious stress on the target's servers. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox as long as you are running a relatively recent version. However, the option is not available in the GUI.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
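<br />
The two command forms above differ only in how the bandwidth group is named and typed. As a rough sketch (Python, not part of the warrior itself), the helper below builds either form as an argument list so it can be inspected before being run. The function name <code>bandwidth_command</code> and the <code>new_syntax</code> switch are made up for illustration, and the flags are copied from the commands on this page rather than checked against every VirtualBox release.<br />

```python
import shlex


def bandwidth_command(vm_name, group, limit, new_syntax=True):
    """Build the VBoxManage argument list for a network bandwidth limit.

    Both forms are copied from the commands quoted on this page; which one
    your VirtualBox version accepts is something to verify locally.
    """
    base = ["VBoxManage", "bandwidthctl", vm_name]
    if new_syntax:
        return base + ["add", group, "--type", "network", "--limit", str(limit)]
    return base + ["--name", group, "--add", "network", "--limit", str(limit)]


if __name__ == "__main__":
    # Only prints the commands; pass a list to subprocess.run() to execute one.
    print(shlex.join(bandwidth_command("archiveteam-warrior-2", "netlimit", 3)))
    print(shlex.join(bandwidth_command("archiveteam-warrior-2", "Limit", 3, new_syntax=False)))
```

Printing the command first makes it easy to try the newer form and fall back to the older one if VBoxManage rejects it.<br />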
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can tell the admins on IRC what happened, and they can clear the claims your username may have made, but this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may download anything from a single small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Any unused virtual machine disk space is not used on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration to set up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not happen, you may need to double check your machine's network settings.<br />
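<br />
Before digging into network settings, it can help to check whether anything is answering on the forwarded port at all. The snippet below is a minimal sketch using only the Python standard library; the function name <code>port_is_open</code> is made up for illustration, and a successful connection only shows that something is listening on the port, not that it is the warrior.<br />

```python
import socket


def port_is_open(host="localhost", port=8001, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:8001.")
    else:
        print("Nothing answered on localhost:8001; check the VM's port forwarding.")
```

If the check fails while the virtual machine is running, the port forwarding rule in the VM's network settings is the first thing to look at.<br />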
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || Aug 8, 2012 || Success || <br />
[http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.<br />
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (released 2009-12-03)<br />
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1, cpan 1.9402 (still needs config)<br />
* gcc 4.4.5, make 3.81, bash 4.1.5<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in /home/warrior/projects/<PROJECTNAME>/<br />
<br />
1. Start the virtual machine<br />
2. Linux boots<br />
3. The user warrior is automatically logged in.<br />
4. /etc/inittab kicks off /home/warrior/warrior-code2/boot.sh. This will git pull https://github.com/ArchiveTeam/warrior-code2 <br />
into /home/warrior/warrior-code2/. /home/warrior/warrior-code2/warrior-runner.sh sets up a process which monitors /dev/shm/ready-for-warrior<br />
and launches run-warrior when the state changes.<br />
5. boot.sh launches /home/warrior/warrior-code/boot-part-2.sh<br />
6. boot-part-2.sh is a short script that does the following:<br />
./warrior-install.sh<br />
* install/update seesaw, check branch, version<br />
* install framebuffer support, DNS caching<br />
* sets up /data<br />
sudo ./make-data-disk.sh<br />
* cleans up<br />
* creates and prepares the partition<br />
mkdir -p /home/warrior/projects<br />
touch /dev/shm/ready-for-warrior<br />
* triggers the launch of /usr/local/bin/run-warrior which launches /home/warrior/warrior-code2/src/seesaw/run-warrior<br />
./say-hello.sh<br />
* setup vmware port forwarding<br />
* show splash screen<br />
7. Point your web browser to http://localhost:8001 and go.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17714ArchiveTeam Warrior2013-10-09T15:01:17Z<p>DukeNukem: /* How the warrior works */</p>
<hr />
<div>[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The ArchiveTeam Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
<br />
<br />
----<br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for a few reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun is the primary goal of all Archive Team activities, it can put serious stress on the target's servers. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox as long as you are running a relatively recent version. However, the option is not available in the GUI.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can tell the admins on IRC what happened, and they can clear the claims your username may have made, but this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may download anything from a single small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Any unused virtual machine disk space is not used on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration to set up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not happen, you may need to double check your machine's network settings.<br />
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || Aug 8, 2012 || Success || <br />
[http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.<br />
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (released 2009-12-03)<br />
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1, cpan 1.9402 (still needs config)<br />
* gcc 4.4.5, make 3.81, bash 4.1.5<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in /home/warrior/projects/<PROJECTNAME>/<br />
<br />
1. Start the virtual machine<br />
2. Linux boots<br />
3. The user warrior is automatically logged in.<br />
4. /etc/inittab kicks off /home/warrior/warrior-code2/boot.sh. This will git pull https://github.com/ArchiveTeam/warrior-code2 <br />
into /home/warrior/warrior-code2/. /home/warrior/warrior-code2/warrior-runner.sh sets up a process which monitors /dev/shm/ready-for-warrior<br />
and launches run-warrior when the state changes.<br />
5. boot.sh launches /home/warrior/warrior-code/boot-part-2.sh<br />
6. boot-part-2.sh is a short script that does the following:<br />
./warrior-install.sh<br />
* install/update seesaw, check branch, version<br />
* install framebuffer support, DNS caching<br />
* sets up /data<br />
sudo ./make-data-disk.sh<br />
* cleans up<br />
* creates and prepares the partition<br />
mkdir -p /home/warrior/projects<br />
touch /dev/shm/ready-for-warrior<br />
* triggers the launch of /usr/local/bin/run-warrior which provides the web interface.<br />
./say-hello.sh<br />
* setup vmware port forwarding<br />
* show splash screen<br />
7. Point your web browser to http://localhost:8001 and go.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17713Blip.tv2013-10-09T14:46:14Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| logo = Blip_web_logo.png<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = blooper.tv<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Blip.tv 2.0 ==<br />
<br />
[[File:Blip.jpg]]<br />
<br />
<blockquote><br />
<p>Blip.tv Acquired By Video-Blog Killers Maker Studios</p><br />
<p>''Posted on Oct 8 2013 - 3:12pm by Zennie Abraham</p><br />
<p>Blip.tv, the video sharing site that was created by a team lead by Mike Hudack and Dina Kaplan, is dead. It’s now called “the old Blip.tv” and has been replaced by something owned by that horrible video-channel eating company network Maker Studios. (Mike and Dina left Blip in 2012.)</p><br />
<br />
<p>And if you’re asking “Is that the same Maker Studios that took YouTube Partner Ray William Johnson’s Google AdSense account and never gave it back to him? The same Maker Studios that pushed Pew Die Pie at us on YouTube? The same Maker Studios that was founded by Danny Zappin, Lisa Donovan (LisaNova on YouTube), Scott Katz, Derek Jones and Will Watkin? The same Maker Studios that’s involved in a nasty lawsuit between Mr. Zappin and the others, including his ex-girlfriend Lisa Donovan?</p><br />
<br />
<p>The answer is yes.<ref>http://www.zennie62blog.com/2013/10/08/blip-tv-acquired-by-video-blog-killers-maker-studios-94233/</ref></p><br />
</blockquote><br />
<br />
== Site Structure ==<br />
* Category pages such as http://blip.tv/comedy-videos are paginated, but the page links point only to '#' and need JavaScript to work.<br />
* Each anchor tag does have a data-results_page attribute that appears to carry the URL information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 pagination links are shown at a time. To find out how many pages a category has, you must click the "double arrow right" at the bottom of the page, which requires JavaScript.<br />
* Each page of results in a category lists only 8 shows.<br />
* Each show has a variable number of episodes.<br />
* To view the episode list for a show you must have JavaScript enabled. This is also true for pagination on these pages.<br />
* Some shows have RSS feeds. Example: http://blip.tv/ylse/rss<br />
* RSS feeds only show a partial list of episodes. http://blip.tv/schlomo/ has more episodes than are listed in its RSS feed.<br />
* Given the URL of a video's page, for example http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646, get-flash-video can download the video as an MP4 file.<br />
* robots.txt crawl delay is 1 second.<br />
* Here are some 20,000 blip.tv URLs from [[URLTeam]]: http://paste.archivingyoursh.it/raw/gidaqotimo.sm<br />
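<br />
The data-results_page values can be read without JavaScript. The following is a minimal sketch, assuming the markup matches the example anchors above; the helper names <code>ResultsPageParser</code> and <code>extract_listing_urls</code> are hypothetical, and whether the resulting URLs can be fetched directly has not been verified here.<br />

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Sample markup copied from the pagination example in the notes above.
SAMPLE = """
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a>
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a>
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a>
"""


class ResultsPageParser(HTMLParser):
    """Collects the data-results_page attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paths = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "data-results_page" and value:
                self.paths.append(value)


def extract_listing_urls(html, base="http://blip.tv/"):
    """Return absolute listing URLs found in the pagination anchors."""
    parser = ResultsPageParser()
    parser.feed(html)
    return [urljoin(base, path) for path in parser.paths]


if __name__ == "__main__":
    for url in extract_listing_urls(SAMPLE):
        print(url)
```

Because only three pagination links are visible at a time, a crawler built on this would probably have to keep incrementing the page parameter until a listing comes back empty; that stopping rule is an assumption, not something confirmed against the site.<br />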
<br />
<br />
=== sitemap ===<br />
<br />
* The sitemap is http://blip.tv/sitemap/xml/bliptv-sitemap-index.xml which links to more sitemaps.<br />
* The sitemaps list 3,397 shows, each consisting of one or more episodes.<br />
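<br />
A sitemap index can be walked with the standard library. This is a rough sketch, assuming the index follows the usual sitemaps.org schema; the sample document and the child file names in it are made up for illustration, and <code>child_sitemaps</code> is a hypothetical helper.<br />

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample index; the child sitemap names are invented.
SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://blip.tv/sitemap/xml/example-1.xml</loc></sitemap>
  <sitemap><loc>http://blip.tv/sitemap/xml/example-2.xml</loc></sitemap>
</sitemapindex>
"""


def child_sitemaps(index_xml):
    """Return the <loc> URL of every child sitemap listed in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    for url in child_sitemaps(SAMPLE_INDEX):
        print(url)
```

Each child sitemap could then be parsed the same way, looking for <code>sm:url/sm:loc</code> entries instead.<br />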
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
* [http://support.blip.tv/entries/23277196-An-Important-Update-from-Blip-Regarding-Account-Removals An Important Update from Blip Regarding Account Removals]<br />
<br />
== References ==<br />
<br />
<references/><br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17712Blip.tv2013-10-09T14:43:26Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| logo = Blip_web_logo.png<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = blooper.tv<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Blip.tv 2.0 ==<br />
<br />
[[File:Blip.jpg]]<br />
<br />
<blockquote><br />
<p>Blip.tv Acquired By Video-Blog Killers Maker Studios</p><br />
<p>''Posted on Oct 8 2013 - 3:12pm by Zennie Abraham</p><br />
<p>Blip.tv, the video sharing site that was created by a team lead by Mike Hudack and Dina Kaplan, is dead. It’s now called “the old Blip.tv” and has been replaced by something owned by that horrible video-channel eating company network Maker Studios. (Mike and Dina left Blip in 2012.)</p><br />
<br />
<p>And if you’re asking “Is that the same Maker Studios that took YouTube Partner Ray William Johnson’s Google AdSense account and never gave it back to him? The same Maker Studios that pushed Pew Die Pie at us on YouTube? The same Maker Studios that was founded by Danny Zappin, Lisa Donovan (LisaNova on YouTube), Scott Katz, Derek Jones and Will Watkin? The same Maker Studios that’s involved in a nasty lawsuit between Mr. Zappin and the others, including his ex-girlfriend Lisa Donovan?</p><br />
<br />
<p>The answer is yes.<ref>http://www.zennie62blog.com/2013/10/08/blip-tv-acquired-by-video-blog-killers-maker-studios-94233/</ref></p><br />
</blockquote><br />
<br />
== Site Structure ==<br />
* Category pages such as http://blip.tv/comedy-videos are paginated, but the page links point only to '#' and need JavaScript to work.<br />
* Each anchor tag does have a data-results_page attribute that appears to carry the URL information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 pagination links are shown at a time. To find out how many pages a category has, you must click the "double arrow right" at the bottom of the page, which requires JavaScript.<br />
* Each page of results in a category lists only 8 shows.<br />
* Each show has a variable number of episodes.<br />
* To view the episode list for a show you must have JavaScript enabled. This is also true for pagination on these pages.<br />
* Some shows have RSS feeds. Example: http://blip.tv/ylse/rss<br />
* RSS feeds only show a partial list of episodes. http://blip.tv/schlomo/ has more episodes than are listed in its RSS feed.<br />
* Given the URL of a video's page, for example http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646, get-flash-video can download the video as an MP4 file.<br />
* There is no /sitemap.xml<br />
* robots.txt crawl delay is 1 second.<br />
* Here are some 20,000 blip.tv URLs from [[URLTeam]]: http://paste.archivingyoursh.it/raw/gidaqotimo.sm<br />
* Contains 3,397 shows that consist of 1 or more episodes.<br />
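<br />
The crawl delay noted above can be honoured programmatically. The following is a sketch, assuming a robots.txt that declares the delay with a Crawl-delay line; the sample body is hypothetical (only the 1-second value comes from the notes), and <code>crawl_delay_seconds</code> and <code>polite_fetch_all</code> are illustrative names.<br />

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; only the 1-second delay is taken from the notes.
SAMPLE_ROBOTS = """User-agent: *
Crawl-delay: 1
"""


def crawl_delay_seconds(robots_txt, user_agent="*", default=1.0):
    """Return the Crawl-delay for user_agent, or default if none is declared."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


def polite_fetch_all(urls, fetch, delay):
    """Call fetch(url) for each URL, sleeping delay seconds between requests."""
    results = []
    for index, url in enumerate(urls):
        if index:
            time.sleep(delay)
        results.append(fetch(url))
    return results


if __name__ == "__main__":
    print(crawl_delay_seconds(SAMPLE_ROBOTS))
```

Here <code>fetch</code> is any callable that takes a URL, so the pacing logic can be tried out without touching the network.<br />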
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
* [http://support.blip.tv/entries/23277196-An-Important-Update-from-Blip-Regarding-Account-Removals An Important Update from Blip Regarding Account Removals]<br />
<br />
== References ==<br />
<br />
<references/><br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17711ArchiveTeam Warrior2013-10-09T14:14:20Z<p>DukeNukem: /* How the warrior works */</p>
<hr />
<div>[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The ArchiveTeam Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
<br />
<br />
----<br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun are the primary goals of all Archive Team activities, serious stress on the target's server may occur. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox, as long as you are running a relatively recent version. The option is not available in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
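<br />
If you manage several warriors, the newer command form can be generated rather than typed. This is only a sketch built around the command shown above: the function name <code>bandwidth_limit_command</code> is hypothetical, the group name <code>netlimit</code> and the VM name are simply the values from that example, and nothing here has been checked against a particular VirtualBox release.<br />

```python
import shlex


def bandwidth_limit_command(vm_name="archiveteam-warrior-2", group="netlimit", limit=3):
    """Build the argument list for the newer 'bandwidthctl ... add' syntax shown above."""
    if limit <= 0:
        raise ValueError("limit must be a positive number")
    return [
        "VBoxManage", "bandwidthctl", vm_name,
        "add", group,
        "--type", "network",
        "--limit", str(limit),
    ]


if __name__ == "__main__":
    # Print the command for review; pass the list to subprocess.run() to execute it.
    print(shlex.join(bandwidth_limit_command()))
```

Returning a list rather than a string keeps VM names that contain spaces safe when the command is handed to <code>subprocess.run</code>.<br />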
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can alert the admins via IRC about what happened, and they can clear the claims your username may have made, but this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will let work in progress continue when you resume the instance.<br />
<br />
=== I told the warrior to shutdown from the interface but nothing has changed! what gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may be downloading anything from a small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Disk space that the virtual machine has not used is not taken up on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration that sets up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not happen, you may need to double-check your machine's network settings.<br />
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || Aug 8, 2012 || Success || <br />
[http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.<br />
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (2.6.32 series, first released 2009-12-03)<br />
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1, cpan 1.9402 (still needs config)<br />
* gcc 4.4.5, make 3.81, bash 4.1.5<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in /home/warrior/projects/<PROJECTNAME>/<br />
<br />
1. Start the virtual machine<br />
2. Linux boots<br />
3. The user warrior is automatically logged in.<br />
4. /etc/inittab kicks off /home/warrior/warrior-code2/boot.sh, which runs git pull from https://github.com/ArchiveTeam/warrior-code2 <br />
into /home/warrior/warrior-code2/. /home/warrior/warrior-code2/warrior-runner.sh sets up a process that monitors /dev/shm/ready-for-warrior<br />
and launches run-warrior when the state changes.<br />
5. boot.sh launches /home/warrior/warrior-code/boot-part-2.sh<br />
6. boot-part-2.sh is a short script that does the following:<br />
./warrior-install.sh<br />
* install/update seesaw, check branch, version<br />
* install framebuffer support, DNS caching<br />
* sets up /data<br />
sudo ./make-data-disk.sh<br />
* cleans up<br />
* creates and prepares the partition<br />
mkdir -p /home/warrior/projects<br />
touch /dev/shm/ready-for-warrior<br />
* triggers the launch of run-warrior which provides the web interface.<br />
./say-hello.sh<br />
* set up VMware port forwarding<br />
* show splash screen<br />
7. Point your web browser to http://localhost:8001 and go.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17710ArchiveTeam Warrior2013-10-09T14:07:21Z<p>DukeNukem: /* How the warrior works */</p>
<hr />
<div>[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The ArchiveTeam Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
<br />
<br />
----<br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun are the primary goals of all Archive Team activities, serious stress on the target's server may occur. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox, as long as you are running a relatively recent version. The option is not available in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can alert the admins via IRC about what happened, and they can clear the claims your username may have made, but this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will let work in progress continue when you resume the instance.<br />
<br />
=== I told the warrior to shutdown from the interface but nothing has changed! what gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may be downloading anything from a small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Disk space that the virtual machine has not used is not taken up on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration that sets up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not happen, you may need to double-check your machine's network settings.<br />
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || Aug 8, 2012 || Success || <br />
[http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.<br />
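<br />
The manual steps above can be wrapped in a small helper. This is only a sketch: <code>switch_branch</code> and <code>valid_branch</code> are hypothetical names, not part of the warrior code, and the script assumes it is run as root inside the warrior VM with the checkout at the path given above.<br />

```shell
#!/bin/sh
# Sketch: switch the warrior code checkout between branches.
# REPO_DIR defaults to the path used in the manual steps; the function
# names are illustrative and not part of the warrior code.
REPO_DIR="${REPO_DIR:-/home/warrior/warrior-code}"

# Accept only the two branch names this page mentions.
valid_branch() {
    case "$1" in
        production|master) return 0 ;;
        *) return 1 ;;
    esac
}

switch_branch() {
    branch="$1"
    if ! valid_branch "$branch"; then
        echo "unknown branch: $branch" >&2
        return 1
    fi
    cd "$REPO_DIR" || return 1
    # Run git as the warrior user, as in the manual steps.
    sudo -u warrior git checkout "$branch" || return 1
    echo "Switched to $branch; run 'reboot' to apply."
}

# Example (as root inside the warrior VM):
#   switch_branch master
#   switch_branch production
```

Rejecting unknown branch names before calling git keeps a typo from leaving the checkout in an unexpected state.<br />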
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (upstream 2.6.32 released 2009-12-03)
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1<br />
* cpan v1.9402 (still needs config)<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in /home/warrior/projects/<PROJECTNAME>/<br />
<br />
1. Start the virtual machine<br />
2. Linux boots<br />
3. The user warrior is automatically logged in.<br />
4. /etc/inittab kicks off /home/warrior/warrior-code2/boot.sh. This will git pull https://github.com/ArchiveTeam/warrior-code2 into /home/warrior/warrior-code2/. /home/warrior/warrior-code2/warrior-runner.sh sets up a process which monitors /dev/shm/ready-for-warrior and launches run-warrior when the state changes.
5. boot.sh launches /home/warrior/warrior-code/boot-part-2.sh<br />
6. boot-part-2.sh is a short script that does the following:<br />
./warrior-install.sh<br />
* install/update seesaw, check branch, version<br />
* install framebuffer support, DNS caching<br />
* sets up /data<br />
sudo ./make-data-disk.sh<br />
* cleans up<br />
* creates and prepares the partition<br />
mkdir -p /home/warrior/projects<br />
touch /dev/shm/ready-for-warrior<br />
* triggers the launch of run-warrior which provides the web interface.<br />
./say-hello.sh<br />
* setup vmware port forwarding<br />
* show splash screen<br />
7. Point your web browser to http://localhost:8001 and go.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17709ArchiveTeam Warrior2013-10-09T14:00:33Z<p>DukeNukem: /* How the warrior works */</p>
<hr />
<div>[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The ArchiveTeam Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
<br />
<br />
----<br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Although downloading the internet for digital preservation and fun is the primary goal of all Archive Team activities, doing so can put serious stress on the target's servers. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox, as long as you are running a relatively recent version. The option is not available in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
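<br />
A minimal sketch of the newer syntax as a dry run: it only prints the commands so you can review them before running them yourself. The VM name and the group name <code>netlimit</code> come from the command above; <code>limit_commands</code> is a hypothetical helper name. The second printed command is an assumption based on the VirtualBox manual, which describes attaching a bandwidth group to a network adapter with <code>--nicbandwidthgroup1</code>; <code>modifyvm</code> expects the VM to be powered off.<br />

```shell
#!/bin/sh
# Sketch: print the VBoxManage commands for limiting the warrior's bandwidth.
# Nothing is executed; review the output, then run the commands by hand.
VM="${VM:-archiveteam-warrior-2}"   # default warrior VM name from this page
GROUP="${GROUP:-netlimit}"          # bandwidth group name from this page

limit_commands() {
    limit="$1"   # numeric limit, e.g. 3
    # Create the bandwidth group (newer syntax shown above).
    echo "VBoxManage bandwidthctl $VM add $GROUP --type network --limit $limit"
    # Assumption from the VirtualBox manual: attach the group to adapter 1.
    echo "VBoxManage modifyvm $VM --nicbandwidthgroup1 $GROUP"
}

limit_commands 3
```

If your VirtualBox version rejects either command, check <code>VBoxManage bandwidthctl</code> without arguments for the syntax it supports.<br />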
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can tell the admins on IRC what happened, and they can clear the claims your username may have made, but this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
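<br />
From the command line, pausing and suspending can be sketched as below. This is a dry run that only prints commands; <code>vm_command</code> is a hypothetical helper name and the VM name is the default one assumed elsewhere on this page. As far as we know, <code>pause</code> keeps the VM in memory, so it does not survive a host reboot, while <code>savestate</code> writes the VM state to disk.<br />

```shell
#!/bin/sh
# Sketch: map an action to the VBoxManage command that performs it.
# Nothing is executed; the command is only printed.
VM="${VM:-archiveteam-warrior-2}"   # assumed default warrior VM name

vm_command() {
    case "$1" in
        pause)   echo "VBoxManage controlvm $VM pause" ;;      # freeze in memory
        resume)  echo "VBoxManage controlvm $VM resume" ;;     # continue after pause
        save)    echo "VBoxManage controlvm $VM savestate" ;;  # suspend to disk
        restore) echo "VBoxManage startvm $VM" ;;              # start from saved state
        *)
            echo "usage: vm_command pause|resume|save|restore" >&2
            return 1
            ;;
    esac
}

# Before rebooting the host:
vm_command save
```

Use <code>save</code> before rebooting the PC and <code>restore</code> afterwards; <code>pause</code> and <code>resume</code> are enough for a short network interruption.<br />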
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may be downloading anything from a small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Any unused virtual machine disk space is not used on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like regular files: space freed inside the virtual machine is not automatically released on the host. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
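<br />
The zerofree route can be sketched as follows. This is a dry run that only prints the commands. Every device and file name here is a placeholder, so check your own VM's storage settings first; <code>reclaim_commands</code> is a hypothetical helper name. It assumes the data partition uses an ext filesystem, which is what zerofree supports.<br />

```shell
#!/bin/sh
# Sketch: print the commands for the zerofree-and-clone approach.
# All names below are placeholders, not values taken from the warrior image.
GUEST_PARTITION="${GUEST_PARTITION:-/dev/sdb1}"      # assumed data partition in the VM
OLD_IMAGE="${OLD_IMAGE:-warrior-data.vdi}"           # hypothetical current disk image
NEW_IMAGE="${NEW_IMAGE:-warrior-data-compact.vdi}"   # hypothetical clone target

reclaim_commands() {
    echo "# inside the guest, with the partition unmounted or mounted read-only:"
    echo "zerofree $GUEST_PARTITION"
    echo "# on the host, with the VM powered off:"
    echo "VBoxManage clonehd $OLD_IMAGE $NEW_IMAGE"
}

reclaim_commands
```

After cloning, attach the new image in place of the old one in the VM's storage settings and delete the old image once the warrior boots correctly.<br />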
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration that sets up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not work, you may need to double-check your machine's network settings.<br />
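<br />
If the forwarding rule is missing, it can be inspected and re-created with VBoxManage. The sketch below is a dry run that only prints commands. It assumes the VM uses NAT on adapter 1 and that the web interface listens on port 8001 inside the guest as well; the rule name <code>warrior-web</code> and the helper names are hypothetical.<br />

```shell
#!/bin/sh
# Sketch: print commands to inspect and add a NAT port-forwarding rule.
# Nothing is executed; review the output before running it.
VM="${VM:-archiveteam-warrior-2}"        # assumed default warrior VM name
RULE_NAME="${RULE_NAME:-warrior-web}"    # hypothetical rule name

# Build a rule string: name,protocol,host IP,host port,guest IP,guest port.
# Empty IP fields leave the host and guest addresses unspecified.
natpf_rule() {
    host_port="$1"
    guest_port="$2"
    echo "$RULE_NAME,tcp,,$host_port,,$guest_port"
}

forward_commands() {
    # Show the current configuration, including existing forwarding rules.
    echo "VBoxManage showvminfo $VM"
    # Add the rule to adapter 1 (VM must be powered off for modifyvm).
    echo "VBoxManage modifyvm $VM --natpf1 \"$(natpf_rule 8001 8001)\""
}

forward_commands
```

Check the <code>showvminfo</code> output first: if a rule for port 8001 already exists, the problem is more likely on the host side, for example another program already using the port.<br />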
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || August 8, 2012 || Success || [http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.<br />
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (upstream 2.6.32 released 2009-12-03)
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1<br />
* cpan v1.9402 (still needs config)<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in /home/warrior/projects/<PROJECTNAME>/<br />
<br />
1. Start the virtual machine<br />
2. Linux boots<br />
3. The user warrior is automatically logged in.<br />
4. /etc/inittab kicks off /home/warrior/warrior-code2/boot.sh. This will git pull https://github.com/ArchiveTeam/warrior-code2 into /home/warrior/warrior-code2/
5. boot.sh launches /home/warrior/warrior-code/boot-part-2.sh<br />
6. boot-part-2.sh is a short script that does the following:<br />
./warrior-install.sh<br />
* install/update seesaw, check branch, version<br />
* install framebuffer support, DNS caching<br />
* sets up /data<br />
sudo ./make-data-disk.sh<br />
* cleans up<br />
* creates and prepares the partition<br />
mkdir -p /home/warrior/projects<br />
touch /dev/shm/ready-for-warrior<br />
./say-hello.sh<br />
* setup vmware port forwarding<br />
* show splash screen<br />
7. Point your web browser to http://localhost:8001 and go.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17708ArchiveTeam Warrior2013-10-09T13:58:24Z<p>DukeNukem: /* How the warrior works */</p>
<hr />
<div>[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The ArchiveTeam Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
<br />
<br />
----<br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work will become available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Although downloading the internet for digital preservation and fun is the primary goal of all Archive Team activities, doing so can put serious stress on the target's servers. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox, as long as you are running a relatively recent version. The option is not available in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instance, the work it did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can tell the admins on IRC what happened, and they can clear the claims your username may have made, but this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may be downloading anything from a small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Any unused virtual machine disk space is not used on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like regular files: space freed inside the virtual machine is not automatically released on the host. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior appliance and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The appliance includes a configuration that sets up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not work, you may need to double-check your machine's network settings.<br />
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || August 8, 2012 || Success || [http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.<br />
<br />
== How the warrior works ==<br />
The warrior image is built off Debian 6.0.5 (squeeze). Here are the basics:<br />
<br />
* kernel 2.6.32-5-686 (upstream 2.6.32 released 2009-12-03)<br />
* Python 2.6.6, pip 1.1<br />
* Perl v5.10.1<br />
* cpan v1.9402 (still needs config)<br />
* nano 2.2.4 with color syntax highlighting<br />
* curl 7.21.0<br />
<br />
The code for each project is stored in /home/warrior/projects/<PROJECTNAME>/<br />
<br />
1. Start the virtual machine<br />
2. Linux boots<br />
3. The user warrior is automatically logged in.<br />
4. /etc/inittab kicks off /home/warrior/warrior-code2/boot.sh. This will git pull https://github.com/ArchiveTeam/warrior-code2 into /home/warrior/warrior-code2/<br />
5. boot.sh launches /home/warrior/warrior-code/boot-part-2.sh<br />
6. boot-part-2.sh is a short script that does the following:<br />
./warrior-install.sh<br />
* install/update seesaw, check branch, version<br />
* install framebuffer support, DNS caching<br />
* sets up /data<br />
sudo ./make-data-disk.sh<br />
* cleans up<br />
* creates and prepares the partition<br />
mkdir -p /home/warrior/projects<br />
touch /dev/shm/ready-for-warrior<br />
./say-hello.sh<br />
* setup vmware port forwarding<br />
* show splash screen<br />
7. Point your web browser to http://localhost:8001 and go.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17707ArchiveTeam Warrior2013-10-09T11:49:16Z<p>DukeNukem: </p>
<hr />
<div>[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The ArchiveTeam Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
<br />
<br />
----<br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work becomes available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun are the primary goals of all Archive Team activities, serious stress on the target's server may occur. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox as long as you are running a relatively recent version. The option is not offered in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
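If you script the limit, it can help to keep both syntaxes in one place. The following is only a rough sketch in Python: the helper name <code>bandwidth_command</code> and its parameters are hypothetical, and it merely assembles the two command lines quoted above without running them. Which form your VirtualBox version accepts is something to check against its own VBoxManage help output.

```python
def bandwidth_command(vm_name="archiveteam-warrior-2", limit=3, newer_syntax=True):
    """Return the VBoxManage argument list for limiting the warrior's bandwidth.

    Hypothetical helper: it only reproduces the two command forms quoted on
    this page and does not execute anything.
    """
    if newer_syntax:
        # Form reported for newer VirtualBox versions; "netlimit" is the group name.
        return ["VBoxManage", "bandwidthctl", vm_name,
                "add", "netlimit", "--type", "network", "--limit", str(limit)]
    # Older form, with the bandwidth group named "Limit".
    return ["VBoxManage", "bandwidthctl", vm_name,
            "--name", "Limit", "--add", "network", "--limit", str(limit)]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(bandwidth_command(limit=3)))
```

Once the printed command looks right for your setup, the same list can be handed to <code>subprocess.run</code> on the host.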
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instances, the work your warrior did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can alert the admins via IRC of what's happened, and they can clear the claims your username may have made, but this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may download anything from a small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Any unused virtual machine disk space is not used on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior application and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The application includes a configuration to set up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not happen, you may need to double check your machine's network settings.<br />
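A quick way to tell whether anything is answering on the forwarded port is to try opening a TCP connection from the host. This is a minimal sketch in Python, not part of the warrior itself; the function name <code>web_interface_reachable</code> is made up for illustration, and the host and port are simply the ones mentioned above.

```python
import socket


def web_interface_reachable(host="localhost", port=8001, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if web_interface_reachable():
        print("Port 8001 is answering; try http://localhost:8001/ in your browser.")
    else:
        print("Nothing is answering on port 8001; check the VM's port forwarding.")
```

A successful connection only shows that something is listening on the host port. If the check fails while the VM is running, the port forwarding rule in the VM's network settings is the first thing to look at.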
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || August 8, 2012 || Success || [http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.<br />
<br />
== How the warrior works ==</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=ArchiveTeam_Warrior&diff=17706ArchiveTeam Warrior2013-10-09T10:48:07Z<p>DukeNukem: /* Projects */</p>
<hr />
<div>[[Image:Archive_team.png|100px|left]]<br />
[[Image:Warrior-vm-screenshot.png|right]]<br />
[[Image:Warrior-web-screenshot.png|right]]<br />
<br />
The ArchiveTeam Warrior is a virtual archiving appliance. You can run it to help with the ArchiveTeam archiving efforts. It will download sites and upload them to our archive — and it’s really easy to do!<br />
<br />
The warrior is a virtual machine, so there is no risk to your computer. The warrior will only use your bandwidth and some of your disk space. It will get tasks from and report progress to the [[Tracker]].<br />
<br />
The warrior runs on Windows, OS X and Linux. You’ll need [https://www.virtualbox.org/ VirtualBox] (recommended), VMware or a similar program to run the virtual machine.<br />
<br />
Instructions for VirtualBox:<br />
<ol><br />
<li>Download the [http://archive.org/download/archiveteam-warrior/archiveteam-warrior-v2-20121008.ova appliance] (174MB).</li><br />
<li>In VirtualBox, click File > Import Appliance and open the file.</li><br />
<li>Start the virtual machine. It will fetch the latest updates and will eventually tell you to start your web browser.</li><br />
</ol><br />
<br />
Once you’ve started your warrior:<br />
<ol><br />
<li>Go to http://localhost:8001/ and check the Settings page.</li><br />
<li>Choose a username — we’ll show your progress on the leaderboard.</li><br />
<li>Go to the All projects tab and pick a project to work on. Even better: select ArchiveTeam’s Choice to let your warrior work on the most urgent project.</li><br />
</ol><br />
<br />
<br />
<br />
<br />
----<br />
<br />
<br />
==Warrior FAQ==<br />
<br />
=== Why am I seeing a message that no item was received? ===<br />
<br />
It means that there is no work available. This can happen for several reasons:<br />
<br />
* The project has just finished and someone is inspecting the work done. If a problem is discovered, items may be re-queued and more work becomes available.<br />
* In rare cases, you have been banned by a tracker administrator because you were requesting too much work or your internet connection is "unclean". We prefer connections from many public IP addresses, use of non-captive DNS servers, and no proxies/firewalls.<br />
<br />
=== Why am I seeing a message about rate limiting? ===<br />
<br />
Keep in mind that although downloading the internet for digital preservation and fun are the primary goals of all Archive Team activities, serious stress on the target's server may occur. The rate limit is imposed by a [[Tracker#People|tracker administrator]] and should not be subverted.<br />
<br />
===Help! The warrior is eating all my bandwidth!===<br />
<br />
You can limit the warrior's bandwidth quite easily in VirtualBox as long as you are running a relatively recent version. The option is not offered in the GUI, however.<br />
<br />
The command <pre>VBoxManage bandwidthctl archiveteam-warrior-2 --name Limit --add network --limit 3</pre> will limit the warrior instance called archiveteam-warrior-2 (currently the default name of the warrior VM) to 3Mb/s. Adjust as needed.<br />
<br />
In the latest version of VirtualBox on Windows, the syntax appears to have changed. The correct command now seems to be:<br />
<br />
<pre>VBoxManage bandwidthctl archiveteam-warrior-2 add netlimit --type network --limit 3</pre><br />
<br />
=== I turned my warrior off, will those tasks be lost? ===<br />
<br />
If you've killed your warrior instances, the work your warrior did has been lost; however, the tasks will be returned to the pool after a period of time. If you want, you can alert the admins via IRC of what's happened, and they can clear the claims your username may have made, but this isn't very important on most projects.<br />
<br />
=== I need to disconnect my internet / reboot my PC but I don't want to lose work ===<br />
<br />
If you pause/suspend the warrior instance, most projects will allow resuming of work in progress when you unsuspend the warrior instance.<br />
<br />
=== I told the warrior to shut down from the interface but nothing has changed! What gives? ===<br />
<br />
The warrior will attempt to finish the currently running tasks before shutting down. If you need to shut down right away, go ahead: your progress will be lost, but the jobs will eventually cycle out to another user.<br />
<br />
=== How much disk space will the warrior use? ===<br />
<br />
Short answer: it depends on the project.<br />
<br />
Long answer: because each project defines an item differently, the warrior may download anything from a small file to a whole subsection of a website. The virtual machine is configured by default to use 60GB as an absolute maximum. Any unused virtual machine disk space is not used on the host computer. You may, however, run the virtual machine on less than 60GB if you like to live dangerously. We're downloading the internet after all!<br />
<br />
=== The secondary disk is using up space even though it's not running a project. ===<br />
<br />
Virtual machine disk images do not behave like a regular file. There are several ways to reclaim space:<br />
<br />
* Delete the second disk and put back an empty disk. The warrior should reformat the second disk.<br />
* Delete the entire warrior application and re-import it.<br />
* Use the zerofree program and then clone the disk image. Reattach the cloned disk image.<br />
<br />
=== I can't connect to localhost? ===<br />
<br />
The application includes a configuration to set up port forwarding to the guest machine on port 8001 so you can access the interface through your web browser. If this does not happen, you may need to double check your machine's network settings.<br />
<br />
=== I'm looking at the text scrolling by and I notice some errors? Rsync is not working? ===<br />
<br />
Uh-oh! Something is not right. Notify us immediately in the appropriate [[IRC]] channel.<br />
<br />
=== The warrior seems to have too much overhead. I can't run a VM in a VPS! ===<br />
<br />
You don't need to run a virtual machine. If you are managing a VPS, it's likely you are comfortable with some Linux stuff. Projects can be run manually. Consult the project wiki page or the source code repository readme file.<br />
<br />
=== Why a virtual machine in the first place? ===<br />
<br />
The virtual machine is a quick, safe, and easy way for newcomers to help us out. It offers many features:<br />
<br />
* Graphical interface<br />
* Automatically selects which project is important to run<br />
* Self-updating software infrastructure<br />
* Allows for unattended use<br />
* In case of software faults, your machine is not ruined<br />
* Restarts itself in case of runaway programs<br />
* Runs on Windows, Mac OS, Linux painlessly<br />
<br />
If you have suggestions for improving this system, please talk to us as described below.<br />
<br />
=== I still have a question! ===<br />
<br />
Talk to us on [[IRC]]. Use [irc://irc.efnet.org/warrior #warrior] for specific warrior questions or [irc://irc.efnet.org/archiveteam #archiveteam] for general questions.<br />
<br />
== Projects ==<br />
<br />
Previous and current warrior projects:<br />
<br />
{| class="wikitable"<br />
! Project !! Status !! Began !! Finished !! Result !! Archive Location<br />
|-<br />
| [[MobileMe]] || '''Archive Posted''' || April 3, 2012 || August 8, 2012 || Success || [http://archive.org/details/archiveteam-mobileme-hero archive] [http://archive.org/details/archiveteam-mobileme-index index] [http://archive.org/download/archiveteam-mobileme-index/mobileme-20120817.html user lookup]<br />
|-<br />
| [[FortuneCity]] || '''Archive Posted''' || April 4, 2012 || April 11, 2012 || Partial Success || [http://archive.org/details/archiveteam-fortunecity archive] [http://archive.org/download/test-memac-index-test/fortunecity.html user lookup]<br />
|-<br />
| [[Tabblo]] || '''Archive Posted''' || May 23, 2012 || May 26, 2012 || Success || [http://archive.org/details/tabblo-archive archive] [http://archive.org/download/test-memac-index-test/tabblo.html user lookup]<br />
|-<br />
| [[Picplz]] || '''Archive Posted''' || June 3, 2012 || June 15, 2012 || || [http://archive.org/details/archiveteam-picplz archive] [http://archive.org/details/archiveteam-picplz-index index] [http://archive.org/download/archiveteam-picplz-index/picplz-20120823.html user lookup]<br />
|-<br />
| [[Tumblr]] (test project) || '''Archive Posted''' || August 9, 2012 || August 19, 2012 || || [http://archive.org/details/archiveteam-tumblr-test archive (tar)] [http://archive.org/details/archiveteam-tumblr-test-warc archive (warc)]<br />
|-<br />
| [[Cinch]].FM || '''Archive Posted''' || August 20, 2012 || August 22, 2012 || Success || [http://archive.org/details/archiveteam-cinch archive]<br />
|-<br />
| [[City of Heroes]] || '''Archive Posted''' || September 3, 2012 || December 1, 2012 || Success || [http://archive.org/details/archiveteam-city-of-heroes-www www] [http://archive.org/details/archiveteam-city-of-heroes-main forums] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-1 1] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-2 2] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-3 3] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-4 4] [http://archive.org/details/archiveteam-city-of-heroes-forums-megawarc-5 5]<br />
|-<br />
| [[Webshots]] || '''Archive Posted''' || October 4, 2012 || November 18, 2012 || || [http://archive.org/download/webshots-freeze-frame-index/index.html index]<br />
|-<br />
| [[BT Internet]] || '''Archive Posted''' || October 10, 2012 || November 2, 2012 || Success || [http://archive.org/details/archiveteam-btinternet archive]<br />
|-<br />
| [[DailyBooth| Daily Booth]] || '''Archive Posted''' || November 19, 2012 || December 29, 2012 || || [http://archive.org/details/archiveteam_dailybooth archive] [http://archive.org/download/dailybooth-freeze-frame-index/index.html lookup]<br />
|-<br />
| [[GitHub Downloads]] || '''Archive Posted''' || December 13, 2012 || December 17, 2012 || Success || [http://archive.org/details/github-downloads-2012-12 archive] [http://archive.org/details/archiveteam-github-repository-index-201212 index]<br />
|-<br />
| [[Yahoo! Blog]] || '''Archive Posted''' || January 8, 2013 || January 19, 2013 || || [http://archive.org/details/yahoo_korea_blogs archive]<br />
|-<br />
| [[weblog.nl]] || '''Archive Posted''' || January 19, 2013 || February 2, 2013 || || [http://archive.org/details/archiveteam_weblognl archive] [http://archive.org/download/archiveteam_weblognl-index/ lookup]<br />
|-<br />
| [[URLTeam]] || Active || || || || [http://urlte.am/releases/2013-01-02/urlteam.torrent latest]<br />
|-<br />
| [[Punchfork]] || '''Archive Posted''' || January 11, 2013 || March 6, 2013 || || [http://archive.org/details/archiveteam_punchfork archive] [http://archive.org/download/archiveteam_punchfork_index/ user lookup]<br />
|-<br />
| [[Xanga]] || Downloads Paused || January 22, 2013 || February 16, 2013 || || [http://archive.org/details/archiveteam_xanga archive] [http://archive.org/download/archiveteam_xanga_index/ user lookup] [http://archive.org/details/archiveteam-xanga-userlist-20130142 user list]<br />
|-<br />
| [[Posterous]] || Downloads Finished || February 23, 2013 || June 29, 2013 || || [http://archive.org/details/archiveteam_posterous archive]<br />
|-<br />
| [[Storylane]] || Downloads Finished || March 8, 2013 || March 15, 2013 || ||<br />
|-<br />
| [[Yahoo! Messages]] || Downloads Finished || March 20, 2013 || March 31, 2013 || || [http://archive.org/details/archiveteam_yahoo_messages archive]<br />
|-<br />
| [[Formspring]] || Downloads Finished || March 24, 2013 || September 19, 2013 || Success || [http://archive.org/details/archiveteam_formspring archive]<br />
|-<br />
| [[Yahoo Upcoming]] || '''Archive Posted''' || April 20, 2013 || April 25, 2013 || || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Streetfiles]].org || Downloads Finished || April 28, 2013 || April 30, 2013 || Partial || [http://archive.org/details/archiveteam archive]<br />
|-<br />
| [[Xanga]] || Downloads Paused || June 21, 2013 || August 31, 2013 || || [http://archive.org/details/archiveteam_xanga archive] <br />
|-<br />
| [[Zapd]] || '''Archive Posted''' || October 1, 2013 || October 8, 2013 || Success || [https://archive.org/details/archiveteam_zapd archive]<br />
|}<br />
<br />
=== Status ===<br />
:; In Development : a future project<br />
:; Active : start up a Warrior and join the fun; this one is in progress right now<br />
:; Downloads Finished : we've finished downloading the data<br />
:; Archived : the collected data has been properly archived<br />
:; Archive Posted : the archive is available for download<br />
<br />
=== Result ===<br />
:; Success : downloaded all of the data and posted the archive publicly<br />
:; Qualified Success : either we couldn't get all of the data, or the archive can't be made public<br />
:; Failure : the site closed before we could download anything<br />
<br />
== Testing pre-production code ==<br />
<br />
(Don't do this unless you really need or want to.) If you are developing a warrior script, you can test it by switching your warrior from the <code>production</code> branch to the <code>master</code> branch.<br />
<br />
<ol><br />
<li>Start the warrior.</li><br />
<li>Press Alt+F2 and log in with username <code>root</code> and password <code>archiveteam</code>.</li><br />
<li><code>cd /home/warrior/warrior-code</code></li><br />
<li><code>sudo -u warrior git checkout master</code></li><br />
<li><code>reboot</code></li><br />
</ol><br />
<br />
By the same route you can return your warrior to the <code>production</code> branch.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17697Blip.tv2013-10-09T00:37:21Z<p>DukeNukem: </p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = blooper.tv<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* Pages like http://blip.tv/comedy-videos are paginated, but the page links point only to '#' and need JavaScript to work.<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page, which requires javascript.<br />
* Each page of results in a category lists only 8 shows.<br />
* Each show has x many episodes.<br />
* To view an episode list for a show you must have JavaScript enabled. This is also true for pagination on these pages.<br />
* Some shows have RSS feeds, for example http://blip.tv/ylse/rss<br />
* RSS feeds show only a partial list of episodes: http://blip.tv/schlomo/ has more episodes than its RSS feed lists.<br />
* Given the URL of a video's page, for example http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646, get-flash-video can download the video as an mp4 file.<br />
* There is no /sitemap.xml<br />
* robots.txt crawl delay is 1 second.<br />
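The pagination notes above can be turned into a small sketch. It pulls the <code>data-results_page</code> values out of category-page HTML and joins them to the site root, using only the Python standard library. The names <code>ResultsPageParser</code> and <code>extract_listing_urls</code> are hypothetical, and it is an untested assumption that requesting the resulting URLs directly returns usable listings. Nothing is fetched here; a real crawler should keep to the 1 second crawl delay.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResultsPageParser(HTMLParser):
    """Collects data-results_page attribute values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paths = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "data-results_page" and value:
                self.paths.append(value)


def extract_listing_urls(html, base="http://blip.tv/"):
    """Return absolute listing URLs found in the pagination anchors of `html`."""
    parser = ResultsPageParser()
    parser.feed(html)
    return [urljoin(base, path) for path in parser.paths]


# The three anchors quoted in the list above, used as sample input.
SAMPLE = """
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a>
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a>
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a>
"""

urls = extract_listing_urls(SAMPLE)
for url in urls:
    print(url)
```

Because only 3 pagination links are shown at a time, a crawler built on this would probably have to keep incrementing the <code>page</code> parameter until a listing comes back empty; that stopping rule is a guess, not something confirmed against the site.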
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17696Blip.tv2013-10-08T23:17:00Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page, which requires javascript.<br />
* Each page of results in a category is only 8 shows at a time.<br />
* Each show has x many episodes<br />
* To view an episode list for a show you must have javascript enabled. This is also true for pagination on these pages.<br />
* Some shows have rss feeds. Example http://blip.tv/ylse/rss<br />
* RSS feeds only show a partial list of episodes. http://blip.tv/schlomo/ has more episodes than listed in the RSS feed.<br />
* If you have the url of a video's page http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646 for example get-flash-video can download the video in a nice mp4 file.<br />
* There is no /sitemap.xml<br />
* robots.txt crawl delay is 1 second.<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17695Blip.tv2013-10-08T23:13:41Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page, which requires javascript.<br />
* Each page of results in a category is only 8 shows at a time.<br />
* Each show has x many episodes<br />
* To view an episode list for a show you must have javascript enabled. This is also true for pagination on these pages.<br />
* Some shows have rss feeds. Example http://blip.tv/ylse/rss<br />
* RSS feeds might have the whole video history of a show but it has only been tested with a show with 54 episodes.<br />
* If you have the url of a video's page http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646 for example get-flash-video can download the video in a nice mp4 file.<br />
* There is no /sitemap.xml<br />
* robots.txt crawl delay is 1 second.<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17694Blip.tv2013-10-08T23:10:27Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page, which requires javascript.<br />
* Each page of results in a category is only 8 shows at a time.<br />
* Each show has x many episodes<br />
* To view an episode list for a show you must have javascript enabled. This is also true for pagination on these pages.<br />
* Some shows have rss feeds. Example http://blip.tv/ylse/rss<br />
* If you have the url of a video's page http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646 for example get-flash-video can download the video in a nice mp4 file.<br />
* There is no /sitemap.xml<br />
* robots.txt crawl delay is 1 second.<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17693Blip.tv2013-10-08T23:06:15Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page, which requires javascript.<br />
* Each page of results in a category is only 8 shows at a time.<br />
* Each show has x many episodes<br />
* To view an episode list for a show you must have javascript enabled. This is also true for pagination on these pages.<br />
* If you have the url of a video's page http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646 for example get-flash-video can download the video in a nice mp4 file.<br />
* There is no /sitemap.xml<br />
* robots.txt crawl delay is 1 second.<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17692Blip.tv2013-10-08T22:55:24Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page, which requires javascript.<br />
* Each page of results in a category is only 8 shows at a time.<br />
* Each show has x many episodes<br />
* To view an episode list for a show you must have javascript enabled. This is also true for pagination on these pages.<br />
* If you have the url of a video's page http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646 for example get-flash-video can download the video in a nice mp4 file.<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17691Blip.tv2013-10-08T22:50:02Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page.<br />
* Each page of results in a category is only 8 shows at a time.<br />
* If you have the url of a video's page http://blip.tv/zomblogalypse/zomblogalypse-series-trailer-5617646 for example get-flash-video can download the video in a nice mp4 file.<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17690Blip.tv2013-10-08T22:29:49Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a category has you must click the "double arrow right" at the bottom of the page.<br />
* Each page of results in a category is only 8 shows at a time.<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17689Blip.tv2013-10-08T22:25:07Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
* Only 3 links of pagination are shown at a time. To find out how many pages a channel has you must click the "double arrow right" at the bottom of the page.<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17688Blip.tv2013-10-08T22:23:23Z<p>DukeNukem: /* Site Structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information. Here is an example:<br />
<a class='currentResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=1">1</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=2">2</a><br />
<a class='advanceResults' href="#" data-results_page="/channel/get_directory_listing?channels_id=46&page=3">3</a><br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Blip.tv&diff=17687Blip.tv2013-10-08T22:19:20Z<p>DukeNukem: </p>
<hr />
<div>{{Infobox project<br />
| title = Blip.tv<br />
| image = Blip.tv_1303512711518.png<br />
| description = <br />
| URL = http://blip.tv/<br />
| project_status = {{online}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
'''Blip.tv''' is a [[Video hostings|video sharing]] website.<br />
<br />
{{expand}}<br />
<br />
== Site Structure ==<br />
* On pages like http://blip.tv/comedy-videos there is pagination but the page links are only '#' and need js to work<br />
* Each anchor tag does have a data-results_page value that appears to carry the url information.<br />
<br />
<br />
<br />
== External links ==<br />
* {{url|1=http://blip.tv/|2=Blip.tv}}<br />
<br />
{{Navigation box}}<br />
<br />
[[Category:Video hostings]]</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17651Tracker2013-10-04T04:22:36Z<p>DukeNukem: /* People */</p>
<hr />
<div>== General Overview ==<br />
<br />
[[File:Tracker_test_project_overview_screenshot.png|right|thumb|Project admin overview]]<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the centerpiece of Archive Team's distributed archiving efforts. It hands out items to be downloaded and keeps track of what is completed. Items can be usernames, subdomains, full URLs, or basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leaderboard interface on http://tracker.archiveteam.org .<br />
<br />
[[File:Xanga_leaderboard.png|right|thumb|A leaderboard]]<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
__TOC__<br />
<br />
== API ==<br />
<br />
This is a sample from the project file (line breaks included for readability):<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": <br />
"<a href='http://tracker.archiveteam.org/streetfiles/'><br />
<img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png'<br />
alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
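As a rough illustration of how a client might read one such entry, the sketch below parses a trimmed copy of the sample with the Python standard library and picks out the grab repository, the leaderboard URL and the deadline. The helper name <code>summarize_project</code> is hypothetical and this is not the Warrior's actual code. The sketch also assumes each entry is valid JSON once the readability line breaks inside <code>marker_html</code> are removed, so that field is left out here.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the sample entry above (marker_html, logo and host omitted).
SAMPLE_ENTRY = """
{
  "name": "streetfiles",
  "title": "Streetfiles",
  "description": "Streetfiles is closing April, 30th, 2013.",
  "repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",
  "deadline": "2013-04-30T23:59:59Z",
  "leaderboard": "http://tracker.archiveteam.org/streetfiles/",
  "lat_lng": [51, 9]
}
"""


def summarize_project(entry_json):
    """Parse one project entry and return the fields a client would act on."""
    entry = json.loads(entry_json)
    # The sample deadline is an ISO 8601 timestamp in UTC ("Z" suffix).
    deadline = datetime.strptime(
        entry["deadline"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "name": entry["name"],
        "repository": entry["repository"],
        "leaderboard": entry["leaderboard"],
        "deadline": deadline,
    }


project = summarize_project(SAMPLE_ENTRY)
print(project["name"], project["repository"], project["deadline"].isoformat())
```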
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. <br />
<br />
== Monitoring ==<br />
<br />
http://tracker.archiveteam.org has a Munin instance located at http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/.<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app to manage the Warriors and display the geo-location world map.<br />
<br />
You can also [[Tracker_Setup|set up your own tracker]].<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
{| class="wikitable"<br />
! Service:<br />
! Admins:<br />
|-<br />
|Wiki Admins <br />
|SketchCow, winr4r <br />
|-<br />
|warriorhq.archiveteam.org (projects.json) <br />
|Smiley<br />
|-<br />
|Universal Tracker SSH<br />
|alard, Smiley, underscor, yipdw, xmc<br />
|-<br />
|Universal Tracker web interface<br />
|alard, GLaDOS, omf_, Smiley, underscor<br />
|-<br />
|Anarchive server<br />
|GLaDOS, omf_, Smiley<br />
|-<br />
|URLTeam Tracker software<br />
|GLaDOS, omf_, Smiley<br />
|-<br />
|Github Organization Admins<br />
|GLaDOS, ivan, omf_<br />
|-<br />
|#archiveteam-twitter twitter to IRC bot<br />
|GLaDOS<br />
|-<br />
|pad.archivingyoursh.it <br />
paste.archivingyoursh.it<br />
|GLaDOS<br />
|-<br />
|Domain registration (archiveteam.org urlte.am)<br />
|SketchCow<br />
|}</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17646Tracker2013-10-03T18:50:47Z<p>DukeNukem: /* People */</p>
<hr />
<div>== General Overview ==<br />
<br />
[[File:Tracker_test_project_overview_screenshot.png|right|thumb|Project admin overview]]<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the center-pivot of Archiveteam's distributed archiving efforts. It hands out items to be downloaded and keeps track of what is completed. Items can be usernames, subdomains, full urls, basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leader board interface on http://tracker.archiveteam.org .<br />
<br />
[[File:Xanga_leaderboard.png|right|thumb|A leaderboard]]<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
__TOC__<br />
<br />
== API ==<br />
<br />
This is a sample from the project file (line breaks included for readability):<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": <br />
"<a href='http://tracker.archiveteam.org/streetfiles/'><br />
<img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png'<br />
alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. <br />
<br />
== Monitoring ==<br />
<br />
http://tracker.archiveteam.org has a Munin instance located at http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/.<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app to manage the Warriors and display the geo-location world map.<br />
<br />
You can also [[Tracker_Setup|set up your own tracker]].<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
{| class="wikitable"<br />
! Service:<br />
! Admins:<br />
|-<br />
|Wiki Admins <br />
|SketchCow, winr4r <br />
|-<br />
|warriorhq.archiveteam.org (projects.json) <br />
|Smiley<br />
|-<br />
|Universal Tracker SSH<br />
|alard, Smiley, underscor, yipdw, xmc<br />
|-<br />
|Universal Tracker web interface<br />
|alard, omf_, Smiley, underscor<br />
|-<br />
|Anarchive server<br />
|GLaDOS, omf_, Smiley<br />
|-<br />
|URLTeam Tracker software<br />
|GLaDOS, omf_, Smiley<br />
|-<br />
|Github Organization Admins<br />
|GLaDOS, ivan, omf_<br />
|-<br />
|#archiveteam-twitter twitter to IRC bot<br />
|GLaDOS<br />
|-<br />
|pad.archivingyoursh.it <br />
paste.archivingyoursh.it<br />
|GLaDOS<br />
|-<br />
|Domain registration (archiveteam.org urlte.am)<br />
|SketchCow<br />
|}</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17645Tracker2013-10-03T18:42:33Z<p>DukeNukem: /* People */</p>
<hr />
<div>== General Overview ==<br />
<br />
[[File:Tracker_test_project_overview_screenshot.png|right|thumb|Project admin overview]]<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the center-pivot of Archiveteam's distributed archiving efforts. It hands out items to be downloaded and keeps track of what is completed. Items can be usernames, subdomains, full urls, basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leader board interface on http://tracker.archiveteam.org .<br />
<br />
[[File:Xanga_leaderboard.png|right|thumb|A leaderboard]]<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
__TOC__<br />
<br />
== API ==<br />
<br />
This is a sample from the project file (line breaks included for readability):<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": <br />
"<a href='http://tracker.archiveteam.org/streetfiles/'><br />
<img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png'<br />
alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. <br />
<br />
== Monitoring ==<br />
<br />
http://tracker.archiveteam.org has a Munin instance located at http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/.<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app to manage the Warriors and display the geo-location world map.<br />
<br />
You can also [[Tracker_Setup|set up your own tracker]].<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
{| class="wikitable"<br />
! Service:<br />
! Admins:<br />
|-<br />
|Wiki Admins <br />
|SketchCow, winr4r <br />
|-<br />
|warriorhq.archiveteam.org (projects.json) <br />
|Smiley<br />
|-<br />
|Universal Tracker SSH<br />
|alard, Smiley, underscor, xmc<br />
|-<br />
|Universal Tracker web interface<br />
|alard, omf_, Smiley, underscor<br />
|-<br />
|Anarchive server<br />
|GLaDOS, omf_, Smiley<br />
|-<br />
|URLTeam Tracker software<br />
|GLaDOS, omf_, Smiley<br />
|-<br />
|Github Organization Admins<br />
|GLaDOS, omf_, ivan<br />
|-<br />
|#archiveteam-twitter twitter to IRC bot<br />
|GLaDOS<br />
|-<br />
|pad.archivingyoursh.it <br />
paste.archivingyoursh.it<br />
|GLaDOS<br />
|-<br />
|Domain registration (archiveteam.org urlte.am)<br />
|SketchCow<br />
|}</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17644Tracker2013-10-03T18:25:17Z<p>DukeNukem: /* People */</p>
<hr />
<div>== General Overview ==<br />
<br />
[[File:Tracker_test_project_overview_screenshot.png|right|thumb|Project admin overview]]<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the center-pivot of Archiveteam's distributed archiving efforts. It hands out items to be downloaded and keeps track of what is completed. Items can be usernames, subdomains, full urls, basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leader board interface on http://tracker.archiveteam.org .<br />
<br />
[[File:Xanga_leaderboard.png|right|thumb|A leaderboard]]<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
__TOC__<br />
<br />
== API ==<br />
<br />
This is a sample from the project file (line breaks included for readability):<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": <br />
"<a href='http://tracker.archiveteam.org/streetfiles/'><br />
<img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png'<br />
alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
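<br />
As a rough illustration of how a client might read such an entry, here is a minimal Python sketch. It is not the Warrior's actual code. It works on an abbreviated copy of the sample above (the multi-line marker_html value and the trailing comma are left out so that the text parses as standalone JSON), and the helper name summarize_project is made up for this example.<br />
<br />
```python
import json
from datetime import datetime, timezone

# Abbreviated copy of the sample entry above; real entries carry more fields.
SAMPLE_ENTRY = """
{
  "name": "streetfiles",
  "title": "Streetfiles",
  "repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",
  "deadline": "2013-04-30T23:59:59Z",
  "leaderboard": "http://tracker.archiveteam.org/streetfiles/",
  "lat_lng": [51, 9]
}
"""


def summarize_project(raw):
    """Hypothetical helper: pull out the fields needed to start on a project."""
    entry = json.loads(raw)
    # The deadline is an ISO 8601 timestamp in UTC (trailing "Z").
    deadline = datetime.strptime(
        entry["deadline"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "name": entry["name"],
        "repository": entry["repository"],  # where to get the grab code
        "deadline": deadline,
        "leaderboard": entry["leaderboard"],
    }


summary = summarize_project(SAMPLE_ENTRY)
print(summary["name"], summary["repository"], summary["deadline"].isoformat())
```
<br />
The repository field is the part that tells a client where the grab code lives; the remaining fields are mostly for display.<br />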
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. <br />
<br />
== Monitoring ==<br />
<br />
http://tracker.archiveteam.org has a Munin instance located at http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/.<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app that manages the Warriors and displays the geo-location world map.<br />
<br />
You can also [[Tracker_Setup|set up your own tracker]].<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
{| class="wikitable"<br />
! Service:<br />
! Admins:<br />
|-<br />
|Wiki Admins <br />
|SketchCow, winr4r <br />
|-<br />
|warriorhq.archiveteam.org (projects.json) <br />
|xmc, Smiley<br />
|-<br />
|Universal Tracker SSH<br />
|alard, underscor, Smiley<br />
|-<br />
|Universal Tracker web interface<br />
|alard, underscor, omf_, Smiley<br />
|-<br />
|Anarchive server<br />
|GLaDOS, omf_, Smiley<br />
|-<br />
|URLTeam Tracker software<br />
|GLaDOS, omf_, Smiley<br />
|-<br />
|Github Organization Admins<br />
|GLaDOS, omf_, ivan<br />
|-<br />
|#archiveteam-twitter twitter to IRC bot<br />
|GLaDOS<br />
|-<br />
|pad.archivingyoursh.it <br />
paste.archivingyoursh.it<br />
|GLaDOS<br />
|-<br />
|Domain registration (archiveteam.org urlte.am)<br />
|SketchCow<br />
|}</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17638Tracker2013-10-03T18:10:59Z<p>DukeNukem: /* People */</p>
<hr />
<div>== General Overview ==<br />
<br />
[[File:Tracker_test_project_overview_screenshot.png|right|thumb|Project admin overview]]<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the centerpiece of Archive Team's distributed archiving efforts. It hands out items to be downloaded and keeps track of which ones are completed. Items can be usernames, subdomains, or full URLs: basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leaderboard interface on http://tracker.archiveteam.org .<br />
<br />
[[File:Xanga_leaderboard.png|right|thumb|A leaderboard]]<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
__TOC__<br />
<br />
== API ==<br />
<br />
This is a sample from the project file (line breaks included for readability):<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": <br />
"<a href='http://tracker.archiveteam.org/streetfiles/'><br />
<img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png'<br />
alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. <br />
<br />
== Monitoring ==<br />
<br />
http://tracker.archiveteam.org has a Munin instance located at http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/.<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app that manages the Warriors and displays the geo-location world map.<br />
<br />
You can also [[Tracker_Setup|set up your own tracker]].<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
* Wiki Admins - SketchCow, winr4r<br />
* warriorhq.archiveteam.org (projects.json) - xmc, Smiley<br />
* Warrior Tracker ssh - alard, underscor, Smiley<br />
* Universal Tracker web interface - alard, underscor, omf_, Smiley<br />
* Anarchive server - GLaDOS, omf_, Smiley<br />
* URLTeam Tracker software - GLaDOS, omf_, Smiley<br />
* Github Organization Admins - GLaDOS, omf_, ivan<br />
* #archiveteam-twitter twitter to IRC bot - GLaDOS<br />
* pad.archivingyoursh.it & paste.archivingyoursh.it - GLaDOS<br />
* Domain registration (archiveteam.org urlte.am) - SketchCow</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Zapd&diff=17625Zapd2013-09-30T17:03:52Z<p>DukeNukem: /* Site structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Zapd<br />
| URL = http://zapd.com/<br />
| logo = Zapd_logo.png<br />
| image = Zapd_homepage_screenshot.png<br />
| project_status = {{closing}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = at-zapd<br />
}}<br />
<br />
“'''Zapd''' is like Tumblr, in that it makes making pretty websites super easy, but Zapd does all its web building magic from your iPhone.”<br />
<br />
== Shutdown ==<br />
<br />
=== The News ===<br />
<br />
<blockquote><br />
<p>Fast-growing RealSelf gobbles up Zapd, names Kelly Smith chief experience officer</p><br />
<br />
<p>''September 11, 2013 at 8:26 am by John Cook''</p><br />
<br />
<p>You could call this an “acquihire.” But, in this case, it’s really just about grabbing the talents of one person.</p><br />
<br />
<p>RealSelf is buying Pressplane, the parent company of Zapd, picking up the skills of experienced entrepreneur and designer Kelly Smith in the process. Zapd will be shut down at the end of the month, though some of the technology will be carried over to RealSelf, which is building what it dubs the world’s largest community of cosmetic surgery, dermatology and dentistry.<ref>http://www.geekwire.com/2013/fastgrowing-profitabe-realself-gobbles-zapd-names-kelly-smith-chief-experience-officer/</ref></p><br />
</blockquote><br />
<br />
=== The Email ===<br />
''September 28, 2013''<br />
<blockquote><br />
<p>Zapd has been acquired!</p><br />
<p>The Zapd service will be discontinued on October 7, 2013</p><br />
<br />
<p>Today I wanted to share that Zapd has been acquired by RealSelf. RealSelf is the leading online resource for elective cosmetic medical procedures. As the new Chief Experience Designer, I'll be leveraging everything we learned at Zapd to help build a better mobile engagement experience. The Zapd website and mobile apps will stay up until October 7, 2013 and then will be shutting down.<ref>http://pastie.org/8362982</ref></p><br />
</blockquote><br />
<br />
<br />
== Site structure ==<br />
<br />
* All content is served via JavaScript; with JavaScript disabled you just get an empty template.<br />
* They have a URL shortener, zapd.co.<br />
* The zapd.co URL scheme is #.zapd.co, #[a-z].zapd.co, #[a-z][a-z].zapd.co, or [a-z]#.zapd.co, where # represents a digit 0-9.<br />
* Had 350k app downloads in 2011. At worst they have 500,000 users.<br />
* No API.<br />
* All images are hosted on CloudFront.<br />
* Comments have no separate page or URL for items. They are served dynamically.<br />
* You cannot view the like information without logging into Facebook.<br />
* It only shows the newest 5 comments for any item. There appears to be no way to see older comments or show all comments.<br />
* Working example URL: http://anna-heimbichner.zapd.com/cake-pops<br />
* Each "story" page has all the necessary data as a JSON blob instead of a script section.<br />
* The JSON blob has all the image URLs under "full_image_url", and users can be found via "Contributor" -> "url".<br />
<br />
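The shortener patterns listed above describe a small, finite set of hostnames, so they can simply be enumerated. The following Python sketch does that, assuming the patterns are meant literally as subdomains of zapd.co and that # is a single digit; the function name zapd_short_hosts is made up for this example.<br />
<br />
```python
from itertools import product
from string import ascii_lowercase, digits


def zapd_short_hosts():
    """Yield every hostname matching the four patterns listed above."""
    patterns = [
        (digits,),                                   # #.zapd.co
        (digits, ascii_lowercase),                   # #[a-z].zapd.co
        (digits, ascii_lowercase, ascii_lowercase),  # #[a-z][a-z].zapd.co
        (ascii_lowercase, digits),                   # [a-z]#.zapd.co
    ]
    for alphabets in patterns:
        # product() walks every combination of one character per position.
        for chars in product(*alphabets):
            yield "".join(chars) + ".zapd.co"


hosts = list(zapd_short_hosts())
print(len(hosts))  # 10 + 260 + 6760 + 260 = 7290
print(hosts[:3])
```
<br />
With only 7,290 candidates, probing every hostname is cheap compared with discovering the story URLs themselves.<br />
<br />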
== Crawling Process ==<br />
<br />
* We are still trying to discover urls.<br />
* We are working on code to scrape the content.<br />
* Commoncrawl had no urls for zapd.<br />
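<br />
For the scraping code, one possible starting point is to walk the JSON blob described under Site structure and collect the image and contributor URLs. The Python sketch below is only an assumption about how that could look: the key names "full_image_url", "Contributor" and "url" come from the notes above, while the surrounding blob layout, the example.invalid URLs and the function name collect are invented for illustration.<br />
<br />
```python
import json


def collect(node, image_urls, contributor_urls):
    """Recursively walk parsed JSON, gathering image and contributor URLs."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "full_image_url" and isinstance(value, str):
                image_urls.append(value)
            elif key == "Contributor" and isinstance(value, dict) and "url" in value:
                contributor_urls.append(value["url"])
            else:
                # Unknown key: keep descending, since the real layout is not documented.
                collect(value, image_urls, contributor_urls)
    elif isinstance(node, list):
        for item in node:
            collect(item, image_urls, contributor_urls)


# Hypothetical blob: only the key names are taken from the notes above.
SAMPLE_BLOB = json.loads("""
{
  "Story": {
    "Contributor": {"url": "http://example.invalid/user"},
    "items": [
      {"full_image_url": "http://example.invalid/a.jpg"},
      {"full_image_url": "http://example.invalid/b.jpg"}
    ]
  }
}
""")

images, contributors = [], []
collect(SAMPLE_BLOB, images, contributors)
print(images)
print(contributors)
```
<br />
Walking the whole structure rather than indexing fixed paths keeps the sketch usable even if the real nesting differs from the sample.<br />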
<br />
<br />
<br />
== References ==<br />
<br />
<references/></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17622Tracker2013-09-30T14:28:00Z<p>DukeNukem: /* People */</p>
<hr />
<div>== General Overview ==<br />
<br />
[[File:Tracker_test_project_overview_screenshot.png|right|thumb|Project admin overview]]<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the centerpiece of Archive Team's distributed archiving efforts. It hands out items to be downloaded and keeps track of which ones are completed. Items can be usernames, subdomains, or full URLs: basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leaderboard interface on http://tracker.archiveteam.org .<br />
<br />
[[File:Xanga_leaderboard.png|right|thumb|A leaderboard]]<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
__TOC__<br />
<br />
== API ==<br />
<br />
This is a sample from the project file (line breaks included for readability):<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": <br />
"<a href='http://tracker.archiveteam.org/streetfiles/'><br />
<img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png'<br />
alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. <br />
<br />
== Monitoring ==<br />
<br />
http://tracker.archiveteam.org has a Munin instance located at http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/.<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app that manages the Warriors and displays the geo-location world map.<br />
<br />
You can also [[Tracker_Setup|set up your own tracker]].<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
* Wiki Admins - SketchCow, winr4r<br />
* warriorhq.archiveteam.org (projects.json) - xmc, Smiley<br />
* Warrior Tracker ssh - alard, underscor, Smiley<br />
* Warrior Tracker web - alard, underscor, omf_, Smiley<br />
* Anarchive server - GLaDOS, omf_, Smiley<br />
* URLTeam Tracker software - GLaDOS, omf_, Smiley<br />
* Github Organization Admins - GLaDOS, omf_, ivan<br />
* #archiveteam-twitter twitter to IRC bot - GLaDOS<br />
* pad.archivingyoursh.it & paste.archivingyoursh.it - GLaDOS<br />
* Domain registration (archiveteam.org urlte.am) - SketchCow</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Zapd&diff=17621Zapd2013-09-30T14:24:21Z<p>DukeNukem: </p>
<hr />
<div>{{Infobox project<br />
| title = Zapd<br />
| URL = http://zapd.com/<br />
| logo = Zapd_logo.png<br />
| image = Zapd_homepage_screenshot.png<br />
| project_status = {{closing}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = at-zapd<br />
}}<br />
<br />
“'''Zapd''' is like Tumblr, in that it makes making pretty websites super easy, but Zapd does all its web building magic from your iPhone.”<br />
<br />
== Shutdown ==<br />
<br />
=== The News ===<br />
<br />
<blockquote><br />
<p>Fast-growing RealSelf gobbles up Zapd, names Kelly Smith chief experience officer</p><br />
<br />
<p>''September 11, 2013 at 8:26 am by John Cook''</p><br />
<br />
<p>You could call this an “acquihire.” But, in this case, it’s really just about grabbing the talents of one person.</p><br />
<br />
<p>RealSelf is buying Pressplane, the parent company of Zapd, picking up the skills of experienced entrepreneur and designer Kelly Smith in the process. Zapd will be shut down at the end of the month, though some of the technology will be carried over to RealSelf, which is building what it dubs the world’s largest community of cosmetic surgery, dermatology and dentistry.<ref>http://www.geekwire.com/2013/fastgrowing-profitabe-realself-gobbles-zapd-names-kelly-smith-chief-experience-officer/</ref></p><br />
</blockquote><br />
<br />
=== The Email ===<br />
''September 28, 2013''<br />
<blockquote><br />
<p>Zapd has been acquired!</p><br />
<p>The Zapd service will be discontinued on October 7, 2013</p><br />
<br />
<p>Today I wanted to share that Zapd has been acquired by RealSelf. RealSelf is the leading online resource for elective cosmetic medical procedures. As the new Chief Experience Designer, I'll be leveraging everything we learned at Zapd to help build a better mobile engagement experience. The Zapd website and mobile apps will stay up until October 7, 2013 and then will be shutting down.<ref>http://pastie.org/8362982</ref></p><br />
</blockquote><br />
<br />
<br />
== Site structure ==<br />
<br />
* All content is served via JavaScript; with JavaScript disabled you just get an empty template.<br />
* They have a URL shortener, zapd.co.<br />
* The zapd.co URL scheme is #.zapd.co, #[a-z].zapd.co, #[a-z][a-z].zapd.co, or [a-z]#.zapd.co, where # represents a digit 0-9.<br />
* Had 350k app downloads in 2011. At worst they have 500,000 users.<br />
* No API.<br />
* All images are hosted on CloudFront.<br />
* Comments have no separate page or URL for items. They are served dynamically.<br />
* You cannot view the like information without logging into Facebook.<br />
* It only shows the newest 5 comments for any item. There appears to be no way to see older comments or show all comments.<br />
* Working example URL: http://anna-heimbichner.zapd.com/cake-pops<br />
<br />
== Crawling Process ==<br />
<br />
* We are still trying to discover urls.<br />
* We are working on code to scrape the content.<br />
* Commoncrawl had no urls for zapd.<br />
<br />
<br />
<br />
== References ==<br />
<br />
<references/></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Zapd&diff=17620Zapd2013-09-30T14:18:25Z<p>DukeNukem: /* Site structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Zapd<br />
| URL = http://zapd.com/<br />
| logo = Zapd_logo.png<br />
| image = Zapd_homepage_screenshot.png<br />
| project_status = {{closing}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = at-zapd<br />
}}<br />
<br />
“'''Zapd''' is like Tumblr, in that it makes making pretty websites super easy, but Zapd does all its web building magic from your iPhone.”<br />
<br />
== Shutdown ==<br />
<br />
=== The News ===<br />
<br />
<blockquote><br />
<p>Fast-growing RealSelf gobbles up Zapd, names Kelly Smith chief experience officer</p><br />
<br />
<p>''September 11, 2013 at 8:26 am by John Cook''</p><br />
<br />
<p>You could call this an “acquihire.” But, in this case, it’s really just about grabbing the talents of one person.</p><br />
<br />
<p>RealSelf is buying Pressplane, the parent company of Zapd, picking up the skills of experienced entrepreneur and designer Kelly Smith in the process. Zapd will be shut down at the end of the month, though some of the technology will be carried over to RealSelf, which is building what it dubs the world’s largest community of cosmetic surgery, dermatology and dentistry.<ref>http://www.geekwire.com/2013/fastgrowing-profitabe-realself-gobbles-zapd-names-kelly-smith-chief-experience-officer/</ref></p><br />
</blockquote><br />
<br />
=== The Email ===<br />
''September 28, 2013''<br />
<blockquote><br />
<p>Zapd has been acquired!</p><br />
<p>The Zapd service will be discontinued on October 7, 2013</p><br />
<br />
<p>Today I wanted to share that Zapd has been acquired by RealSelf. RealSelf is the leading online resource for elective cosmetic medical procedures. As the new Chief Experience Designer, I'll be leveraging everything we learned at Zapd to help build a better mobile engagement experience. The Zapd website and mobile apps will stay up until October 7, 2013 and then will be shutting down.<ref>http://pastie.org/8362982</ref></p><br />
</blockquote><br />
<br />
<br />
== Site structure ==<br />
<br />
* All content is served via JavaScript; with JavaScript disabled you just get an empty template.<br />
* They have a URL shortener, zapd.co.<br />
* The zapd.co URL scheme is #.zapd.co, #[a-z].zapd.co, #[a-z][a-z].zapd.co, or [a-z]#.zapd.co, where # represents a digit 0-9.<br />
* Had 350k app downloads in 2011. At worst they have 500,000 users.<br />
* No API.<br />
* All images are hosted on CloudFront.<br />
* Comments have no separate page or URL for items. They are served dynamically.<br />
* You cannot view the like information without logging into Facebook.<br />
* It only shows the newest 5 comments for any item. There appears to be no way to see older comments or show all comments.<br />
* Working example URL: http://anna-heimbichner.zapd.com/cake-pops<br />
<br />
== References ==<br />
<br />
<references/></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Zapd&diff=17619Zapd2013-09-30T14:16:44Z<p>DukeNukem: /* Site structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Zapd<br />
| URL = http://zapd.com/<br />
| logo = Zapd_logo.png<br />
| image = Zapd_homepage_screenshot.png<br />
| project_status = {{closing}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = at-zapd<br />
}}<br />
<br />
“'''Zapd''' is like Tumblr, in that it makes making pretty websites super easy, but Zapd does all its web building magic from your iPhone.”<br />
<br />
== Shutdown ==<br />
<br />
=== The News ===<br />
<br />
<blockquote><br />
<p>Fast-growing RealSelf gobbles up Zapd, names Kelly Smith chief experience officer</p><br />
<br />
<p>''September 11, 2013 at 8:26 am by John Cook''</p><br />
<br />
<p>You could call this an “acquihire.” But, in this case, it’s really just about grabbing the talents of one person.</p><br />
<br />
<p>RealSelf is buying Pressplane, the parent company of Zapd, picking up the skills of experienced entrepreneur and designer Kelly Smith in the process. Zapd will be shut down at the end of the month, though some of the technology will be carried over to RealSelf, which is building what it dubs the world’s largest community of cosmetic surgery, dermatology and dentistry.<ref>http://www.geekwire.com/2013/fastgrowing-profitabe-realself-gobbles-zapd-names-kelly-smith-chief-experience-officer/</ref></p><br />
</blockquote><br />
<br />
=== The Email ===<br />
''September 28, 2013''<br />
<blockquote><br />
<p>Zapd has been acquired!</p><br />
<p>The Zapd service will be discontinued on October 7, 2013</p><br />
<br />
<p>Today I wanted to share that Zapd has been acquired by RealSelf. RealSelf is the leading online resource for elective cosmetic medical procedures. As the new Chief Experience Designer, I'll be leveraging everything we learned at Zapd to help build a better mobile engagement experience. The Zapd website and mobile apps will stay up until October 7, 2013 and then will be shutting down.<ref>http://pastie.org/8362982</ref></p><br />
</blockquote><br />
<br />
<br />
== Site structure ==<br />
<br />
* All content is served via JavaScript; with JavaScript disabled you just get an empty template.<br />
* They have a URL shortener, zapd.co.<br />
* The zapd.co URL scheme is #.zapd.co, #[a-z].zapd.co, #[a-z][a-z].zapd.co, or [a-z]#.zapd.co, where # represents a digit 0-9.<br />
* Had 350k app downloads in 2011. At worst they have 500,000 users.<br />
* No API.<br />
* All images are hosted on CloudFront.<br />
* Comments have no separate page or URL for items. They are served dynamically.<br />
* You cannot view the like information without logging into Facebook.<br />
* It only shows the newest 5 comments for any item. There appears to be no way to see older comments or show all comments.<br />
<br />
== References ==<br />
<br />
<references/></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Zapd&diff=17618Zapd2013-09-30T14:15:01Z<p>DukeNukem: /* Site structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Zapd<br />
| URL = http://zapd.com/<br />
| logo = Zapd_logo.png<br />
| image = Zapd_homepage_screenshot.png<br />
| project_status = {{closing}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = at-zapd<br />
}}<br />
<br />
“'''Zapd''' is like Tumblr, in that it makes making pretty websites super easy, but Zapd does all its web building magic from your iPhone.”<br />
<br />
== Shutdown ==<br />
<br />
=== The News ===<br />
<br />
<blockquote><br />
<p>Fast-growing RealSelf gobbles up Zapd, names Kelly Smith chief experience officer</p><br />
<br />
<p>''September 11, 2013 at 8:26 am by John Cook''</p><br />
<br />
<p>You could call this an “acquihire.” But, in this case, it’s really just about grabbing the talents of one person.</p><br />
<br />
<p>RealSelf is buying Pressplane, the parent company of Zapd, picking up the skills of experienced entrepreneur and designer Kelly Smith in the process. Zapd will be shut down at the end of the month, though some of the technology will be carried over to RealSelf, which is building what it dubs the world’s largest community of cosmetic surgery, dermatology and dentistry.<ref>http://www.geekwire.com/2013/fastgrowing-profitabe-realself-gobbles-zapd-names-kelly-smith-chief-experience-officer/</ref></p><br />
</blockquote><br />
<br />
=== The Email ===<br />
''September 28, 2013''<br />
<blockquote><br />
<p>Zapd has been acquired!</p><br />
<p>The Zapd service will be discontinued on October 7, 2013</p><br />
<br />
<p>Today I wanted to share that Zapd has been acquired by RealSelf. RealSelf is the leading online resource for elective cosmetic medical procedures. As the new Chief Experience Designer, I'll be leveraging everything we learned at Zapd to help build a better mobile engagement experience. The Zapd website and mobile apps will stay up until October 7, 2013 and then will be shutting down.<ref>http://pastie.org/8362982</ref></p><br />
</blockquote><br />
<br />
<br />
== Site structure ==<br />
<br />
* All content is served via JavaScript; with JavaScript disabled you just get an empty template.<br />
* They have a URL shortener, zapd.co.<br />
* The zapd.co URL scheme is #.zapd.co, #[a-z].zapd.co, #[a-z][a-z].zapd.co, or [a-z]#.zapd.co.<br />
* Had 350k app downloads in 2011. At worst they have 500,000 users.<br />
* No API.<br />
* All images are hosted on CloudFront.<br />
* Comments have no separate page or URL for items. They are served dynamically.<br />
* You cannot view the like information without logging into Facebook.<br />
* It only shows the newest 5 comments for any item. There appears to be no way to see older comments or show all comments.<br />
<br />
== References ==<br />
<br />
<references/></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Current_Projects&diff=17615Current Projects2013-09-30T13:58:34Z<p>DukeNukem: /* Manual projects */</p>
<hr />
<div>== Warrior based projects ==<br />
* No warrior based projects!<br />
<br />
== Manual projects ==<br />
* [[AOL Music]]: Announced April 26, 2013; the saving effort started promptly. IRC Channel '''#aolsilence'''.<br />
* [[Gamespy, 1up, UGO, IGN]]: Shutdown date unknown, announced Feb 21, 2013. IRC Channel '''#ispygames'''.<br />
* [[Puu.sh]]: Adding a 1 month expiry to files. IRC Channel '''#pushharder'''.<br />
* [[Zapd]]: The Zapd service will be discontinued on October 7, 2013. IRC Channel '''#at-zapd'''. We are working on software to scrape the site.<br />
__NOTOC__<br />
<br />
== Progress ==<br />
<br />
We have an [http://tracker.archiveteam.org/ online tracker] that lists currently running warrior projects.</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Zapd&diff=17614Zapd2013-09-30T13:55:12Z<p>DukeNukem: </p>
<hr />
<div>{{Infobox project<br />
| title = Zapd<br />
| URL = http://zapd.com/<br />
| logo = Zapd_logo.png<br />
| image = Zapd_homepage_screenshot.png<br />
| project_status = {{closing}}<br />
| archiving_status = {{nosavedyet}}<br />
| irc = at-zapd<br />
}}<br />
<br />
“'''Zapd''' is like Tumblr, in that it makes making pretty websites super easy, but Zapd does all its web building magic from your iPhone.”<br />
<br />
== Shutdown ==<br />
<br />
=== The News ===<br />
<br />
<blockquote><br />
<p>Fast-growing RealSelf gobbles up Zapd, names Kelly Smith chief experience officer</p><br />
<br />
<p>''September 11, 2013 at 8:26 am by John Cook''</p><br />
<br />
<p>You could call this an “acquihire.” But, in this case, it’s really just about grabbing the talents of one person.</p><br />
<br />
<p>RealSelf is buying Pressplane, the parent company of Zapd, picking up the skills of experienced entrepreneur and designer Kelly Smith in the process. Zapd will be shut down at the end of the month, though some of the technology will be carried over to RealSelf, which is building what it dubs the world’s largest community of cosmetic surgery, dermatology and dentistry.<ref>http://www.geekwire.com/2013/fastgrowing-profitabe-realself-gobbles-zapd-names-kelly-smith-chief-experience-officer/</ref></p><br />
</blockquote><br />
<br />
=== The Email ===<br />
''September 28, 2013''<br />
<blockquote><br />
<p>Zapd has been acquired!</p><br />
<p>The Zapd service will be discontinued on October 7, 2013</p><br />
<br />
<p>Today I wanted to share that Zapd has been acquired by RealSelf. RealSelf is the leading online resource for elective cosmetic medical procedures. As the new Chief Experience Designer, I'll be leveraging everything we learned at Zapd to help build a better mobile engagement experience. The Zapd website and mobile apps will stay up until October 7, 2013 and then will be shutting down.<ref>http://pastie.org/8362982</ref></p><br />
</blockquote><br />
<br />
<br />
== Site structure ==<br />
<br />
* All content is served via JavaScript; with JavaScript disabled you just get an empty template.<br />
* They have a URL shortener, zapd.co.<br />
* The zapd.co URL scheme is #.zapd.co, #[a-z].zapd.co, #[a-z][a-z].zapd.co, or [a-z]#.zapd.co.<br />
* Had 350k app downloads in 2011. At worst they have 500,000 users.<br />
* No API.<br />
<br />
== References ==<br />
<br />
<references/></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Zapd&diff=17613Zapd2013-09-30T13:48:51Z<p>DukeNukem: /* Site structure */</p>
<hr />
<div>{{Infobox project<br />
| title = Zapd<br />
| URL = http://zapd.com/<br />
| logo = Zapd_logo.png<br />
| image = Zapd_homepage_screenshot.png<br />
| project_status = {{closing}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
“'''Zapd''' is like Tumblr, in that it makes making pretty websites super easy, but Zapd does all its web building magic from your iPhone.”<br />
<br />
== Shutdown ==<br />
<br />
=== The News ===<br />
<br />
<blockquote><br />
<p>Fast-growing RealSelf gobbles up Zapd, names Kelly Smith chief experience officer</p><br />
<br />
<p>''September 11, 2013 at 8:26 am by John Cook''</p><br />
<br />
<p>You could call this an “acquihire.” But, in this case, it’s really just about grabbing the talents of one person.</p><br />
<br />
<p>RealSelf is buying Pressplane, the parent company of Zapd, picking up the skills of experienced entrepreneur and designer Kelly Smith in the process. Zapd will be shut down at the end of the month, though some of the technology will be carried over to RealSelf, which is building what it dubs the world’s largest community of cosmetic surgery, dermatology and dentistry.<ref>http://www.geekwire.com/2013/fastgrowing-profitabe-realself-gobbles-zapd-names-kelly-smith-chief-experience-officer/</ref></p><br />
</blockquote><br />
<br />
=== The Email ===<br />
''September 28, 2013''<br />
<blockquote><br />
<p>Zapd has been acquired!</p><br />
<p>The Zapd service will be discontinued on October 7, 2013</p><br />
<br />
<p>Today I wanted to share that Zapd has been acquired by RealSelf. RealSelf is the leading online resource for elective cosmetic medical procedures. As the new Chief Experience Designer, I'll be leveraging everything we learned at Zapd to help build a better mobile engagement experience. The Zapd website and mobile apps will stay up until October 7, 2013 and then will be shutting down.<ref>http://pastie.org/8362982</ref></p><br />
</blockquote><br />
<br />
<br />
== Site structure ==<br />
<br />
* All content is served via JavaScript; with JavaScript disabled you just get an empty template.<br />
* They have a URL shortener, zapd.co.<br />
* The zapd.co URL scheme is #.zapd.co, #[a-z].zapd.co, #[a-z][a-z].zapd.co, or [a-z]#.zapd.co.<br />
* Had 350k app downloads in 2011. At worst they have 500,000 users.<br />
* No API.<br />
<br />
== References ==<br />
<br />
<references/></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Zapd&diff=17612Zapd2013-09-30T13:48:10Z<p>DukeNukem: </p>
<hr />
<div>{{Infobox project<br />
| title = Zapd<br />
| URL = http://zapd.com/<br />
| logo = Zapd_logo.png<br />
| image = Zapd_homepage_screenshot.png<br />
| project_status = {{closing}}<br />
| archiving_status = {{nosavedyet}}<br />
}}<br />
<br />
“'''Zapd''' is like Tumblr, in that it makes making pretty websites super easy, but Zapd does all its web building magic from your iPhone.”<br />
<br />
== Shutdown ==<br />
<br />
=== The News ===<br />
<br />
<blockquote><br />
<p>Fast-growing RealSelf gobbles up Zapd, names Kelly Smith chief experience officer</p><br />
<br />
<p>''September 11, 2013 at 8:26 am by John Cook''</p><br />
<br />
<p>You could call this an “acquihire.” But, in this case, it’s really just about grabbing the talents of one person.</p><br />
<br />
<p>RealSelf is buying Pressplane, the parent company of Zapd, picking up the skills of experienced entrepreneur and designer Kelly Smith in the process. Zapd will be shut down at the end of the month, though some of the technology will be carried over to RealSelf, which is building what it dubs the world’s largest community of cosmetic surgery, dermatology and dentistry.<ref>http://www.geekwire.com/2013/fastgrowing-profitabe-realself-gobbles-zapd-names-kelly-smith-chief-experience-officer/</ref></p><br />
</blockquote><br />
<br />
=== The Email ===<br />
''September 28, 2013''<br />
<blockquote><br />
<p>Zapd has been acquired!</p><br />
<p>The Zapd service will be discontinued on October 7, 2013</p><br />
<br />
<p>Today I wanted to share that Zapd has been acquired by RealSelf. RealSelf is the leading online resource for elective cosmetic medical procedures. As the new Chief Experience Designer, I'll be leveraging everything we learned at Zapd to help build a better mobile engagement experience. The Zapd website and mobile apps will stay up until October 7, 2013 and then will be shutting down.<ref>http://pastie.org/8362982</ref></p><br />
</blockquote><br />
<br />
<br />
== Site structure ==<br />
<br />
* All content is served via JavaScript; with JavaScript disabled you just get an empty template.<br />
* They have a URL shortener, zapd.co.<br />
* The zapd.co URL scheme is #.zapd.co, #[a-z].zapd.co, #[a-z][a-z].zapd.co, or [a-z]#.zapd.co<br />
* Had 350k app downloads in 2011. At worst they have 500,000 users.<br />
<br />
<br />
== References ==<br />
<br />
<references/></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17611Tracker2013-09-30T13:05:34Z<p>DukeNukem: /* People */</p>
<hr />
<div>== General Overview ==<br />
<br />
[[File:Tracker_test_project_overview_screenshot.png|right|thumb|Project admin overview]]<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the central pivot of Archive Team's distributed archiving efforts. It hands out items to be downloaded and keeps track of which are completed. Items can be usernames, subdomains, or full URLs: basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leaderboard interface on http://tracker.archiveteam.org .<br />
<br />
[[File:Xanga_leaderboard.png|right|thumb|A leaderboard]]<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
__TOC__<br />
<br />
== API ==<br />
<br />
This is a sample from the project file (line breaks included for readability):<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": <br />
"<a href='http://tracker.archiveteam.org/streetfiles/'><br />
<img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png'<br />
alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
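As an illustration of how a client might read one such entry, here is a minimal sketch in Python. It parses a trimmed copy of the sample above; the function name <code>summarize_project</code> and the choice of fields are assumptions made for this example and are not taken from the Warrior code. How the full project file wraps its entries is not shown in the sample, so only a single entry is handled.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the sample entry shown above (description, logo and
# marker_html are omitted for brevity).
SAMPLE_ENTRY = """
{
  "name": "streetfiles",
  "title": "Streetfiles",
  "repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",
  "deadline": "2013-04-30T23:59:59Z",
  "host": "streetfiles.org",
  "leaderboard": "http://tracker.archiveteam.org/streetfiles/",
  "lat_lng": [51, 9]
}
"""


def summarize_project(entry):
    """Pick out the fields a client needs to find the grab code (hypothetical helper)."""
    # The sample deadline is an ISO 8601 timestamp in UTC ("Z" suffix).
    deadline = datetime.strptime(entry["deadline"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "name": entry["name"],
        "repository": entry["repository"],
        "leaderboard": entry["leaderboard"],
        "deadline": deadline.replace(tzinfo=timezone.utc),
        "location": tuple(entry["lat_lng"]),
    }


summary = summarize_project(json.loads(SAMPLE_ENTRY))
print(summary["name"], summary["repository"], summary["deadline"].isoformat())
```

The <code>repository</code> field is the part that tells a client where to fetch the grab code; the remaining fields are descriptive.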
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. <br />
<br />
== Monitoring ==<br />
<br />
http://tracker.archiveteam.org has a Munin instance located at http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/.<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app that manages the Warriors and displays the geolocation world map.<br />
<br />
You can also [[Tracker_Setup|set up your own tracker]].<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
* Wiki Admins - SketchCow, winr4r<br />
* warriorhq.archiveteam.org (projects.json) - xmc, Smiley<br />
* Warrior Tracker ssh - alard, underscor, Smiley<br />
* Warrior Tracker web - alard, underscor, omf_, Smiley<br />
* Anarchive server - GLaDOS, omf_, Smiley<br />
* URLTeam Tracker software - GLaDOS, omf_, Smiley<br />
* Github Organization Admins - GLaDOS, omf_, ivan<br />
* IRC twitter bot - swebb<br />
* pad.archivingyoursh.it & paste.archivingyoursh.it - GLaDOS<br />
* Domain registration (archiveteam.org urlte.am) - SketchCow</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=User:DukeNukem&diff=17472User:DukeNukem2013-08-29T06:04:46Z<p>DukeNukem: Blanked the page</p>
<hr />
<div></div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=User:DukeNukem&diff=17470User:DukeNukem2013-08-29T06:04:20Z<p>DukeNukem: </p>
<hr />
<div>{{:Current_Projects}}</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=User:DukeNukem&diff=17469User:DukeNukem2013-08-29T06:01:18Z<p>DukeNukem: Created page with "{{Current}}"</p>
<hr />
<div>{{Current}}</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17441Tracker2013-08-20T09:26:16Z<p>DukeNukem: /* Hardware */</p>
<hr />
<div>== General Overview ==<br />
<br />
[[File:Tracker_test_project_overview_screenshot.png|right|thumb|Project admin overview]]<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the central pivot of Archive Team's distributed archiving efforts. It hands out items to be downloaded and keeps track of which are completed. Items can be usernames, subdomains, or full URLs: basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leaderboard interface on http://tracker.archiveteam.org .<br />
<br />
[[File:Xanga_leaderboard.png|right|thumb|A leaderboard]]<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
__TOC__<br />
<br />
== API ==<br />
<br />
This is a sample from the project file (line breaks included for readability):<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": <br />
"<a href='http://tracker.archiveteam.org/streetfiles/'><br />
<img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png'<br />
alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. <br />
<br />
== Monitoring ==<br />
<br />
http://tracker.archiveteam.org has a Munin instance located at http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/.<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app that manages the Warriors and displays the geolocation world map.<br />
<br />
You can also [[Tracker_Setup|set up your own tracker]].<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
* Wiki Admins - SketchCow, winr4r<br />
* URLTeam Tracker - ersi, xmc, (formerly soultcer)<br />
* Warrior Tracker ssh - alard, underscor, Smiley<br />
* Warrior Tracker web - alard, underscor, omf_, Smiley<br />
* Anarchive server - GLaDOS, omf_, Smiley<br />
* Github Organization Admins - GLaDOS, omf_, ivan<br />
* IRC twitter bot - swebb<br />
* pad.archivingyoursh.it & paste.archivingyoursh.it - GLaDOS<br />
* Domain registration (archiveteam.org urlte.am) - SketchCow</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17385Tracker2013-08-02T05:58:07Z<p>DukeNukem: /* People */</p>
<hr />
<div>= General Overview =<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the central pivot of Archive Team's distributed archiving efforts. It hands out items to be downloaded and keeps track of which are completed. Items can be usernames, subdomains, or full URLs: basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leaderboard interface on http://tracker.archiveteam.org .<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
This is a sample from the project file:<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": "<a href='http://tracker.archiveteam.org/streetfiles/'><img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png' alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. The system is tracked by [http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/index.html Munin].<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app that manages the Warriors and displays the geolocation world map.<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
* Wiki Admins - SketchCow, winr4r<br />
* URLTeam Tracker - ersi, xmc, (formerly soultcer)<br />
* Warrior Tracker ssh - alard, Smiley<br />
* Warrior Tracker web - alard, omf_, Smiley<br />
* Anarchive server - GLaDOS, omf_, Smiley<br />
* Github Organization Admins - GLaDOS, omf_<br />
* IRC twitter bot - swebb<br />
* pad.archivingyoursh.it & paste.archivingyoursh.it - GLaDOS<br />
* Domain registration (archiveteam.org urlte.am) - SketchCow</div>DukeNukemhttps://wiki.archiveteam.org/index.php?title=Tracker&diff=17384Tracker2013-08-02T05:55:03Z<p>DukeNukem: /* Hardware */</p>
<hr />
<div>= General Overview =<br />
<br />
The [https://github.com/ArchiveTeam/universal-tracker Tracker] software is the central pivot of Archive Team's distributed archiving efforts. It hands out items to be downloaded and keeps track of which are completed. Items can be usernames, subdomains, or full URLs: basically any unit we can use to break the site into manageable chunks. The progress of each project can be viewed via the leaderboard interface on http://tracker.archiveteam.org .<br />
<br />
The [[ArchiveTeam Warrior|Warrior]] is the yang to the Tracker's yin. The warriors get the list of current projects from the project file on http://warriorhq.archiveteam.org/ .<br />
<br />
This is a sample from the project file:<br />
<br />
<pre><br />
{<br />
"name": "streetfiles",<br />
"title": "Streetfiles",<br />
"description": "Streetfiles is closing April, 30th, 2013.",<br />
"repository": "https://github.com/ArchiveTeam/streetfiles-grab.git",<br />
"logo": "http://archiveteam.org/images/7/7b/Streetfiles-logo.png",<br />
"marker_html": "<a href='http://tracker.archiveteam.org/streetfiles/'><img src='http://archiveteam.org/images/7/7b/Streetfiles-logo.png' alt='Streetfiles' width='235' height='50' /></a>",<br />
"deadline": "2013-04-30T23:59:59Z",<br />
"host": "streetfiles.org",<br />
"leaderboard": "http://tracker.archiveteam.org/streetfiles/",<br />
"lat_lng": [<br />
51,<br />
9<br />
]<br />
},<br />
</pre><br />
<br />
It shows where to get the grab code and other project information.<br />
<br />
== Hardware ==<br />
The tracker runs on a [http://www.archiveteam.org/index.php?title=Clown_hosting#linode Linode 1 GB] instance operated by [[User:Chronomex|chronomex]]. The system is tracked by [http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/index.html Munin].<br />
<br />
== Software used: ==<br />
<br />
* [https://github.com/ArchiveTeam/universal-tracker Universal Tracker] is a Ruby HTTP application that sends and receives JSON payloads and uses Redis for the data store.<br />
* Redis is an in-memory key-value store.<br />
* [http://debian.org/ Debian] is the Linux distribution the stack is built upon.<br />
* [https://github.com/ArchiveTeam/warrior-hq warrior-hq] is a small Sinatra web app that manages the Warriors and displays the geolocation world map.<br />
<br />
== People ==<br />
These are the volunteers who take care of the different services that form Archive Team and URLTeam.<br />
<br />
* Wiki Admins - SketchCow, winr4r<br />
* URLTeam Tracker - ersi, xmc, (formerly soultcer)<br />
* Warrior Tracker ssh - alard, Smiley<br />
* Warrior Tracker web - alard, omf_, Smiley<br />
* Anarchive server - GLaDOS, omf_, Smiley<br />
* Github Organization Admins - GLaDOS, omf_<br />
* IRC twitter bot - swebb<br />
* pad.archivingyoursh.it & paste.archivingyoursh.it - GLaDOS</div>DukeNukem