Difference between revisions of "Archive.today"

From Archiveteam
(added link to list of all domains)
m (added information about CPU issues, and 'phở' comment)
(24 intermediate revisions by 7 users not shown)
Line 8: Line 8:
}}
}}


Archive.is is an on-demand archiving site, similar to [[WebCite]]. One key difference is that it stores "Web 2.0" pages better than WebCite; it also supports zip downloads of individual webpages. It does not store PDFs, binary files, Adobe Flash content, videos, or sounds. The maximum size of a webpage it will archive (including images) is 50MB. As of Dec. 2012, the website has archived about [http://blog.archive.is/post/38139265209/what-will-happen-to-the-data-when-you-shut-the-site 10 TB of data.]
Archive.is is a privately funded on-demand archiving site, similar to [[WebCite]]. One key difference is that it stores "Web 2.0" pages better than WebCite; it also supports zip downloads of entire individual webpages and takes a screenshot of the webpage. It does not store PDFs, binary files, Adobe Flash content, videos, or sounds. The maximum size of a webpage it will archive (including images) is 50MB. Additionally, Archive.is forwards your IP address to the submitted website in an X-Forwarded-For header.<ref>http://blog.archive.is/post/111779719291/do-you-preserve-archivers-privacy-e-g-not</ref>
 
The website rose sharply in popularity in the second half of 2014, primarily due to the GamerGate controversy. As of Feb. 2015, the website had archived about [http://blog.archive.is/post/111780063961/how-much-storage-is-archive-today-using-currently 200 "Tb" of data.] ''The figure probably means 200 terabytes ('''TB'''), not terabits ('''Tb''') as quoted. If terabits really were meant, 200 Tb ≈ 25 TB.''
 
Adding to the confusion, the site's weekly growth is [http://blog.archive.is/post/130682816686/you-mentioned-theres-no-hot-backup-as-of-yet apparently "5Tb"].
 
On April 14, 2014, Archive.is changed its name to Archive.today due to attacks against [http://www.isnic.is/en/ ISNIC]<ref>http://blog.archive.is/post/82775187091/curious-why-the-move-in-domain-names-from-archive-is</ref><ref>https://twitter.com/archiveis/status/455710701948903424</ref>, and then changed its name back to the original Archive.is some time later.
 
== Vital Signs ==
 
Note that the site is a commercial enterprise, and as such can go kaputt at any given point, especially if it does not find a lucrative business model. In October 2016 the site [http://blog.archive.is/post/151979921861/how-are-you-paying-for-the-servers-are-you-just "made transparent"] the [http://blog.archive.is/post/151510917631/how-do-you-guys-keep-the-lights-on-i-gave-the server costs] and started to accept donations, although this is not a strong indication of long-term issues. A weekly crowdfunded target of $800<ref>https://liberapay.com/archiveis/donate</ref> is set to maintain the site.
 
Prior to this, the site actively refused donations. A donation link took the user to an animal shelter donation page<ref>http://web.archive.org/web/20160808113809/https://archive.is/</ref>.
 
In January 2017, responding to a question about censorship, the administrator commented that the site had [http://blog.archive.is/post/155523285411/have-your-servers-really-run-out-space-or-are-you "just run out of CPU for the browsers."] With pages failing to capture, it is unclear whether this is a temporary issue.
 




== Funding ==
<blockquote>
It is privately funded, there in no complex finance behind it. It may look more or less reliable compared to the startup-style funding or an univercity project, depending on which risks are taken into account. My death can cause interruption of service, but something like new market condition or changing head of a department can not.</blockquote>
As of October 2016 the site has a 'liberapay'<ref>https://liberapay.com/archiveis/donate</ref> donation link at the top-right corner of the page.
In January 2017 the administrator stated that, through donations, the site receives only [http://blog.archive.is/post/154860178511/how-you-make-money "more than $1.50 every day, enough for a bowl of phở".]
==Site structure==
A list of all domains currently archived is available [http://archive.is/alldomains here].
[https://archive.org/download/archive.is-alldomains-20140220/archive.is_domains_20140220.txt.7z List of all domains] from [http://archive.is/alldomains archive.is/alldomains] (as of 2014/02/20) = 7,255,826 domains
Sadly, the URL counts from /alldomains are out of date.
[https://archive.org/download/archive.is-alldomains-20140220/archive.is_sitemaps_20140217.7z All sitemaps] (as of 2014/02/17)
As a side note, the [http://blog.archive.is/post/117445434661/would-you-consider-handing-over-all-the-captured administrator is unsupportive] of [[Internet Archive]]'s [[robots.txt]] policy, which could hinder future backup cooperation.
== Issues ==
As of 17 February 2016, the archive.today domain name has been unavailable since 16 February, likely due to [http://blog.archive.is/post/138982909006/domain-problems-again "fake DMCA requests"] ([https://web.archive.org/web/20160217044321/http://blog.archive.is/post/138982909006/domain-problems-again copy 1], [https://archive.is/zrsVn copy 2]), [https://twitter.com/archiveis/status/698708729999552512].
== Archives ==
[https://archive.org/details/archive.is-alldomains-20140220 /alldomains Archive]
== References ==
<references />
{{Navigation box}}

Revision as of 18:25, 9 January 2017

Archive.is
URL archive.is
Status Online!
Archiving status Not saved yet
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)

Archive.is is a privately funded on-demand archiving site, similar to WebCite. One key difference is that it stores "Web 2.0" pages better than WebCite; it also supports zip downloads of entire individual webpages and takes a screenshot of the webpage. It does not store PDFs, binary files, Adobe Flash content, videos, or sounds. The maximum size of a webpage it will archive (including images) is 50MB. Additionally, Archive.is forwards your IP address to the submitted website in an X-Forwarded-For header.[1]
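
To illustrate what the forwarded header exposes, here is a minimal sketch of how a site receiving an archiving request might read the submitter's address. It assumes the common convention that X-Forwarded-For carries a comma-separated list with the original client first; the function name and the sample addresses are hypothetical, and nothing here is taken from Archive.is's own code.

```python
from typing import Mapping, Optional


def original_client_ip(headers: Mapping[str, str]) -> Optional[str]:
    """Return the first address listed in X-Forwarded-For, if any.

    Header names are matched case-insensitively. The value is assumed to
    be a comma-separated list whose first entry is the original client.
    """
    for name, value in headers.items():
        if name.lower() == "x-forwarded-for":
            first = value.split(",")[0].strip()
            return first or None
    return None


# Hypothetical request headers as seen by the archived site.
sample = {"X-Forwarded-For": "203.0.113.7, 198.51.100.2"}
print(original_client_ip(sample))
```

The header is set by the sender and can be forged, so a receiving site can treat it only as a hint about who requested the capture.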

The website rose sharply in popularity in the second half of 2014, primarily due to the GamerGate controversy. As of Feb. 2015, the website had archived about 200 "Tb" of data. The figure probably means 200 terabytes (TB), not terabits (Tb) as quoted. If terabits really were meant, 200 Tb ≈ 25 TB.

Adding to the confusion, the site's weekly growth is apparently "5Tb".
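
The bit/byte ambiguity in the quoted "200 Tb" and "5 Tb" figures can be worked through with a small sketch. It assumes that both units use the same 10^12 prefix and so differ only by the factor of 8 bits per byte; the function name is illustrative.

```python
BITS_PER_BYTE = 8


def terabits_to_terabytes(terabits: float) -> float:
    """Convert terabits (Tb) to terabytes (TB), assuming the same 10**12 prefix."""
    return terabits / BITS_PER_BYTE


# Quoted total and weekly growth, read literally as terabits.
print(terabits_to_terabytes(200))  # total size in TB if "Tb" means terabits
print(terabits_to_terabytes(5))    # weekly growth in TB if "Tb" means terabits
```

Read as terabits, the total comes to 25 TB and the weekly growth to 0.625 TB; read as terabytes, the figures stand as quoted at 200 TB and 5 TB.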

On April 14, 2014, Archive.is changed its name to Archive.today due to attacks against ISNIC[2][3], and then changed its name back to the original Archive.is some time later.

Vital Signs

Note that the site is a commercial enterprise, and as such can go kaputt at any given point, especially if it does not find a lucrative business model. In October 2016 the site "made transparent" the server costs and started to accept donations, although this is not a strong indication of long-term issues. A weekly crowdfunded target of $800[4] is set to maintain the site.

Prior to this, the site actively refused donations. A donation link took the user to an animal shelter donation page[5].

In January 2017, responding to a question about censorship, the administrator commented that the site had "just run out of CPU for the browsers." With pages failing to capture, it is unclear whether this is a temporary issue.


Funding

According to their FAQ:

It is privately funded, there in no complex finance behind it. It may look more or less reliable compared to the startup-style funding or an univercity project, depending on which risks are taken into account. My death can cause interruption of service, but something like new market condition or changing head of a department can not.

As of October 2016 the site has a 'liberapay'[6] donation link at the top-right corner of the page.

In January 2017 the administrator stated that, through donations, the site receives only "more than $1.50 every day, enough for a bowl of phở".

Site structure

A list of all domains currently archived is available here.

List of all domains from archive.is/alldomains (as of 2014/02/20) = 7,255,826 domains

Sadly, the URL counts from /alldomains are out of date.

All sitemaps (as of 2014/02/17)

As a side note, the administrator is unsupportive of Internet Archive's robots.txt policy, which could hinder future backup cooperation.

Issues

As of 17 February 2016, the archive.today domain name has been unavailable since 16 February, likely due to "fake DMCA requests" (copy 1, copy 2), [1].

Archives

/alldomains Archive

References