Difference between revisions of "Google Video (Archive)"

From Archiveteam
(docid explained)
(Add link to collection and item.)
 
(741 intermediate revisions by 77 users not shown)
Line 1: Line 1:
:''See also [[Google Video Warroom]].''
{{Infobox project
{{Infobox project
| title = Google Video
| title = Google Video
| image = Video logo lg.gif
| image = Video logo lg.gif
| description =  
| description = Google Video logo
| URL = http://video.google.com
| URL = http://video.google.com
| project_status = {{closing}} in 2011-04-29[http://video.google.com/support/bin/answer.py?answer=1233300&hl=en]
| project_status = {{offline}} on 2011-04-29[http://video.google.com/support/bin/answer.py?answer=1233300&hl=en]
| archiving_status = {{inprogress}}
| archiving_status = {{saved}}
| irc = googlegrape
| irc_network = EFnet
| irc_abandoned = true
| data = {{IA item|google-video-metadata-dumpage}} <br> {{IA collection|googlevideo2011}} (access restricted)
}}
}}
 
[[File:Papua videos.png|thumb|right|300px|Google Video results for "Papua New Guinea" keyword.]]
__NOTOC__
'''Google Video''' is a [[Video hostings|video sharing]] website which is shutting down.
'''Google Video''' is a [[Video hostings|video sharing]] website which is shutting down.


If you want to '''save your own videos''', see the announcement and tools below.  
If you want to '''save your own videos''', see the announcement and tools below.  


If you want to '''help archive Google Video''', get some Linux machines running and join us in [[IRC]] (EFNet #archiveteam / #googlegrape)
If you want to '''help archive Google Video''', get some machines running and join us in [[IRC]].


== Joining the archival effort ==
== Joining the archival effort ==
The automatic scripts only work on Linux and maybe OS X. They also seem to work fine in Cygwin. Alternatively, you can run *nix in a virtual machine (given you have a fast enough machine).  
The automatic scripts only work on FreeBSD, Linux, Solaris, Windows and maybe OS X. They also seem to work fine in Cygwin. Alternatively, you can run *nix in a virtual machine (given you have a fast enough machine).


* Download [http://www.textfiles.com/videoyahoo/SCRIPTS/youtube-dl youtube-dl] or from your distribution.
Anyone can help out, but we would *really* appreciate it if you'd use an *NIX system over any thoughts of doing it on a Windows system. If you however choose to pursue the Magical World of Windows - please make sure that what you are collecting is not damaged as a consequence of running it on a Windows system.  
* Download [http://199.48.254.90/at/googlegargle googlegargle]
* Get aria2 (from your distribution or [http://aria2.sourceforge.net/ SourceForge])
* Pick a seed list from below, save it under the filename "list" and add your name to the list (you will need a wiki account)
* Change the first few lines of the googlegargle script to reflect your installation
** If you're using youtube-dl from your distro, run "sudo updatedb; locate youtube-dl" to find the location of the command. Change DLSCRIPT to this.
* For older aria versions, some options need to be removed (--max-connection-per-server=16 --min-split-size=1M)
** You  might need to upgrade your version from your system package manager, however the most recent version still may not suffice.
* Invoke googlegargle


Join the IRC channel to coordinate!
In any case, the first thing to do is add your name/nickname to [http://piratepad.net/gv-participants this list], along with the storage and bandwidth you have available.


=== What can I do? ===


=== Cherry picking ===
The two main tasks are: indexing and downloading. The easiest and least taxing is indexing (see [[Google Video Warroom#Indexing Videos To Identify Related Videos]]).  If you have some extra bandwidth and space think about running [[Google Video Warroom#Downloading Videos Via Related Video Metadata (aka Listerine)|Listerine]] to download videos.  Both of these tasks are automated and can be left running in the background. It is often good practice to start a few processes of each at once.
The seed files currently do not include all videos, so you might want to save precious videos explicitly. To do that, add IDs (docid URL parameter of the Google Video) to the "list" file in the same directory as the script:
  docid=1545969803753962248
docid=1598207563000425446
docid=-1679753730105404298
and start ./googlegargle
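
A minimal sketch of how a docid line could be added to the "list" file from a video URL, assuming the URL carries the ID in a docid query parameter (possibly negative, as in the examples above). The function names docid_line and add_docid are hypothetical helpers for illustration and are not part of googlegargle.

```shell
#!/bin/sh
# Hypothetical helpers, not part of googlegargle.

# Print "docid=<id>" for a URL that contains a docid query parameter.
# Prints nothing if no docid parameter is found.
docid_line() {
    printf '%s\n' "$1" |
        sed -n 's/.*[?&]docid=\(-\{0,1\}[0-9][0-9]*\).*/docid=\1/p'
}

# Append the docid line to ./list unless that exact line is already present.
add_docid() {
    line=$(docid_line "$1")
    [ -n "$line" ] || return 1
    touch list
    grep -qxF -e "$line" list || printf '%s\n' "$line" >> list
}
```

After sourcing the functions in the script's directory, calling add_docid with a video URL appends one line to "list" and skips IDs that are already there, so the same video is not queued twice.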


Please add all docids to the [http://piratepad.net/TL7KDN8821 cherry pick list], so that others won't download those videos for the second time!
== FAQ ==
Caution: the cherry pick list is a list of videos you have downloaded, not a list of [http://piratepad.net/gvspecificrequests videos you'd like to have archived]. Use that second list to mark videos you'd like to see saved at a higher than average priority!
* Is there any estimate on how many videos are on Google Video?
 
** Wikipedia said it has 2,500,000 videos, a semi-official Google blog mentioned 2.8M
== Seed List Downloads ==
http://199.48.254.90/at/seeds/


* seed_videos_2_a
* Is there anything about grabbing metadata for vids? like descriptions?
* seed_videos_2_k
** Googlegrape does that, it saves the html of the video download page
* seed_videos_2_l
* seed_videos_2_m
* seed_videos_2_o
* seed_videos_2_p
* seed_videos_2_q
* seed_videos_2_t
* seed_videos_2_u
* seed_videos_2_w
* seed_videos_2_x
* seed_videos_2_y
* seed_videos_2_z
* seed_videos_a dr_sweety
* seed_videos_a_related dr_sweety
* seed_videos_b bjwebb
* seed_videos_c
* seed_videos_d nomduclav
* seed_videos_e nomduclav
* seed_videos_f doublej
* seed_videos_g
* seed_videos_h ARc[Clone
* seed_videos_i DeCarabas
* seed_videos_j joethehum
* seed_videos_k aggroskater
* seed_videos_l yipdw
* seed_videos_m TJ__
* seed_videos_n ndurner
* seed_videos_o
* seed_videos_p Pneu
* seed_videos_q nomduclavier
* seed_videos_r Pentium
* seed_videos_s Pentium
* seed_videos_t joethehum
* seed_videos_u greg-g
* seed_videos_v masterme1
* seed_videos_w com_lab
* seed_videos_x Dark-Star
* seed_videos_y beremat
* seed_videos_z ksh


== Tools ==
* What happens to the data after you claim a seed on the wiki and download it?
** We've got 140TB of space allocated to us on archive.org, and can get more


=== Youtube-DL ===
* Is there already some space where it can be uploaded to?
* http://rg3.github.com/youtube-dl/download.html
** Not yet, the effort is still young and things take time to organize.
** python youtube-dl googlevideourl


=== DocID scripts ===
* How can I split seed files if I want to download fewer videos or share the task with others?
* http://piratepad.net/googlevideoscript
** On *nix machines use: '''split --lines=500 [seedfile] [seedfile]''' to create a set of files each 500 lines in length in the form '''seedfile'''aa '''seedfile'''ab ... etc.
 
=== GoogleGargle ===
* http://www.textfiles.com/googlegargle
 
=== Aria2c ===
* apt-add-repository ppa:t-tujikawa/ppa
* apt-get update
* apt-get install aria2
** http://aria2.sourceforge.net/
 
== FAQ ==
* is there any estimate on how many videos are on google video?
wikipedia said it has 2,500,000 videos, a semi-official google blog mentioned 2.8M


* is there anything about grabbing metadata for vids? like descriptions?
* How can I check if there are duplicates in a seed file?
googlegrape does that, it saves the html of the video download page
** On *nix machines use: '''sort [infile] | uniq -d''' to show all duplicates.


* what happens to the data after you claim a seed on the wiki and download it?
* How can I remove duplicates from a seed file before I start to use it?
We've got 100TB of space allocated to us on archive.org, and can get more
** On *nix machines use: '''sort [infile] | uniq > [outfile]''' to produce a new seed file with duplicates removed.


* is there already some space where it can be uploaded to?
* If I wanted to run more than one listerine process, do I just make multiple clones? Do I need a different username for each?
not yet
** Only if you need to be able to differentiate later on, like we'll say, we need video 123 from "xentac3"


== Announcement: Uploaded video content no longer available  ==
== Announcement: Uploaded video content no longer available  ==
Line 126: Line 67:


== External links ==
== External links ==
* [http://video.google.com Google Video]
* {{url|1=http://video.google.com|2=Google Video}}
* [http://www.deaddyingdamned.com/assets/Google_Video_Shutdown_Email.html Announcement email]
* {{url|1=http://www.deaddyingdamned.com/assets/Google_Video_Shutdown_Email.html|2=Announcement email}}
* [http://video.google.com/support/bin/answer.py?answer=1233300&hl=en Announcement on Google Video Help]
* {{url|1=http://video.google.com/support/bin/answer.py?answer=1233300&hl=en|2=Announcement on Google Video Help}}


{{Navigation box}}
{{Navigation box}}

Latest revision as of 23:36, 22 November 2021

See also Google Video Warroom.
Google Video
Google Video logo
URL http://video.google.com
Status Offline on 2011-04-29[1]
Archiving status Saved!
Archiving type Unknown
IRC channel #archiveteam-bs (on hackint)
(formerly #googlegrape (on EFnet))
Data[how to use] google-video-metadata-dumpage
googlevideo2011 (access restricted)
Google Video results for "Papua New Guinea" keyword.

Google Video is a video sharing website which is shutting down.

If you want to save your own videos, see the announcement and tools below.

If you want to help archive Google Video, get some machines running and join us in IRC.

Joining the archival effort

The automatic scripts only work on FreeBSD, Linux, Solaris, Windows and maybe OS X. They also seem to work fine in Cygwin. Alternatively, you can run *nix in a virtual machine (given you have a fast enough machine).

Anyone can help out, but we would *really* appreciate it if you'd use an *NIX system over any thoughts of doing it on a Windows system. If you however choose to pursue the Magical World of Windows - please make sure that what you are collecting is not damaged as a consequence of running it on a Windows system.

In any case, the first thing to do is add your name/nickname to this list, along with the storage and bandwidth you have available.

What can I do?

The two main tasks are: indexing and downloading. The easiest and least taxing is indexing (see Google Video Warroom#Indexing Videos To Identify Related Videos). If you have some extra bandwidth and space think about running Listerine to download videos. Both of these tasks are automated and can be left running in the background. It is often good practice to start a few processes of each at once.

FAQ

  • Is there any estimate on how many videos are on Google Video?
    • Wikipedia said it has 2,500,000 videos, a semi-official Google blog mentioned 2.8M
  • Is there anything about grabbing metadata for vids? like descriptions?
    • Googlegrape does that, it saves the html of the video download page
  • What happens to the data after you claim a seed on the wiki and download it?
    • We've got 140TB of space allocated to us on archive.org, and can get more
  • Is there already some space where it can be uploaded to?
    • Not yet, the effort is still young and things take time to organize.
  • How can I split seed files if I want to download fewer videos or share the task with others?
    • On *nix machines use: split --lines=500 [seedfile] [seedfile] to create a set of files each 500 lines in length in the form seedfileaa seedfileab ... etc.
  • How can I check if there are duplicates in a seed file?
    • On *nix machines use: sort [infile] | uniq -d to show all duplicates.
  • How can I remove duplicates from a seed file before I start to use it?
    • On *nix machines use: sort [infile] | uniq > [outfile] to produce a new seed file with duplicates removed.
  • If I wanted to run more than one listerine process, do I just make multiple clones? Do I need a different username for each?
    • Only if you need to be able to differentiate later on, like we'll say, we need video 123 from "xentac3"
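
The seed-file questions above can be sketched as three small shell functions. This is an illustration under assumptions: the function names are hypothetical, and the file names passed in are placeholders rather than real seed lists. The sketch uses sort -u, which is equivalent to sort | uniq and keeps one copy of every ID; uniq -u, by contrast, would drop every ID that occurs more than once.

```shell
#!/bin/sh
# Hypothetical helpers for working with seed files (one ID per line).

# Show each line that occurs more than once, printed once per distinct line.
show_duplicates() {
    sort "$1" | uniq -d
}

# Write a sorted copy of $1 to $2 with exactly one copy of each line.
dedupe_seed() {
    sort -u "$1" > "$2"
}

# Split $1 into chunks of 500 lines named <prefix>aa, <prefix>ab, ...
split_seed() {
    split -l 500 "$1" "$2"
}
```

Deduplicating first and splitting afterwards avoids handing the same ID to two different downloaders. Note that sorting changes the order of the IDs, which should not matter if each line is processed independently.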

Announcement: Uploaded video content no longer available

On April 29, 2011 videos that have been uploaded to Google Video will no longer be available for playback. We’ve added a Download button to the Video Status page, so you can download videos that you want to save. If you don’t want to download your videos, you don’t need to do anything. (The Download feature will be disabled after May 13, 2011.)

How do I download videos that I've uploaded?

On the Video Status page, click Download Video located on the right side of each of your videos in the "Actions" column. Once a video has been downloaded, an "Already Downloaded" message will appear. If you have many videos on Google Video, you may need to use the paging controls located on the bottom right of the page to access them all. This download option will be available through May 13, 2011.

I've downloaded my videos. Now what do I do with these FLV files?

FLV files are videos that have been encoded in the Flash Video Format. You can upload your videos in FLV format to other video hosting sites like YouTube or Picasa Web Albums. If you would like to play back your videos on your computer and they don’t seem to be working, you might need to install an FLV player. In order to find an FLV player to install, try doing a Google search for [ FLV player ].

External links