Fileplanet/uploadingfilestoia

From Archiveteam

Latest revision as of 16:19, 17 January 2017

Brainstorming on how we can upload the files to IA items.

Remember not to upload the ftp2 stuff publicly!

x-archive-meta-title: "Fileplanet_Path_with_underscores", without the ftp1/2/3 prefix. For example, "ftp1/102011/Yes_Man_Dynamic_Theme.7z" would become the item "102011_Yes_Man_Dynamic_Theme.7z", and "ftp1/fpnew/patches/prorally2001_v11.exe" would become "fpnew_patches_prorally2001_v11.exe". Special characters that IA does not support in item names will need to be stripped or replaced.
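The naming rule above can be sketched as a small bash function. This is only an illustration: `item_name` is a hypothetical helper, and the whitelist of allowed characters (alphanumerics, underscore, hyphen, dot) is an assumption taken from the comment in the script further down. Note that the script itself prefixes `Fileplanet_` and keeps the ftp1 part, whereas this sketch follows the two examples in the paragraph.

```shell
#!/bin/bash
# Hypothetical helper: derive an IA item name from a Fileplanet path.
item_name() {
    local path="$1"
    path="${path#./}"          # drop a leading ./ if present
    path="${path#ftp[0-9]/}"   # drop the ftp1/, ftp2/ or ftp3/ prefix
    path="${path//\//_}"       # slashes become underscores
    path="${path// /_}"        # spaces become underscores
    # Assumption: keep only alphanumerics, underscore, hyphen and dot.
    printf '%s\n' "$path" | LC_ALL=C tr -cd 'A-Za-z0-9_.\n-'
}

item_name "ftp1/102011/Yes_Man_Dynamic_Theme.7z"
item_name "ftp1/fpnew/patches/prorally2001_v11.exe"
```

Deleting unsupported characters can make two different paths collapse into the same item name, so a real run would probably want to log the mapping and check for collisions.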

x-archive-meta-description: For now, just put the path there, i.e. the item name but with special characters intact. TODO for later: add the real metadata we have. shaqfu has a sqlite db with it; otherwise use the fileinfo item: http://archive.org/details/FileplanetFiles_fileinfo_pages_images

x-archive-meta-date: Not sure what format it supports, but this should be the file's original timestamp.

x-archive-meta-year: See above.
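One possible way to derive both values from a file's modification time is sketched below. It assumes GNU coreutils (where `date -r FILE` reads the file's mtime) and assumes that IA accepts a `YYYY-MM-DD` date, which the notes above leave open. `file_date_headers` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: print the date and year header values for a file,
# based on its modification time. Requires GNU date.
file_date_headers() {
    local file="$1" day
    day=$(date -u -r "$file" +%Y-%m-%d) || return 1
    printf 'x-archive-meta-date:%s\n' "$day"
    printf 'x-archive-meta-year:%s\n' "${day%%-*}"   # text before the first "-"
}

# Demo on a throwaway file with a known timestamp.
demo=$(mktemp)
TZ=UTC touch -d '2009-10-20 12:00:00' "$demo"
file_date_headers "$demo"
rm -f "$demo"
```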

x-archive-meta-subject: gaming;software;gaming software;fileplanet;gamespy;ign;planetnetwork


x-archive-meta-collection: archiveteam-fileplanet (no idea if this would work with s3 upload)

x-archive-meta-mediatype: software (not always correct; maybe check for filename extensions like avi, mov, mp4, etc. and set it to movies for those)
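The extension check suggested above could look roughly like this. `media_type` is a hypothetical helper, and the list contains only the three extensions named in the note; any others would have to be added by hand.

```shell
#!/bin/bash
# Hypothetical helper: choose a mediatype from the filename extension.
media_type() {
    local name
    name=$(printf '%s' "$1" | tr 'A-Z' 'a-z')   # compare case-insensitively
    case "$name" in
        *.avi|*.mov|*.mp4) echo movies ;;
        *)                 echo software ;;
    esac
}

media_type "ftp1/trailers/Intro.AVI"
media_type "ftp1/fpnew/patches/prorally2001_v11.exe"
```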

Remember not to upload the ftp2 stuff publicly!
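A simple guard against accidentally uploading ftp2 could be placed in front of any upload loop. This is a sketch only: `is_public_ok` is a hypothetical name, and it assumes paths are given relative to the directory that contains ftp1, ftp2 and ftp3.

```shell
#!/bin/bash
# Hypothetical guard: succeed only for paths that are not under ftp2.
is_public_ok() {
    local path="${1#./}"   # ignore a leading ./
    case "$path" in
        ftp2|ftp2/*) return 1 ;;
        *)           return 0 ;;
    esac
}

for p in "ftp1/102009/file.zip" "ftp2/private/file.zip"; do
    if is_public_ok "$p"; then
        echo "ok to upload: $p"
    else
        echo "skipping (ftp2): $p"
    fi
done
```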


#!/bin/bash

# Traverse a directory tree and print one s3cmd upload command (with metadata) per file.
# Supply e.g. ftp1/102009/ as the argument; every file below that directory is listed.
# Note: the s3cmd commands are only echoed, not executed.

if [ -z "$1" ]; then
	echo "usage: $0 <directory>" >&2
	exit 1
fi

parentdirectory=$(dirname "$1") # NO trailing slash!
subdirectory=$(basename "$1") # basename also strips a trailing slash
echo "is ${parentdirectory}/${subdirectory} the directory you want to upload?"
read -r # press Enter to continue, Ctrl-C to abort

commonheaders='--add-header x-archive-auto-make-bucket:1 --add-header x-archive-meta-noindex:true --add-header "x-archive-meta-subject: gaming;software;fileplanet;gamespy;ign;planetnetwork" --add-header "x-archive-meta-collection:archiveteam-fileplanet" --add-header "x-archive-meta-mediatype:software"'
# mediatype:software is not always correct, maybe check for filename extensions like avi, mov, mp4 etc. to set it to movies for those?
## nah, underscor said software :)

# unique temp file instead of a fixed, guessable path
tempfile=$(mktemp /tmp/fileplanet_ListOfFiles.XXXXXX)

echo "Generating a list of files to upload"
find "${parentdirectory}/${subdirectory}" -type f > "${tempfile}"

while IFS= read -r file
do
	file=${file#./} # remove a leading ./ only
	echo "Now uploading ${file}"

	# modification time as "YYYY-MM-DD HH:MM" (needs GNU date)
	datetime=$(date -r "${file}" '+%Y-%m-%d %H:%M')
	year=${datetime%%-*}

	dateheader="--add-header x-archive-meta-date:\"${datetime}\""
	yearheader="--add-header x-archive-meta-year:${year}"

	filename=$(basename "${file}")
	title="--add-header x-archive-meta-title:\"Fileplanet Archive: ${filename}\""
	desc="--add-header x-archive-meta-description:\"${filename}, mirrored from its original location in ${file}\""

	# from famicoman
	# IA supports alphanum and _-.
	# spaces and slashes become underscores, every other unsupported character is dropped
	itemname=$(echo "Fileplanet_${file}" | tr ' /' '__' | LC_ALL=C tr -cd 'A-Za-z0-9_.\n-')

	echo "s3cmd ${commonheaders} ${dateheader} ${yearheader} ${title} ${desc} put \"${file}\" s3://${itemname}"
	echo "#################"

done < "${tempfile}"

rm -f "${tempfile}"
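
The script above only prints the s3cmd command. If the command were to be executed directly, header strings containing embedded quotes would not be split the way they look on screen. One way around that, sketched here under assumptions, is to collect the arguments in a bash array. `build_s3cmd_args` and `S3ARGS` are hypothetical names, and only a subset of the headers is shown.

```shell
#!/bin/bash
# Hypothetical helper: fill the global array S3ARGS with s3cmd arguments,
# so that values containing spaces stay single arguments.
build_s3cmd_args() {
    local file="$1" itemname="$2" filename
    filename=$(basename "$file")
    S3ARGS=(
        --add-header "x-archive-auto-make-bucket:1"
        --add-header "x-archive-meta-noindex:true"
        --add-header "x-archive-meta-collection:archiveteam-fileplanet"
        --add-header "x-archive-meta-mediatype:software"
        --add-header "x-archive-meta-title:Fileplanet Archive: ${filename}"
        put "$file" "s3://${itemname}"
    )
}

build_s3cmd_args "ftp1/102009/some file.zip" "Fileplanet_ftp1_102009_some_file.zip"

# Dry run: print the command shell-quoted instead of running it.
printf '%q ' s3cmd "${S3ARGS[@]}"; echo
# To upload for real, after checking the output: s3cmd "${S3ARGS[@]}"
```

Keeping the dry run as the default matches the original script, which never calls s3cmd itself.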