Difference between revisions of "FTP"


Revision as of 10:40, 15 June 2014

FTP
Status Online!
Archiving status Not saved yet
Archiving type Unknown
Project source https://github.com/ArchiveTeam/ftp-nab
IRC channel #effteepee (on hackint)

Archiving a whole public FTP host/mirror is easy:

SketchCow> I use wget -r -l 0 -np -nc ftp://ftp.underscorporn.com
tar cvf 2014.01.ftp.underscorporn.com.tar ftp.underscorporn.com
tar tvf 2014.01.ftp.underscorporn.com.tar > 2014.01.ftp.underscorporn.com.tar.txt

Or put this handy function in your .bashrc file. You can also remove the first and last lines to turn it into a standalone bash script. Made by SN4T14.

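ftp-grab(){
    target="$1"
    # Mirror the whole site: unlimited depth, never ascend to the parent, skip files already downloaded
    wget -r -l 0 -np -nc "$target"
    # wget saves into a directory named after the host, so strip the ftp:// prefix and any path
    if [[ "$target" =~ ^ftp://.*$ ]]
        then
        target="$(echo "$target" | cut -d '/' -f 3)"
        echo "ftp"
        echo "$target"
    fi
    # Pack the mirror, then save a listing of the archive next to it
    tar cvf "$(date +%Y).$(date +%m).$target.tar" "$target"
    tar tvf "$(date +%Y).$(date +%m).$target.tar" > "$(date +%Y).$(date +%m).$target.tar.txt"
}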
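
The naming step in the function above can be sketched on its own. This is only an illustration: ftp_archive_name is a hypothetical helper that is not part of SN4T14's function, and it uses shell parameter expansion instead of echo and cut to pull the host out of the URL.

```shell
# Hypothetical helper (not from the original page): print the tar name
# that a grab of the given URL would produce, e.g. 2014.01.host.tar.
# Usage: ftp_archive_name URL [YYYY.MM]
ftp_archive_name() {
    local url="$1"
    local stamp="${2:-$(date +%Y.%m)}"  # default to the current year.month
    local host="${url#ftp://}"          # drop the scheme if present
    host="${host%%/*}"                  # drop any path after the host
    printf '%s.%s.tar\n' "$stamp" "$host"
}
```

For example, ftp_archive_name ftp://ftp.example.com/pub/ 2014.01 prints 2014.01.ftp.example.com.tar (ftp.example.com is a placeholder host).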

Check the size of the site before you start, to make sure you have enough space to hold both the site and the tar afterwards. If you use tar --remove-files, still account for large files on the site, because each file is only deleted after it has been added to the tar.

lftp ftp://site.com -e 'du -h'
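
Once you know the site's size, the comparison against free disk space might look like the rough sketch below. It assumes the uncompressed tar is about as large as the site itself, so twice the site's size is needed when the mirror and the tar sit side by side. has_room and free_kb are hypothetical names, and the site size in KiB is something you read off the du output yourself.

```shell
# Hypothetical helpers (not from the original page).

# free_kb: free space, in KiB, on the filesystem holding the current directory.
free_kb() {
    df -Pk . | awk 'NR==2 {print $4}'
}

# has_room SITE_KB FREE_KB: succeed if FREE_KB covers the mirror plus its tar.
# Assumption: the tar is roughly the same size as the mirror (2x in total).
has_room() {
    local site_kb="$1"
    local free="$2"
    [ "$free" -ge $(( site_kb * 2 )) ]
}
```

For a site of about 1 GiB you could run: has_room 1048576 "$(free_kb)" && echo "enough space"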

Now zip/tar it up and send it to the spacious Internet Archive (https://archive.org/details/ftpsites)! (If you're short on space: tar --remove-files deletes each file shortly after adding it to the tar instead of waiting for the archive to be complete, unlike zip -rm.)
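
A minimal sketch of the space-saving variant, assuming GNU tar (--remove-files is a GNU extension and may be missing from other tar implementations). pack_and_remove is a hypothetical helper, and the file naming simply follows the year.month.host.tar pattern used above.

```shell
# Hypothetical helper (not from the original page): tar up a mirrored
# directory, deleting each file once it is in the archive, then write
# a listing of the archive next to it.
# Assumption: GNU tar, which provides --remove-files.
pack_and_remove() {
    local dir="$1"
    local stamp
    stamp="$(date +%Y.%m)"
    tar --remove-files -cf "$stamp.$dir.tar" "$dir" &&
        tar -tvf "$stamp.$dir.tar" > "$stamp.$dir.tar.txt"
}
```

Run it from the directory that contains the mirror, for example: pack_and_remove ftp.example.com (a placeholder host name). The files cannot be recovered from disk afterwards, so check the listing before uploading.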

The Project

Who is grabbing what?
Midas ftp.tu-chemnitz.de
Midas ftp.uni-muenster.de
Midas gatekeeper.dec.com
Midas ftp.uni-erlangen.de
Midas ftp.warwick.ac.uk

University FTPs are massive, so currently only DEC and Sweex are being grabbed.

External Links