How to use wget to download from hosting sites?

Some sites don’t make advanced checks and can be tricked easily: tell wget to pretend it’s really Mozilla and that the request is coming from the download page.

wget --user-agent='Mozilla/5.0 (Windows NT 6.0) Gecko/20100101 Firefox/14.0.1' \
     --referer=http://downloadsite.example.com/download-page-url \
     http://example.com/download/filename.ext

Most sites that do check will let you get away with just --user-agent=Mozilla and --referer set to the URL of the file you’re downloading.
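As a sketch, the minimal form of that invocation looks like this (the URL is a placeholder; the script just builds and prints the command so you can see the flags):

```shell
#!/bin/sh
# Hypothetical file URL — substitute the real download link.
FILE_URL='http://example.com/download/filename.ext'

# Many servers only check that a User-Agent and Referer are present at all,
# so a bare "Mozilla" and the file's own URL as referer are often enough.
CMD="wget --user-agent=Mozilla --referer=$FILE_URL $FILE_URL"
echo "$CMD"
```

Run the echoed command once you’ve checked it; if the server still refuses, fall back to the full browser-like user-agent string above.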

I also came across a method for recursively downloading from web servers’ directory listings (the “Index of /…” pages).

Solution

wget -r -np -nH --cut-dirs=3 -R index.html http://hostname/aaa/bbb/ccc/ddd/

Explanation:

It downloads all files and subfolders of the ddd directory:

- recursively (-r),
- without ascending to parent directories such as ccc/… (-np),
- without creating a hostname folder (-nH),
- saving into ddd by omitting the first 3 path components aaa, bbb, ccc (--cut-dirs=3),
- excluding index.html files (-R index.html).
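To make the -nH/--cut-dirs interaction concrete, here is a small sketch (paths are hypothetical) that mimics where a file from the listing ends up locally — -nH drops the hostname directory and --cut-dirs=3 drops the first three path components:

```shell
#!/bin/sh
# Remote file: http://hostname/aaa/bbb/ccc/ddd/file.txt
# After -nH the hostname directory is gone, leaving this relative path:
REMOTE_PATH='aaa/bbb/ccc/ddd/file.txt'

# --cut-dirs=3 strips the first 3 components (aaa, bbb, ccc),
# so only ddd/file.txt is created under the current directory.
LOCAL_PATH=$(echo "$REMOTE_PATH" | cut -d/ -f4-)
echo "$LOCAL_PATH"   # ddd/file.txt
```

Without --cut-dirs you would get the whole aaa/bbb/ccc/ddd/ tree mirrored locally, which is rarely what you want for a single deep directory.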
Reference: http://bmwieczorek.wordpress.com/2008/10/01/wget-recursively-download-all-files-from-certain-directory-listed-by-apache/
