How to bulk download files from the Internet Archive


I checked the Internet Archive's own site, which lists a few steps to follow, including using the wget utility under Cygwin on Windows. I followed those steps: I ran an advanced search, exported the results as a CSV file, converted it to .txt, and then tried to run the following command:

wget -r -H -nc -np -nH --cut-dirs=1 -A .pdf,.epub -e robots=off -l1 -i ./itemlist.txt -B 'http://archive.org/download/

The terminal emulator then hangs, and no log message or error message appears to indicate any progress. I want to know what I have done wrong so far.


There are 2 best solutions below

0
MyDoom On

After some time I figured out how to resolve this. The commands posted on the Internet Archive help blog are general examples meant to show how the wget utility is used; the options needed here are simply the following:

--cut-dirs=1
-A .pdf,.epub
-e robots=off
-i ./itemlist.txt

and, of course, the base URL:

-B 'http://archive.org/download/'
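
Put together, the options listed above might be assembled as in the following dry-run sketch. It only builds and prints the command rather than running it. The variable names ITEMLIST, BASE_URL and WGET_CMD are made up for illustration, and the sketch assumes itemlist.txt holds one item identifier per line. Whether these options alone are enough is taken from this answer and has not been verified here.

```shell
# Dry-run sketch: build the wget command from the options listed above.
# Assumption: itemlist.txt holds one Internet Archive item identifier per line.
ITEMLIST=./itemlist.txt
BASE_URL='http://archive.org/download/'

# --cut-dirs=1    drop the leading "download/" directory from saved paths
# -A .pdf,.epub   keep only files with these suffixes
# -e robots=off   ignore robots.txt
# -i / -B         read identifiers from the list, resolved against BASE_URL
WGET_CMD="wget --cut-dirs=1 -A .pdf,.epub -e robots=off -i $ITEMLIST -B $BASE_URL"

# Print the command for review; copy and paste it into the shell to run it.
echo "$WGET_CMD"
```

Printing the command first makes it easy to spot a mistyped flag or an unbalanced quote before anything is downloaded.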
0
scruss On

The ia command-line tool is the official way to do this. If you can craft a search term that captures all your items, you can have ia download everything that matches.

For example:

ia download --search 'creator:Hamilton Commodore User Group'

will download all of the items attributed to this (now defunct) computer user group. This is a live, working query that downloads roughly 8.6 MB of data for 40 Commodore 64 disk images.

It will also download from an itemlist, as above.
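
As a rough sketch of the itemlist case, assuming ia is installed and that itemlist.txt (the file name from the question) holds one item identifier per line, something like the following could be used. The ITEMLIST and GLOB variables are illustrative only, and the --glob filter is an optional addition that is not part of the answer above.

```shell
# Sketch: preview what ia would download for every item in the list.
# Assumptions: ia is on the PATH and itemlist.txt has one identifier per line.
ITEMLIST=./itemlist.txt
GLOB='*.pdf'

if command -v ia >/dev/null 2>&1 && [ -f "$ITEMLIST" ]; then
  # --dry-run prints the URLs that would be fetched instead of downloading them
  ia download --itemlist "$ITEMLIST" --glob "$GLOB" --dry-run
else
  echo "ia or $ITEMLIST not found; nothing to do"
fi
```

Dropping --dry-run starts the actual download, and dropping --glob fetches every file in each item.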