I want to download all PDF files linked from the page "https://journals.ametsoc.org/view/journals/mwre/131/5/mwre.131.issue-5.xml". I tried several things with `wget`, such as:

```
wget --wait 10 --random-wait --continue https://journals.ametsoc.org/downloadpdf/view/journals/mwre/131/5/1520-0493_2003_131_*.co_2.pdf
```

but I get this message:

```
Warning: wildcards not supported in HTTP.
--2024-03-29 23:01:27--  https://journals.ametsoc.org/downloadpdf/view/journals/mwre/131/5/1520-0493_2003_131_*.co_2.pdf
Resolving journals.ametsoc.org (journals.ametsoc.org)... 54.73.220.207, 52.208.161.60
Connecting to journals.ametsoc.org (journals.ametsoc.org)|54.73.220.207|:443... connected.
HTTP request sent, awaiting response... 500
2024-03-29 23:01:28 ERROR 500: (no description).
```
Is there any way to do that using wget, Python, or any other tool? Thank you in advance.





As far as I can see, you want to scrape an HTML page, so it won't work like a file manager: HTTP has no directory listing to expand a wildcard against, which is why wget warns that wildcards are not supported. Instead, you need to fetch the issue page and extract the PDF links from it, using either the BeautifulSoup or lxml library in Python. The following code uses the lxml library, which should do what you want. It will save the PDFs to the folder where the code is executed:
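A minimal sketch using `requests` plus `lxml`, assuming the PDF links can be found by matching `.pdf` in the page's anchor hrefs (the `/downloadpdf/` pattern comes from the question's URL; verify it against the live page). The `Mozilla/5.0` user agent is an assumption too — publishers often reject the default `python-requests` agent, and some may additionally require cookies or block automated downloads entirely:

```python
import os
import requests
from lxml import html

BASE = "https://journals.ametsoc.org"
ISSUE_URL = BASE + "/view/journals/mwre/131/5/mwre.131.issue-5.xml"


def pdf_links(tree):
    """Return absolute URLs of all links on the page that look like PDFs."""
    links = set()
    for href in tree.xpath("//a/@href"):
        if ".pdf" in href:
            # Relative links need the site prefix prepended.
            links.add(href if href.startswith("http") else BASE + href)
    return sorted(links)


def filename_for(url):
    """Derive a local filename from the download URL, dropping any query string."""
    return os.path.basename(url.split("?")[0])


def download_issue_pdfs():
    session = requests.Session()
    # Assumption: a browser-like user agent; the default one is often rejected.
    session.headers["User-Agent"] = "Mozilla/5.0"
    page = session.get(ISSUE_URL)
    page.raise_for_status()
    tree = html.fromstring(page.content)
    for url in pdf_links(tree):
        resp = session.get(url)
        # Only save responses that are actually PDFs, not error pages.
        if resp.ok and resp.headers.get("Content-Type", "").startswith("application/pdf"):
            with open(filename_for(url), "wb") as f:
                f.write(resp.content)
            print("saved", filename_for(url))
        else:
            print("skipped", url, resp.status_code)


if __name__ == "__main__":
    download_issue_pdfs()
```

If the site serves the PDFs behind a paywall or a JavaScript check, this plain-HTTP approach will return error pages instead of PDFs; the `Content-Type` check above keeps those from being saved as broken `.pdf` files.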