Python: Unzip selected files in directory tree

I have the following directory tree. The parent directory contains several folders, say A, B, C and D, and each folder holds many zip files. Each zip name includes the letter of its parent folder along with other information:

-parent--A-xxxAxxxx_timestamp.zip
          -xxxAxxxx_timestamp.zip
          -xxxAxxxx_timestamp.zip
       --B-xxxBxxxx_timestamp.zip
          -xxxBxxxx_timestamp.zip
          -xxxBxxxx_timestamp.zip
       --C-xxxCxxxx_timestamp.zip
          -xxxCxxxx_timestamp.zip
          -xxxCxxxx_timestamp.zip
       --D-xxxDxxxx_timestamp.zip
          -xxxDxxxx_timestamp.zip
          -xxxDxxxx_timestamp.zip

I need to unzip only selected zips in this tree and place them in the same directory with the same name without the .zip extension.

Output:

-parent--A-xxxAxxxx_timestamp
          -xxxAxxxx_timestamp
          -xxxAxxxx_timestamp
       --B-xxxBxxxx_timestamp
          -xxxBxxxx_timestamp
          -xxxBxxxx_timestamp
       --C-xxxCxxxx_timestamp
          -xxxCxxxx_timestamp
          -xxxCxxxx_timestamp
       --D-xxxDxxxx_timestamp
          -xxxDxxxx_timestamp
          -xxxDxxxx_timestamp

My effort:

for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest

    zipfile=os.path.basename(path) #save the zipfile path
    zip_ref=zipfile.ZipFile(path, 'r') 
    zip_ref=extractall(zipfile.replace(r'.zip', '')) #unzip to a folder without the .zip extension

The problem is that I don't know how to capture the A, B, C, D, etc. so that I can include them in the path where the files are unzipped. As a result, the unzipped folders are created in the parent directory. Any ideas?

There are 2 best solutions below

BlueEagle On BEST ANSWER

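The code you have is nearly there. You just need to make sure that you are not overwriting names (your variable zipfile shadows the zipfile module) and that you use the right variables: build the extraction folder from the full path, not from the basename, otherwise everything is extracted relative to the parent directory. The following code works perfectly for me: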

import os
import zipfile
import glob

for path in glob.glob('./*/xxx*xxxx*.zip'):  # match the zip files of interest, one level below the parent
    target = os.path.splitext(path)[0]  # full path of the archive, minus the .zip extension
    with zipfile.ZipFile(path, 'r') as zip_ref:  # the with block closes the archive afterwards
        zip_ref.extractall(target)  # unzip into a folder next to the archive

Anand S Kumar On

Instead of trying to do it in a single statement, it is easier and more readable to first get the list of all folders and then the list of zip files inside each folder. Example:

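import glob
import os.path
import zipfile

for folder in glob.glob("./*/"):  # the trailing slash matches directories only
    letter = os.path.basename(os.path.normpath(folder))  # "./A/" -> "A"
    # Using *.zip to only get zip files
    for path in glob.glob(os.path.join(folder, "*.zip")):
        filename = os.path.basename(path)
        if letter in filename:
            # Extract next to the archive, into a folder named without .zip
            with zipfile.ZipFile(path) as zf:
                zf.extractall(os.path.splitext(path)[0])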
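
For reference, the same folder-then-file idea can be wrapped in a small function using pathlib. This is only a sketch, not something from the question: the function name unzip_selected and the wanted parameter are made up for illustration, and it assumes the layout shown in the question (zip files exactly one level below the parent directory, each named with the letter of its folder).

```python
import zipfile
from pathlib import Path


def unzip_selected(parent, wanted=None):
    """Extract selected zips found one level below `parent`.

    Each archive is extracted into a sibling folder with the same name
    minus the .zip extension. `wanted` is a hypothetical filter: a
    collection of subfolder names (e.g. {"A", "C"}); None means all.
    """
    extracted = []
    for zip_path in sorted(Path(parent).glob("*/*.zip")):
        letter = zip_path.parent.name  # name of the subfolder, e.g. "A"
        if wanted is not None and letter not in wanted:
            continue  # this subfolder was not selected
        if letter not in zip_path.name:
            continue  # file name does not contain the folder letter
        target = zip_path.with_suffix("")  # drops only the trailing .zip
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

Calling unzip_selected(".", wanted={"A", "C"}) from the parent directory would then process only the zips under A and C. Because the target is derived from the full path of each archive, the extracted folders end up next to the zips rather than in the parent directory.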