Unzip Password Protected File in R using WinZip


I am trying to use R to unzip password-protected files from a drive without using 7-Zip. My organisation doesn't have access to it; we use WinZip for everything.

I have searched far and wide here but cannot find a post that satisfies the question.

I have a zipped file that contains a single XML file. I need to automate the collation of this data, and my plan is to unzip the file and then read it. I have found the following posts, but none of them shows me what I need to do:

Using unzip does not support passwords - unzip a .zip file

e.g. unzip("file.xml.zip") produces Warning message: In unzip(zipfile = "file.xml.zip") : zip file is corrupt

And the file is not corrupt as I can manually unzip it fine afterwards.

Using 7-Zip (I can't access this) - Unzip a password protected file with Powershell

Reading without unzipping (I get "error reading from the connection") - Extract files from password protected zip folder in R

read_xml(unz("file.xml", "file.xml.zip")) produces Error in open.connection(x, "rb") : cannot open the connection In addition: Warning message: In open.connection(x, "rb") : cannot open zip file 'file.xml'

I have tried looking at Expand-Archive in PowerShell and trying to call that through R but am not having much luck, please someone help me!

With PowerShell I use Expand-Archive -Path 'file' which produces: Exception calling "ExtractToFile" with "3" argument(s): "The archive entry was compressed using an unsupported compression method."

Answer by r2evans:

I don't have WinZip, but since both it and unzip.exe (bundled with Rtools-4.2) support password-protected archives, we should be able to use similar methods. (Or perhaps you can use the unzip included with Rtools.)

Setup:

$ echo 'hello world' > file1.txt
$ echo -e 'a,b\n11,22' > file2.csv
$ c:/rtools42/usr/bin/zip.exe -P secretpassword files.zip file1.txt file2.csv
  adding: file1.txt (stored 0%)
  adding: file2.csv (stored 0%)
$ unzip -v files.zip
Archive:  files.zip
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
      12  Stored       12   0% 2023-02-09 10:03 af083b2d  file1.txt
      10  Stored       10   0% 2023-02-09 10:03 1c1d572e  file2.csv
--------          -------  ---                            -------
      22               22   0%                            2 files

$ unzip -c files.zip file1.txt
Archive:  files.zip
[files.zip] file1.txt password:

Okay, now we have a password-protected zip file.

In R,

readLines(pipe("unzip -q -P secretpassword -c files.zip file1.txt"))
# [1] "hello world"
read.csv(pipe("unzip -q -P secretpassword -c files.zip file2.csv"))
#    a  b
# 1 11 22
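
The two calls above could be wrapped in a small helper so that the password and file names are quoted consistently. This is only a sketch: `unzip_args()` and `read_zip_entry()` are hypothetical names (not part of base R or Rtools), and it assumes an Info-ZIP `unzip` executable, such as the one shipped with Rtools, is on the PATH or is passed in through `unzip_bin`.

```r
# Hypothetical helper: build the argument vector for Info-ZIP unzip.
# -q = quiet, -P = password, -c = write the entry's contents to stdout.
unzip_args <- function(zipfile, entry, password) {
  c("-q", "-P", shQuote(password), "-c", shQuote(zipfile), shQuote(entry))
}

# Hypothetical helper: return the entry's contents as a character vector,
# one element per line. Assumes `unzip_bin` points at an Info-ZIP unzip.
read_zip_entry <- function(zipfile, entry, password, unzip_bin = "unzip") {
  system2(unzip_bin, unzip_args(zipfile, entry, password), stdout = TRUE)
}
```

With the archive created above, `read_zip_entry("files.zip", "file1.txt", "secretpassword")` should return the same line as the `readLines(pipe(...))` call. For the XML case in the question, the lines could probably be joined with `paste(lines, collapse = "\n")` and handed to `xml2::read_xml()`, assuming the xml2 package is installed.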

WinZip does have a command-line interface, so we should be able to use it within pipe (or system, or similar). It supports passwords; I believe it uses the -s argument instead of -P. I don't know whether it can extract a file to stdout, so you may need to explore its command-line options for that. If it can't, extract the document to a temporary directory and read it from there.

Or, assuming you have Rtools installed, you can use its unzip as above without relying on WinZip.
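
If WinZip turns out to be the only option and it cannot write to stdout, the temporary-directory fallback might look roughly like the sketch below. Everything WinZip-specific here is an assumption I could not test: that the command-line executable is called `wzunzip`, that the password is passed as `-s<password>`, and that the call takes the form `wzunzip [options] archive destination`. The function names are hypothetical; check the flags against your WinZip version before relying on this.

```r
# Hypothetical helper: argument vector for WinZip's command-line tool.
# ASSUMPTION (untested): password is given as -s<password>, followed by
# the archive and then the destination folder.
wzunzip_args <- function(zipfile, password, exdir) {
  c(shQuote(paste0("-s", password)), shQuote(zipfile), shQuote(exdir))
}

# Hypothetical helper: extract into a fresh temporary directory and
# return the paths of the extracted files.
extract_with_winzip <- function(zipfile, password, wzunzip_bin = "wzunzip") {
  exdir <- tempfile("unzipped_")
  dir.create(exdir)
  status <- system2(wzunzip_bin, wzunzip_args(zipfile, password, exdir))
  if (status != 0) stop("extraction failed with exit status ", status)
  list.files(exdir, full.names = TRUE)
}
```

The returned paths can then be read with whatever reader suits the file, and the temporary directory removed afterwards with `unlink(dirname(paths[1]), recursive = TRUE)` so the decrypted copy does not linger on disk.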

Note:

  • Including the password as a command-line argument is relatively unsafe: users on the same host (if a multi-user system) can see the password in clear text by looking at the process list. I'm not certain if there's an easy way around this.
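
One partial mitigation is to keep the password out of the script itself by reading it from an environment variable at run time. This does not fix the process-list exposure described above, since the password is still passed on the command line; it only avoids storing it in source code. The variable name `ZIP_PASSWORD` and the function `get_zip_password()` are hypothetical.

```r
# Hypothetical helper: fetch the archive password from an environment
# variable and fail loudly if it has not been set.
get_zip_password <- function(var = "ZIP_PASSWORD") {
  pw <- Sys.getenv(var, unset = NA)
  if (is.na(pw) || !nzchar(pw)) {
    stop("environment variable ", var, " is not set")
  }
  pw
}
```

The result can be pasted into the command in place of the literal password, for example `pipe(paste("unzip -q -P", shQuote(get_zip_password()), "-c files.zip file1.txt"))`.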