Error: expected str, bytes or os.PathLike object, not StorageStreamDownloader - Decrypting a blob file using pgp in python


I am trying to read an encrypted CSV file from Azure Blob Storage, decrypt it using gnupg, and read it in Python. I am able to access the blob file, but when I pass it to my decrypt function it throws an error.

Error: expected str, bytes or os.PathLike object, not StorageStreamDownloader

The blob file is of type StorageStreamDownloader. When I convert it to bytes, I get an "embedded null byte" error instead.

Can someone help me with this? Below is my code.

    from azure.storage.blob import BlobServiceClient, BlobClient
    import pandas as pd
    import csv
    from io import StringIO 
    from pyspark.sql import SparkSession 
    import io

    connection_string = "AAAA"
    container_name = "BBBB"
    blob_service_client = BlobServiceClient.from_connection_string(connection_string)
    container_client = blob_service_client.get_container_client(container_name)

    import gnupg
    gpg = gnupg.GPG()

    gpg.encoding = 'utf-8'

    passphrase = "12345"
    secret = "112233"


    def decrypt_file(filename, secret, passphrase):   
   
        gpg.encoding = 'utf-8' 

        with open(filename, 'rb') as f:
            decrypted_data = gpg.decrypt(f, passphrase=passphrase)     

        if decrypted_data.ok:
            print("done")        
        else:
            print("error:", decrypted_data.status)
            print("error:", decrypted_data.stderr)
        return str(decrypted_data)


    blob_client = container_client.get_blob_client(file)              
    blob_file_tinb = blob_client.download_blob()           
    tinb_file = decrypt_file(blob_file_tinb,secret,passphrase)

There is 1 best solution below

Venkatesan On

read a csv file (encrypted) from azure blob storage and decrypt it using gnupg and read it in python.

You can use the code below to download the encrypted CSV file from Azure Blob Storage with the Azure Python SDK and decrypt it with GnuPG:

Code:

from azure.storage.blob import BlobServiceClient
import io
import gnupg

connection_string = "xxxxx"
container_name = "logs"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)
container_client = blob_service_client.get_container_client(container_name)

gpg = gnupg.GPG()
gpg.encoding = 'utf-8'

passphrase = "12344"
secret = "12344"  # not used for decryption; kept to match the question's signature

def decrypt_file(file, secret, passphrase):
    # file is a file-like object; read() returns the encrypted bytes
    decrypted_data = gpg.decrypt(file.read(), passphrase=passphrase)
    if decrypted_data.ok:
        print("done")
    else:
        print("error:", decrypted_data.status)
        print("error:", decrypted_data.stderr)
    return str(decrypted_data)


blob_client = container_client.get_blob_client("<filename>")
# download_blob() returns a StorageStreamDownloader; readall() gives the content as bytes
blob_data = blob_client.download_blob().readall()
# Wrap the bytes in BytesIO and pass the arguments in the order decrypt_file expects
tinb_file = decrypt_file(io.BytesIO(blob_data), secret, passphrase)
print(tinb_file)

The above code downloads an encrypted file from Azure Blob Storage, decrypts it using GnuPG, and prints the decrypted data. The original error occurs because open() expects a file path, while download_blob() returns a StorageStreamDownloader object.

The code sets up the connection string and container name, then creates the BlobServiceClient and ContainerClient objects. It also sets up the GnuPG object and the passphrase used to decrypt the file. Finally, it gets the BlobClient object for the encrypted file, downloads the file data as bytes with readall(), wraps it in io.BytesIO, and passes it to the decrypt_file() function to decrypt and print the data.
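To illustrate why the question's two attempts failed, here is a minimal sketch that uses only the standard library. FakeDownloader is a hypothetical stand-in for StorageStreamDownloader (it is not part of the Azure SDK), and the payload bytes are made up for the example.

```python
import io


class FakeDownloader:
    """Hypothetical stand-in for StorageStreamDownloader; not the Azure class."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def readall(self) -> bytes:
        # Mirrors the readall() call used in the answer: returns all content as bytes.
        return self._payload


def try_open(candidate):
    """Return the name of the exception open() raises for candidate, or None."""
    try:
        with open(candidate, "rb"):
            return None
    except (TypeError, ValueError, OSError) as exc:
        return type(exc).__name__


# Made-up binary payload that contains a NUL byte, as encrypted data often does.
downloader = FakeDownloader(b"\x85\x01\x00\x0cencrypted bytes")

# open() wants a path, so the downloader object itself is rejected.
error_for_object = try_open(downloader)

# Passing the raw content as if it were a path fails too, because
# NUL bytes are not allowed in file names.
error_for_bytes = try_open(downloader.readall())

# Wrapping the bytes in BytesIO gives an in-memory file object with read().
stream = io.BytesIO(downloader.readall())
content = stream.read()

print(error_for_object, error_for_bytes)
```

In both failing cases the data is being treated as a file name. Once the bytes are in memory, nothing needs to be opened from disk, so they can be handed to the decryption call directly or through a file-like wrapper.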

Output:

done
Industry
Accounting/Finance
Advertising/Public Relations
Aerospace/Aviation
Arts/Entertainment/Publishing
Automotive
Banking/Mortgage
Business Development
Business Opportunity
Clerical/Administrative
Construction/Facilities
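
Since the goal is to read the CSV in Python, the decrypted string can be parsed without writing it to disk. The sketch below assumes the decrypted text looks like the output above (a header line "Industry" followed by one value per line); decrypted_text is a sample standing in for the return value of decrypt_file().

```python
import csv
import io

# Sample text standing in for the string returned by decrypt_file();
# the values are the first lines of the output shown above.
decrypted_text = (
    "Industry\n"
    "Accounting/Finance\n"
    "Advertising/Public Relations\n"
    "Aerospace/Aviation\n"
)


def parse_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    # StringIO exposes the string as a file-like object for the csv module.
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_csv(decrypted_text)
industries = [row["Industry"] for row in rows]
print(industries)
```

If pandas is preferred, pd.read_csv(io.StringIO(decrypted_text)) accepts the same in-memory text.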
