How to convert JSON file data into binary Base64 format using ADF or a notebook?


I have a requirement to convert the source JSON file data into binary Base64 format. I tried the Copy activity with a binary dataset, and also a data flow, but converting the complete file's data is not possible that way. Is there any way to achieve this? Or is there Databricks notebook code I can use to convert the file? I have mounted the blob storage file into the notebook and am working with that.

I am using this code to convert the data into binary, but it is not working: [screenshot of the attempted code]

1 Answer

Answered by JayashankarGS

collect() gives you a list of records, and a list does not have an encode method. So you need to take only the file's data out of the list and encode that. Below are the different ways you can encode it.

import base64

# sc.binaryFiles returns an RDD of (path, bytes) pairs;
# collect()[0][1] picks out the raw content of the first (only) file
file_encode = base64.encodebytes(sc.binaryFiles("dbfs:/mnt/jgsblob/json/res.json").collect()[0][1])
print(file_encode)

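To see why the collect()[0][1] indexing is needed, here is a plain-Python sketch (no Spark required, with a made-up file path and payload) of the (path, bytes) record structure that sc.binaryFiles produces:

```python
import base64

# Simulated result of sc.binaryFiles(...).collect(): a list of (path, content-bytes) pairs
records = [("dbfs:/mnt/jgsblob/json/res.json", b'{"name": "test"}')]

# The list itself has no .encode method; index into it to reach the raw bytes first
raw_bytes = records[0][1]
encoded = base64.encodebytes(raw_bytes)
print(encoded)  # Base64 of the whole file content
```

Decoding with base64.decodebytes(encoded) returns the original bytes, so the whole file round-trips intact.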

You are trying spark.read.text; that creates one record per line, so you would need to loop through all the records and convert each of them to binary. Instead, use sc.binaryFiles('path'), which creates one record per file, with the whole file's data in binary format, like below.

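To illustrate why the per-line route is lossy, here is a plain-Python sketch (no Spark, hypothetical two-line file content): line records have their newlines stripped, so encoding them does not reproduce the Base64 of the whole file.

```python
import base64

# Hypothetical file content, and the same content as spark.read.text would split it
whole_file = b'{"a": 1,\n "b": 2}'
line_records = [b'{"a": 1,', b' "b": 2}']  # one record per line, newline stripped

# Joining the line records drops the newline byte, so the encodings differ
per_line = base64.encodebytes(b"".join(line_records))
whole = base64.encodebytes(whole_file)
print(per_line == whole)  # False: the per-line route loses the original bytes
```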

Alternatively:

import base64

# Read the mounted file with the plain Python file API (note the /dbfs prefix)
with open("/dbfs/mnt/jgsblob/json/res.json", 'r') as file:
    json_data = file.read()

# Encode the text to bytes, then Base64-encode the whole payload
file_encode = base64.encodebytes(json_data.encode())
print(file_encode)

Here, make sure you give the file path with a /dbfs prefix on the mount path, since the plain Python file API does not understand dbfs:/ URIs.

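Whichever route you take, you can verify the conversion by decoding the Base64 back and parsing it. A plain-Python sketch, with a hypothetical payload standing in for the contents of res.json:

```python
import base64
import json

# Hypothetical JSON payload standing in for the file contents
json_data = '{"id": 1, "status": "ok"}'

encoded = base64.encodebytes(json_data.encode())

# Decode and parse to confirm the Base64 payload still holds the same valid JSON
decoded = base64.decodebytes(encoded).decode()
print(json.loads(decoded) == json.loads(json_data))  # True
```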