I am creating a Python script that takes the row by row attachments on a given Smartsheet and saves them to a folder in the DataBricks DBFS. The logic is that the script would save the attachments into subfolders named the email address of the user that uploaded the attachment.
The code runs successfully with no errors but there is no output in DBFS. Any guidance would be appreciated thanks!
# Import required libraries
import os
import smartsheet
# Initialize Smartsheet client
SMARTSHEET_ACCESS_TOKEN = dbutils.secrets.get(scope="***********", key="smartsheetapi")
smartsheet_client = smartsheet.Smartsheet(SMARTSHEET_ACCESS_TOKEN)
# Get Smartsheet attachments and store in DBFS
SHEET_ID = ******************
sheet = smartsheet_client.Sheets.get_sheet(SHEET_ID)
# Define the master folder in FileStore of DBFS
master_folder = os.path.join('/dbfs/FileStore', 'smartsheet_attachments')
for row in sheet.rows:
if row.attachments:
for attachment in row.attachments:
response = smartsheet_client.Attachments.get_attachment(SHEET_ID, attachment.id)
# Create a subfolder for the user who uploaded the attachment
user_folder = os.path.join(master_folder, attachment.created_by.email)
if not os.path.exists(user_folder):
os.makedirs(user_folder)
# Save the attachment to the user's folder in DBFS
with open(os.path.join(user_folder, attachment.name), 'wb') as file:
file.write(response.stream.read())
# Log the user who uploaded the attachment
print(f'User {attachment.created_by.email} uploaded the attachment {attachment.name}')
NOTE: *** used to hide private information
As the Smartsheet API docs describe, the Get Attachment operation returns a JSON response. Within the JSON response, the
urlproperty contains a (short-lived) URL where the file can be accessed.Because the Get Attachment response doesn't contain the contents of the file, it cannot be used to write the file like you're trying to do (e.g.,
file.write(response.stream.read())). Instead, you'll need to use theurlvalue in the response to get the contents of the file. I'm no Python expert, but it seems like you can use theurlopenfunction fromurllibto do this. For example:data = urlopen('http://example.com').read().