I'm trying to convert a dataframe to xlsx, and writing it to Google Cloud storage using Cloud functions, but it's not working. It worked when I tested and deployed, but when I tried to trigger with pub/sub and scheduler, it gave the "OSError: [Errno 30] Read-only file system" error. The code is below:
bucket_name = 'bucket-main'
blob_name = 'file-reception/'
df = client.query(query).to_dataframe()
xlsx_file = 'test'
writer = pandas.ExcelWriter(xlsx_file)
df.to_excel(writer, index=False)
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.blob(blob_name+xlsx_file)
blob.upload_from_filename(xlsx_file)
Searched a lot on what it could be, but really don't understand why the cloud function test and deploy worked (actually wrote the wile in the gcs directory) but didn't when I used scheduler + pub/sub.
Tried to change the engine on the df.to_excel between xlsxwritter and openpyxl, nothing changed.
EDIT - NEW CODE:
bucket_name = 'bucket-main'
blob_name = 'file-reception/'
df = client.query(query).to_dataframe()
xlsx_file = 'test'
**tmp_directory = '/tmp/'**
writer = pandas.ExcelWriter(**tmp_directory**+xlsx_file)
df.to_excel(writer, index=False)
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.blob(blob_name+xlsx_file)
blob.upload_from_filename(**writer**)
You didn't specify a full path to the file (
xlsx_file), so it's really hard to say for sure where in the file system it's actually trying to create it. From the documentation:So, if you end up trying to write to the "directory containing your source files", I would expect it to fail. Instead, consider specify a full path to a filesystem that is writable. Typically, this directory is the "temporary" filesystem at /tmp. But instead of hard-coding that path, you should use the API that Python gives you to get that path no matter what the host OS is.
Note that, in Cloud Functions, files written to disk actually occupy memory. If you write too much data, you could end up running out of memory and crashing. You should also delete any temporary files immediately after you don't need them anymore.
See: How do I create a temporary directory in Python?