Saving custom logging in Databricks without try/except blocks


I'm trying to write a logging system in Databricks for a few jobs we need to run. Currently I set up a logger and keep the log in memory: log_stream = io.StringIO()

Every function is wrapped in a try/except block to catch info or exceptions and write them to the logger. The try/except is also used to be certain that the last block of the notebook will run, which is needed because that block contains the code that uploads the in-memory file to blob storage.

However, this method feels quite 'ugly', since every piece of code needs to be wrapped in a try/except block.

Are there any methods to always run the last block of the notebook, even when part of the code completely fails/errors? Or is there another method to make sure the log file is uploaded immediately in case of any errors?
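
To illustrate what I mean by "always run the last block": the effect I'm after is roughly that of one try/finally around the whole notebook body. A sketch, using the same log_stream and storage variables as in the code below, with run_notebook_body() as a placeholder for the actual code blocks:

# Sketch only: the finally block always runs, so the upload happens even if a step fails
try:
    run_notebook_body()  # placeholder for all the code blocks below
finally:
    log_content = log_stream.getvalue()
    dbutils.fs.put(
        f"abfss://{container_name}@{storage_account}.dfs.core.windows.net/{p_container_name}",
        log_content,
        overwrite=True,
    )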

current code:
-- logging --

log_stream = io.StringIO()

logger = logging.getLogger(database_name_bron)
logger.setLevel(logging.DEBUG)

formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

handler = logging.StreamHandler(log_stream)
handler.setLevel(logging.DEBUG)
handler.setFormatter(formatter)
if logger.hasHandlers():
    logger.handlers.clear()
logger.addHandler(handler)

-- code block example --

try:
    table = output_dict['*'].select(
        col('1*').alias('1*'),
        col('2*').alias('2*'),
        col('3*').alias('3*'),
        col('4*').alias('4*'),
        col('5*').alias('5*'),
    )

    # join tables
    table2 = table2.join(table1, table2.5* == table1.4*, 'left')
    logger.info('left join of table1 and table2')
except Exception as e:
    logger.exception(f"An error occurred while joining table1 and table2: {e}")

-- upload block --

# extract the log data
log_content = log_stream.getvalue()

# upload the log data to blob storage
dbutils.fs.put(f"abfss://{container_name}@{storage_account}.dfs.core.windows.net/{p_container_name}", log_content, overwrite=True)

# cleanly close the handler
logger.removeHandler(handler)
handler.close()

There is 1 best solution below

JayashankarGS

You can run your code inside a custom logging context by creating a context manager.

Below is the code.

import logging
import io

class LoggingContext:
    def __init__(self, logger, storage_account, container_name, p_container_name):
        self.logger = logger
        self.storage_account = storage_account
        self.container_name = container_name
        self.p_container_name = p_container_name
        self.log_stream = io.StringIO()
        self.handler = None

    def __enter__(self):
        self.handler = logging.StreamHandler(self.log_stream)
        self.handler.setLevel(logging.DEBUG)
        formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
        self.handler.setFormatter(formatter)
        self.logger.addHandler(self.handler)
        return self.logger

    def __exit__(self, exc_type, exc_val, exc_tb):
        # log the exception while the handler is still attached
        if exc_type is not None:
            self.logger.exception(f"An error occurred: {exc_val}")
        self.logger.removeHandler(self.handler)
        self.handler.close()
        # upload the collected log content, whether or not an error occurred
        log_content = self.log_stream.getvalue()
        dbutils.fs.put(f"abfss://{self.container_name}@{self.storage_account}.dfs.core.windows.net/{self.p_container_name}", log_content, overwrite=True)

Here, __init__ stores all the variables the context needs. __enter__ configures the logging setup: the format, level, log_stream and handler. __exit__ closes the handler and uploads the log content to storage; because __exit__ always runs when the with block ends, the upload happens even if the code inside it raises an exception.

logger = logging.getLogger("database_name_bron")
logger.setLevel(logging.DEBUG)
storage_account = 'jgsadls'
container_name = 'data'
p_container_name = 'databricks_log'

Add your required information here.

Context code:

with LoggingContext(logger, storage_account, container_name, p_container_name):
    logger.info('Log started')
    logger.info("Table joined.Check logs for more info....")
    y = 1/0
    logger.info("Log ended")

Wrap your own code blocks inside LoggingContext like this, wherever you run them.


Output: (screenshot of the resulting log file uploaded to the storage container)

Whenever you run a code block, just run it inside LoggingContext instead of wrapping it in try/except.
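
Applied to the join from the question, a block would look roughly like this. A sketch: table1, table2 and the column names order_id and id are hypothetical stand-ins for the redacted names in the question.

with LoggingContext(logger, storage_account, container_name, p_container_name):
    # hypothetical DataFrames and column names in place of the redacted ones
    joined = table2.join(table1, table2['order_id'] == table1['id'], 'left')
    logger.info('left join of table1 and table2')

No try/except is needed: if the join fails, __exit__ still logs the exception, uploads the log file, and then lets the exception propagate.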