I have ~30GB of uncompressed spatial data stored as Parquet, with three columns (id, tags, and coordinates) and a row group size of 64MB. I read it with Dask's read_parquet using blocksize="32MiB", which gives me 118 partitions. I then process it as a Dask DataFrame, calling DataFrame.persist followed by distributed.wait (or progress) after each processing stage to make sure that stage has finished.
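Roughly, the read/persist part looks like this (paths and the scheduler address are placeholders; this assumes the blocksize= argument of read_parquet):
import dask.dataframe as dd
from dask.distributed import Client, wait, progress

client = Client("scheduler-address:8786")   # placeholder address for my existing cluster

features = dd.read_parquet(
    "/data/spatial/*.parquet",              # placeholder path
    blocksize="32MiB",                      # -> 118 partitions for ~30GB of data
)

# ... several processing stages on the Dask DataFrame ...
features = features.persist()
wait(features)                              # or progress(features) to watch interactively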
Everything works fine until I try to save the output as Parquet files; I always get the same exception during the final compute call. Here is the saving code:
# features, layer, self._path and save() come from the surrounding class.
import os
from dask import delayed
import dask.dataframe as dd

dfs = features.to_delayed()                  # one delayed pandas frame per partition
layer_path = os.path.join(self._path, layer)
filenames = [os.path.join(layer_path, f"{layer}-{i}.parquet") for i in range(len(dfs))]
writes = [delayed(save)(df, layer, fn) for df, fn in zip(dfs, filenames)]
dd.compute(*writes)                          # this is where the workers get killed
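For context, save essentially just writes one pandas partition to disk; a simplified sketch (not my exact code) looks like this:
import os
import pandas as pd

def save(df: pd.DataFrame, layer: str, filename: str) -> str:
    # Each delayed partition arrives as a plain pandas DataFrame.
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    df.to_parquet(filename, index=False)
    return filename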
Exception: TerminatedWorkerError('A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.\n\nThe exit codes of the workers are {SIGKILL(-9)}')
This is the DataFrame's partition structure right before the final compute call that saves the output:
                 layer geometry properties
npartitions=118
                object   object     object
                   ...      ...        ...
...                ...      ...        ...
                   ...      ...        ...
Dask Name: apply, 2 graph layers
I tried a worker plugin, but still didn't see any useful logs. So I am wondering: is there a better way to check what happened?
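For reference, the plugin I registered was roughly this (simplified); it logs every task transition together with the worker's resident memory, hoping the last lines before the SIGKILL would show the offending task:
import logging
import psutil
from dask.distributed import Client, WorkerPlugin

class TaskLogger(WorkerPlugin):
    # Logs each task state change plus the worker process's RSS.
    def setup(self, worker):
        self.proc = psutil.Process()
        self.log = logging.getLogger("distributed.worker")

    def transition(self, key, start, finish, **kwargs):
        rss_gib = self.proc.memory_info().rss / 2**30
        self.log.warning("task %s: %s -> %s (rss %.1f GiB)", key, start, finish, rss_gib)

client = Client("scheduler-address:8786")   # placeholder address
client.register_worker_plugin(TaskLogger())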
P.S. The cluster has 10 workers; each worker has 8 cores and 48GB of memory available.