Spark threw an error and removed all broadcast pieces, but still reported a broadcast timeout

Asked by Elena on 18 March 2024 at 03:41

Spark version: 3.1.3

The following is the stack trace.

[WARN] org.apache.spark.storage.BlockManager - Putting block broadcast_211_piece0 failed due to exception java.nio.file.FileSystemException: /tmp/blockmgr-d4fb1377-363e-48b4-8c8f-717e15ac5c48/33: No space left on device.
[ERROR] org.apache.spark.broadcast.TorrentBroadcast - Store broadcast broadcast_211 fail, remove all pieces of the broadcast
[ERROR] org.apache.spark.sql.execution.exchange.BroadcastExchangeExec - Could not execute broadcast in 3600 secs.
java.util.concurrent.TimeoutException: null
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:194)
    at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:515)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeBroadcast$1(SparkPlan.scala:193)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
    at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:189)
    at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:203)
    at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareRelation(BroadcastHashJoinExec.scala:217)
    at org.apache.spark.sql.execution.joins.HashJoin.codegenInner(HashJoin.scala:449)
    at org.apache.spark.sql.execution.joins.HashJoin.codegenInner$(HashJoin.scala:448)
    at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:40)
    at org.apache.spark.sql.execution.joins.HashJoin.doConsume(HashJoin.scala:357)
    at org.apache.spark.sql.execution.joins.HashJoin.doConsume$(HashJoin.scala:355)
    at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:40)
    at org.apache.spark.sql.execution.CodegenSupport.constructDoConsumeFunction(WholeStageCodegenExec.scala:221)
    at org.apache.spark.sql.execution.CodegenSupport.consume(WholeStageCodegenExec.scala:192)
    at org.apache.spark.sql.execution.CodegenSupport.consume$(WholeStageCodegenExec.scala:149)
    at org.apache.spark.sql.execution.aggregate.HashAggregateExec.consume(HashAggregateExec.scala:47)
    at org.apache.spark.sql.execution.aggregate.HashAggregateExec.generateResultFunction(HashAggregateExec.scala:605)
    at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doProduceWithKeys(HashAggregateExec.scala:741)
    at org.apache.spark.sql.execution.aggregate.HashAggregateExec.doProduce(HashAggregateExec.scala:148)
    at org.apache.spark.sql.execution.CodegenSupport.$anonfun$produce$1(WholeStageCodegenExec.scala:95)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
    at org.apache.spark.sql.execution.CodegenSupport.produce(WholeStageCodegenExec.scala:90)
    at org.apache.spark.sql.execution.CodegenSupport.produce$(WholeStageCodegenExec.scala:90)
    at org.apache.spark.sql.execution.aggregate.HashAggregateExec.produce(HashAggregateExec.scala:47)
    at org.apache.spark.sql.execution.joins.HashJoin.doProduce(HashJoin.scala:352)
    at org.apache.spark.sql.execution.joins.HashJoin.doProduce$(HashJoin[2024-03-12 13:43:54,247]

I understand that the driver did not have enough disk space to store the broadcast block, so Spark logged the error about removing all pieces of the broadcast. I expected the broadcast to be aborted after that error, but it seems Spark carried on with the broadcast and later threw a second error, the broadcast timeout. (I had set the broadcast timeout to 1 hour.) Why did this happen?
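One detail in the trace may help narrow this down: the `TimeoutException` is raised from `java.util.concurrent.FutureTask.get`. A timed `get` on a Java future raises `TimeoutException` only when the task has not finished by the deadline; a task that has already failed is reported as an `ExecutionException` instead. The sketch below illustrates the same distinction using Python's `concurrent.futures` as an analogy. It is not Spark's code, and the names `classify_wait`, `failing_task`, and `stuck_task` are hypothetical, introduced only for illustration.

```python
import concurrent.futures
import threading


def failing_task(release):
    # Stands in for a task that fails outright (the message is illustrative).
    raise OSError("No space left on device")


def stuck_task(release):
    # Stands in for a task that never finishes on its own.
    release.wait()


def classify_wait(task, timeout_s):
    """Run task on a worker thread and report how a timed wait on it ends."""
    release = threading.Event()
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(task, release)
    try:
        future.result(timeout=timeout_s)
        return "completed"
    except concurrent.futures.TimeoutError:
        # Raised only when the task is still unfinished at the deadline.
        return "timed out"
    except OSError as exc:
        # A failed task surfaces its own exception, not a timeout.
        return f"failed: {exc}"
    finally:
        release.set()  # let a stuck worker exit so shutdown can return
        pool.shutdown(wait=True)


print(classify_wait(failing_task, 1.0))
print(classify_wait(stuck_task, 0.2))
```

Read this way, the timeout in the log suggests that the future being waited on was still pending after 3600 seconds, rather than having completed with the storage failure. The sketch does not show why it was still pending.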

