Spark job execution time


This might be a very simple question, but is there a simple way to measure the execution time of a Spark job (submitted using spark-submit)?

This would help us profile Spark jobs based on the size of the input data.

EDIT: I use http://[driver]:4040 to monitor my jobs, but this web UI shuts down the moment my job finishes.


3 Answers

Ram Ghadiyaram (Best Answer)

Every SparkContext launches its own instance of the web UI, which is available at http://[driver]:4040 by default (the port can be changed using spark.ui.port).
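For example, if port 4040 is already taken or you want a fixed port, you can set it at submit time. A minimal sketch (4041 is an arbitrary choice; the class and jar names are placeholders):

spark-submit --conf spark.ui.port=4041 --class com.example.MyJob my-job.jar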

It offers pages (tabs) with the following information:

Jobs, Stages, Storage (with RDD size and memory use), Environment, Executors, SQL

By default, this information is available only while the application is running.

Tip: You can still view the web UI after the application has finished by enabling spark.eventLog.enabled before submission and then browsing the logged events through the Spark History Server.

Sample web UI, where the job duration shows as 3.2 hours: [screenshot]
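As a minimal sketch of the event-logging setup (the event-log directory, class, and jar names are placeholders; the directory must exist and be readable by the History Server), you could submit the job like this:

spark-submit \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=hdfs:///spark-events \
  --class com.example.MyJob \
  my-job.jar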

mpals

Spark itself provides quite granular information about each stage of your Spark job. Go to the Spark web interface at http://your-driver-node:4040; you can also use the History Server for completed applications.

If you just need the total execution time, go to the standalone master UI at http://your-master-node:8080, which lists the duration of each application submitted to the cluster.
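If the History Server is not already running, a minimal sketch of starting it (the log directory is a placeholder and must match the spark.eventLog.dir used at submit time):

# in conf/spark-defaults.conf
spark.history.fs.logDirectory  hdfs:///spark-events

# then, from the Spark installation directory
./sbin/start-history-server.sh

Once it is up, completed applications and their durations are listed at http://<history-server-host>:18080 by default.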

venus

If you want, you can add a small piece of code to measure the net execution time yourself.

Example:

val t1 = System.nanoTime                            // first line of your job

// ... your Spark job runs here ...

val durationSeconds = (System.nanoTime - t1) / 1e9d // last line of your job
println(s"Job took $durationSeconds seconds")
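Alternatively, if you only care about the wall-clock time of the whole run, you can time the spark-submit invocation from the shell (the class and jar names are placeholders); the shell's time builtin prints the elapsed time once the job exits:

time spark-submit --class com.example.MyJob my-job.jar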