I have an Azure Databricks job that is triggered from ADF through an API call. I want to find out why the job takes N minutes to complete its tasks. In the job run results, the job execution time is reported as 15 minutes, but the individual cells/commands do not add up to even 4-5 minutes.
The interactive cluster was already up and running when the job was triggered. Why doesn't the sum of the individual cell execution times match the overall job execution time? Where can I see what took the additional time?
Comments (1)
Please see the references below, which explain in detail:
How to find the execution time of a command cell in a Databricks notebook.
How to measure Apache Spark workload metrics for performance troubleshooting.
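As a rough illustration of timing a command yourself, here is a minimal Python sketch that records the wall-clock time of a block of code. The helper name `timed` and the label `example_step` are made up for this example, and the workload is a plain Python placeholder. In a notebook you would put a Spark action (for example a count or a write) inside the block, because transformations alone are lazy and return almost immediately.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock seconds spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("example_step", timings):
    # Placeholder workload (hypothetical); replace with a Spark action.
    total = sum(range(1_000_000))

print(timings)
```

Comparing timings collected this way with the duration shown for the whole run gives a first idea of how much time is spent outside the commands themselves.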
References:
How to measure the execution time of a query on Spark
https://db-blog.web.cern.ch/blog/luca-canali/2017-03-measuring-apache-spark-workload-metrics-performance-troubleshooting
https://spark.apache.org/docs/latest/monitoring.html
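To see where the remaining time might be going, one option is to compare the durations reported for the run with the sum of the cell times. The sketch below assumes you already have the run's JSON (for example from the Jobs API `runs/get` call) and that it carries `setup_duration`, `execution_duration` and `cleanup_duration` in milliseconds; check the response you actually get, since multi-task runs may report these differently. The function name `summarize_run` and all sample numbers are hypothetical and only mirror the 15 minutes versus 4-5 minutes described in the question.

```python
def summarize_run(run, cell_seconds):
    """Split a run's duration into phases and compare it with the cell times.

    `run` is a dict assumed to hold millisecond durations under the keys
    setup_duration, execution_duration and cleanup_duration.
    `cell_seconds` is a list of per-cell execution times in seconds.
    """
    setup = run.get("setup_duration", 0) / 1000.0
    execution = run.get("execution_duration", 0) / 1000.0
    cleanup = run.get("cleanup_duration", 0) / 1000.0
    cells = float(sum(cell_seconds))
    return {
        "setup_seconds": setup,
        "execution_seconds": execution,
        "cleanup_seconds": cleanup,
        "total_seconds": setup + execution + cleanup,
        "cells_seconds": cells,
        # Time in the execution phase that no cell accounts for.
        "unaccounted_seconds": execution - cells,
    }


# Hypothetical sample values, not taken from a real run.
sample_run = {
    "setup_duration": 120000,
    "execution_duration": 720000,
    "cleanup_duration": 60000,
}
sample_cells = [60, 90, 120]

summary = summarize_run(sample_run, sample_cells)
for name, value in summary.items():
    print(name, value)
```

A large setup or cleanup value points at time spent before or after the notebook ran, while a large unaccounted value inside the execution phase is something to look into with the Spark UI and the metrics described in the monitoring documentation linked above.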