Context
I have a day-wise TumblingWindow query (similar to the one shown below):
SELECT
DATEADD(day, -1, System.Timestamp()) AS WindowStart,
System.Timestamp() AS WindowEnd,
TollId,
COUNT(*)
FROM Input TIMESTAMP BY EntryTime
GROUP BY TumblingWindow(day, 1), TollId
I have been reading the auto-pause documentation and have started following its steps. I have a test stream job and a function app that can host the auto-pause PowerShell code, all set up so that I can play around without impacting the actual job (I am using a separate test job for now). The PowerShell code has been left as is (no changes except parameter values). However, I have yet to actually start the test stream job, and plan to do so once I have more of a clue about the parameters and trigger time to use for auto-pause.
Here is a previous Stack Overflow post (which I created) that gives additional explanation of what I am trying to achieve:
Post explaining how start time works, with specific examples of how I'd want the job to pause
Summary of Background Scenario in other posts
The aim is to let the stream job run once a day, for long enough that the full day's TumblingWindow output comes out each day with the full day's worth of data. To leave enough time for this, I was thinking that the job can stay off for most of the day (from 00:30 UTC) and turn on at 23:30 UTC to "catch up with the backlog" for the day (00:00-23:30 UTC). The day-wise window then outputs at 00:00 UTC, and the job subsequently switches off, say at 00:30 UTC (having had enough time to ensure output). This process would then repeat in a cycle.
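The schedule above has an "on" window that wraps past midnight (23:30 to 00:30 UTC), which is easy to get wrong in a time comparison. A minimal sketch of such a check follows; `Test-InOnWindow` is a hypothetical helper name, not something from the auto-pause documentation, and the default window values are simply the ones proposed above.

```powershell
# Hypothetical helper (not part of the auto-pause docs): returns $true when
# $Now falls inside the daily "on" window, which may wrap past midnight.
function Test-InOnWindow {
    param(
        [datetime]$Now,
        [timespan]$WindowStart = [timespan]'23:30',
        [timespan]$WindowEnd   = [timespan]'00:30'
    )
    $t = $Now.TimeOfDay
    if ($WindowStart -le $WindowEnd) {
        # Window sits inside a single calendar day
        return (($t -ge $WindowStart) -and ($t -lt $WindowEnd))
    }
    # Window wraps past midnight (for example 23:30 -> 00:30)
    return (($t -ge $WindowStart) -or ($t -lt $WindowEnd))
}

Test-InOnWindow -Now ([datetime]'2022-03-16T23:45:00')   # True
Test-InOnWindow -Now ([datetime]'2022-03-17T00:10:00')   # True
Test-InOnWindow -Now ([datetime]'2022-03-17T12:00:00')   # False
```

Comparing against a window rather than an exact minute means a late or retried run still makes the right decision.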
My Question
Do the main parameters I have chosen (added below) fit my intentions (as described in the context above)? If so, how do I set the trigger time of the function app so that this code runs as intended with these parameters?
Would I set the trigger to run the script at 23:30 and 00:30 (the docs mention this is done using CRON expressions), since at those points it would need to start or stop the job respectively?
# This snippet is taken from the auto-pause doc linked above
# Set my own values in minutes based on above discussion
$restartThresholdMinute = 1380 # This is M (1380min = 23hours ie time left off 00:30-23:30 UTC)
$stopThresholdMinute = 60 # This is N (60min = 1hours ie time left on 23:30-00:30 UTC)
# Have left these as default due to present advice
$maxInputBacklog = 0 # The amount of backlog we tolerate when stopping the job (in event count, 0 is a good starting point)
$maxWatermark = 10 # The amount of watermark we tolerate when stopping the job (in seconds, 10 is a good starting point at low SUs)
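As a quick sanity check on the two values of M and N, the arithmetic can be reproduced from the clock times themselves. This is only a sketch: the variable names `$stopAt`, `$restartAt`, `$offMinutes` and `$onMinutes` are made up for illustration.

```powershell
# Clock times taken from the schedule described above (UTC)
$stopAt    = [timespan]'00:30'
$restartAt = [timespan]'23:30'

# Time spent stopped: 00:30 -> 23:30 on the same day
$offMinutes = ($restartAt - $stopAt).TotalMinutes

# Time spent running: 23:30 -> 00:30 on the next day
$onMinutes = ((New-TimeSpan -Days 1) - $restartAt + $stopAt).TotalMinutes

"M (off) = $offMinutes min, N (on) = $onMinutes min, total = $($offMinutes + $onMinutes) min"
```

The two durations add up to 1440 minutes, so the cycle fits exactly into one day.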
Side point:
If my parameters are not a good choice to start with, what would you suggest instead? (Bearing in mind the primary constraints that I have discussed in the context section.)
Edit: Update 2022-03-16
@Florian I have a few thoughts based on my understanding of what you mentioned in your post, but I am not sure of the best way to handle this. If you could add an adaptation of your code for this implementation to your answer, that would be good.
- The overall structure of the PowerShell script can remain the same. In the end it's probably best to change the console output messages too, but I haven't done that yet.
- A primary difference could be the if-else logic that starts/stops the job: it would have a condition that compares the time to some predefined set value rather than relying on M and N.
- Perhaps the watermark and backlog checks can be kept just to output to the console for reference, but removed from all condition sections.
- I have kept -OutputStartMode LastOutputEventTime as the start_time option (I think it's basically when the job last stopped) to ensure that we don't lose any data and have the full day's worth of data, as you mentioned in a previous post.
- For initial concept purposes I have kept almost all the documentation code (even though it may not be needed) and just added a few variables and changed the stop/start if-else conditions.
- The changes I have made have a "nishcs edit" comment near them.
- I am using the final function app code as a starting point, since my current setup has a function app in mind for hosting.
# Input bindings are passed in via param block.
Param($Timer)
# Stop on error
$ErrorActionPreference = 'stop'
# Write an information log with the current time.
$currentUTCtime = (Get-Date).ToUniversalTime()
$currentUTCstringtime = Get-Date -Date $currentUTCtime -UFormat %R # nishcs edit: Getting the 24hour UTC time format as a string
Write-Host "asaRobotPause - PowerShell timer trigger function is starting at time: $currentUTCtime"
# Set variables
[string]$restartTime = $env:restartTime # nishcs edit: Set this to '23:30'. This could in fact be hard-coded (perhaps it is better practice to keep it in the function app settings, not sure).
[string]$stopTime = $env:stopTime # nishcs edit: Set this to '00:30'. This could in fact be hard-coded (perhaps it is better practice to keep it in the function app settings, not sure).
$maxInputBacklog = $env:maxInputBacklog
$maxWatermark = $env:maxWatermark
$restartThresholdMinute = $env:restartThresholdMinute
$stopThresholdMinute = $env:stopThresholdMinute
$subscriptionId = $env:subscriptionId
$resourceGroupName = $env:resourceGroupName
$asaJobName = $env:asaJobName
$resourceId = "/subscriptions/$($subscriptionId )/resourceGroups/$($resourceGroupName)/providers/Microsoft.StreamAnalytics/streamingjobs/$($asaJobName)"
# Check if managed identity has been enabled and granted access to a subscription, resource group, or resource
$AzContext = Get-AzContext -ErrorAction SilentlyContinue
if (-not $AzContext.Subscription.Id)
{
Throw ("Managed identity is not enabled for this app or it has not been granted access to any Azure resources. Please see https://learn.microsoft.com/en-us/azure/app-service/overview-managed-identity for additional details.")
}
try
{
# throw "This is an error."
# Check current ASA job status
$currentJobState = Get-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
Write-Output "asaRobotPause - Job $($asaJobName) is $($currentJobState)."
# Switch state
if ($currentJobState -eq "Running")
{
# Get-AzActivityLog issues warnings about deprecation coming in future releases, here we ignore them via -WarningAction Ignore
# We check in 1000 record of history, to make sure we're not missing what we're looking for. It may need adjustment for a job that has a lot of logging happening.
# There is a bug in Get-AzActivityLog that triggers an error when Select-Object First is in the same pipeline (on the same line). We move it down.
$startTimeStamp = Get-AzActivityLog -ResourceId $resourceId -MaxRecord 1000 -WarningAction Ignore | Where-Object {$_.EventName.Value -like "Start Job*"}
$startTimeStamp = $startTimeStamp | Select-Object -First 1 | Foreach-Object {$_.EventTimeStamp}
# Get-AzMetric issues warnings about deprecation coming in future releases, here we ignore them via -WarningAction Ignore
$currentBacklog = Get-AzMetric -ResourceId $resourceId -TimeGrain 00:01:00 -MetricName "InputEventsSourcesBacklogged" -DetailedOutput -WarningAction Ignore
$currentWatermark = Get-AzMetric -ResourceId $resourceId -TimeGrain 00:01:00 -MetricName "OutputWatermarkDelaySeconds" -DetailedOutput -WarningAction Ignore
# Metrics always lag 1-3 minutes behind, so grabbing the last N minutes actually means checking N+3. This may be overly safe and can be fine-tuned per job.
$Backlog = $currentBacklog.Data | `
Where-Object {$_.Maximum -ge 0} | `
Sort-Object -Property Timestamp -Descending | `
Where-Object {$_.Timestamp -ge $startTimeStamp} | `
Select-Object -First $stopThresholdMinute |
Measure-Object -Sum Maximum
$BacklogSum = $Backlog.Sum
$Watermark = $currentWatermark.Data | `
Where-Object {$_.Maximum -ge 0} | `
Sort-Object -Property Timestamp -Descending | `
Where-Object {$_.Timestamp -ge $startTimeStamp} | `
Select-Object -First $stopThresholdMinute | `
Measure-Object -Average Maximum
$WatermarkAvg = [int]$Watermark.Average
Write-Output "asaRobotPause - Job $($asaJobName) is running since $($startTimeStamp) with a sum of $($BacklogSum) backlogged events, and an average watermark of $($WatermarkAvg) sec, for $($Watermark.Count) minutes."
# nishcs edit: Conditions no longer reliant on the M and N minute. Just on the predefined start/ stop time that have been set.
if (
($currentUTCstringtime -eq $stopTime)
)
{
Write-Output "asaRobotPause - Job $($asaJobName) is stopping..."
Stop-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName
}
else {
Write-Output "asaRobotPause - Job $($asaJobName) is not stopping yet, it needs to have less than $($maxInputBacklog) backlogged events and under $($maxWatermark) sec watermark for at least $($stopThresholdMinute) minutes."
}
}
elseif ($currentJobState -eq "Stopped")
{
# Get-AzActivityLog issues warnings about deprecation coming in future releases, here we ignore them via -WarningAction Ignore
# We check in 1000 record of history, to make sure we're not missing what we're looking for. It may need adjustment for a job that has a lot of logging happening.
# There is a bug in Get-AzActivityLog that triggers an error when Select-Object First is in the same pipeline (on the same line). We move it down.
$stopTimeStamp = Get-AzActivityLog -ResourceId $resourceId -MaxRecord 1000 -WarningAction Ignore | Where-Object {$_.EventName.Value -like "Stop Job*"}
$stopTimeStamp = $stopTimeStamp | Select-Object -First 1 | Foreach-Object {$_.EventTimeStamp}
# Get-Date returns a local time, we project it to the same time zone (universal) as the result of Get-AzActivityLog that we extracted above
$minutesSinceStopped = ((Get-Date).ToUniversalTime()- $stopTimeStamp).TotalMinutes
# nishcs edit: Conditions no longer reliant on the M and N minute. Just on the predefined start/ stop time that have been set.
if ($currentUTCstringtime -eq $restartTime)
{
Write-Output "asaRobotPause - Job $($asaJobName) was paused $([int]$minutesSinceStopped) minutes ago, set interval is $($restartThresholdMinute), it is now starting..."
Start-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName -OutputStartMode LastOutputEventTime
}
else{
Write-Output "asaRobotPause - Job $($asaJobName) was paused $([int]$minutesSinceStopped) minutes ago, set interval is $($restartThresholdMinute), it will not be restarted yet."
}
}
else {
Write-Output "asaRobotPause - Job $($asaJobName) is not in a state I can manage: $($currentJobState). Let's wait a bit, but consider helping if that doesn't go away!"
}
# Final ASA job status check
$newJobState = Get-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName | Foreach-Object {$_.JobState}
Write-Output "asaRobotPause - Job $($asaJobName) was $($currentJobState), is now $($newJobState). Job completed."
}
catch
{
throw $_.Exception.Message
}
Comments (2)
I think you need a different scheduling logic than the one described in the article.
From the article:
What I think you need:
The simplest way to implement your use case is to create 2 simple jobs, one for starting, one for stopping. In terms of triggers:
Start: 0 30 23 * * *
Stop: 0 30 0 * * *
Let me know if you need help adapting the code.
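For readers unfamiliar with the trigger syntax: Azure Functions timer triggers take six-field NCRONTAB expressions, with seconds as the first field, and are evaluated in UTC unless the app is configured otherwise. The sketch below only labels the fields of the two expressions above; the `$fieldNames` list is written out by hand for illustration.

```powershell
# Field order of a six-field NCRONTAB expression
$fieldNames = 'second', 'minute', 'hour', 'day', 'month', 'day-of-week'

foreach ($schedule in '0 30 23 * * *', '0 30 0 * * *') {
    $fields = $schedule -split ' '
    # Pair each field with its name, e.g. "hour=23"
    $parts = for ($i = 0; $i -lt $fields.Count; $i++) {
        "$($fieldNames[$i])=$($fields[$i])"
    }
    "$schedule -> $($parts -join ', ')"
}
```

Read this way, the first expression fires daily at 23:30:00 and the second daily at 00:30:00.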
Posting here for completeness of the question. I have provided the modified script to handle stopping and starting at a particular time.
This follows from the suggestions by @Florian.
Method 1: Function App Method
If you plan on using a function app to host the code, you can create 2 separate functions within a single function app: one for stopping and one for restarting the stream job. Below I have attached the PowerShell code for each of the functions (run.ps1).
The parameters for each function can be added to the configuration section of the function app and pulled into the script using the environment variable syntax.
Function 1 (restart job): asa-autorestart
Function 2 (stopping job): asa-autostop
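The attached scripts are not reproduced on this page. As a rough idea of what the two functions could look like, here is a minimal sketch. It is not the original code: it assumes the app settings `resourceGroupName` and `asaJobName` from the question exist, that the function app's managed identity has access to the job, and it reuses only cmdlets that appear in the question's script.

```powershell
# asa-autorestart/run.ps1 (sketch, triggered at 23:30 UTC)
Param($Timer)
$ErrorActionPreference = 'Stop'

# App settings assumed to be defined on the function app
$resourceGroupName = $env:resourceGroupName
$asaJobName        = $env:asaJobName

$jobState = (Get-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName).JobState
Write-Output "asa-autorestart - Job $asaJobName is $jobState."

if ($jobState -eq 'Stopped') {
    # Resume from the last output so the day's backlog is processed
    Start-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName -OutputStartMode LastOutputEventTime
}
else {
    Write-Output "asa-autorestart - Job $asaJobName is not stopped, nothing to do."
}
```

```powershell
# asa-autostop/run.ps1 (sketch, triggered at 00:30 UTC)
Param($Timer)
$ErrorActionPreference = 'Stop'

$resourceGroupName = $env:resourceGroupName
$asaJobName        = $env:asaJobName

$jobState = (Get-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName).JobState
Write-Output "asa-autostop - Job $asaJobName is $jobState."

if ($jobState -eq 'Running') {
    Stop-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName
}
else {
    Write-Output "asa-autostop - Job $asaJobName is not running, nothing to do."
}
```

Because each function only acts on the state it expects, running one at the wrong time or twice in a row should be harmless.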
Method 2: Automation Job Method
If you plan on using an Automation account to host the code, you can create 2 separate runbooks within the Automation account: one for stopping and one for restarting the stream job. Below I have attached the PowerShell code for each of the runbooks.
The parameters for a runbook can be set once the runbook has been published and you are scheduling it to run at a specific time. They can then be pulled into the script using the standard parameter syntax.
Runbook 1 (restarting job): asa-autorestart
Runbook 2 (stopping job): asa-autostop
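The runbook scripts are likewise not reproduced on this page. A minimal sketch of the stop runbook is shown below, under the assumption that the Automation account has a system-assigned managed identity with access to the job; the parameter names mirror the app settings in the question and are otherwise arbitrary.

```powershell
# asa-autostop runbook (sketch); parameter values are supplied by the schedule
Param(
    [Parameter(Mandatory = $true)][string]$subscriptionId,
    [Parameter(Mandatory = $true)][string]$resourceGroupName,
    [Parameter(Mandatory = $true)][string]$asaJobName
)
$ErrorActionPreference = 'Stop'

# Assumption: sign in with the Automation account's managed identity
Disable-AzContextAutosave -Scope Process | Out-Null
Connect-AzAccount -Identity | Out-Null
Set-AzContext -Subscription $subscriptionId | Out-Null

$jobState = (Get-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName).JobState
Write-Output "asa-autostop - Job $asaJobName is $jobState."

if ($jobState -eq 'Running') {
    Stop-AzStreamAnalyticsJob -ResourceGroupName $resourceGroupName -Name $asaJobName
}
else {
    Write-Output "asa-autostop - Job $asaJobName is not running, nothing to do."
}
```

The restart runbook would follow the same shape, checking for the 'Stopped' state and calling Start-AzStreamAnalyticsJob with -OutputStartMode LastOutputEventTime instead.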