Error handling strategy for Microsoft HPC tasks
I have a .NET app that will be spawning tasks to run on an MS HPC cluster. We're not using any of that fancy DryadLINQ stuff, just remotely executing an exe on the cluster and passing arguments via the command line. The task will be .NET code, and I'd like the calling app to get an actual Exception object when an error occurs on HPC.
What's the best general technique for accomplishing this?
Let me know if you need any more info.
Thanks!
Comments (1)
You can't pass the exception back from your executable to the client HPC app when you're using the batch scheduler. If it's good enough to know that one of the tasks or jobs that you queued failed, then you can hold onto a SchedulerJob object and add a callback to the OnJobState or OnTaskState event. Whenever your job (or a task in that job) changes state you'll get the jobid/taskid and state change information in your callback; then you can check if the state was changed to "Failed" and act on that information.
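As a rough sketch of the callback approach described above, something like the following might work. It assumes the HPC Pack SDK (the Microsoft.Hpc.Scheduler assemblies) is referenced; the head node name "headnode" and the executable "MyTask.exe" are placeholders, not anything from your setup, and I haven't run this against a live cluster.

```csharp
using System;
using System.Threading;
using Microsoft.Hpc.Scheduler;            // HPC Pack SDK (assumed to be referenced)
using Microsoft.Hpc.Scheduler.Properties; // JobState, TaskState

public static class SubmitAndWatch
{
    private static readonly ManualResetEvent JobDone = new ManualResetEvent(false);
    private static IScheduler scheduler;

    public static void Main()
    {
        scheduler = new Scheduler();
        scheduler.Connect("headnode");             // placeholder head node name

        ISchedulerJob job = scheduler.CreateJob();
        ISchedulerTask task = job.CreateTask();
        task.CommandLine = "MyTask.exe arg1 arg2"; // placeholder executable and arguments
        job.AddTask(task);

        // Subscribe before submitting so no state change is missed.
        job.OnJobState += HandleJobState;
        job.OnTaskState += HandleTaskState;

        // Null credentials: rely on cached credentials or a prompt.
        scheduler.SubmitJob(job, null, null);

        JobDone.WaitOne();                         // block until the job reaches a final state
    }

    private static void HandleTaskState(object sender, TaskStateEventArg args)
    {
        if (args.NewState == TaskState.Failed)
        {
            // Reopen the task to read its exit code and captured output.
            ISchedulerJob job = scheduler.OpenJob(args.JobId);
            ISchedulerTask task = job.OpenTask(args.TaskId);
            Console.WriteLine("Task {0} failed with exit code {1}", args.TaskId, task.ExitCode);
            Console.WriteLine(task.Output);
        }
    }

    private static void HandleJobState(object sender, JobStateEventArg args)
    {
        if (args.NewState == JobState.Failed ||
            args.NewState == JobState.Canceled ||
            args.NewState == JobState.Finished)
        {
            JobDone.Set();
        }
    }
}
```

The callbacks only tell you that something failed and where; whatever text the task wrote before exiting is all the detail the client gets.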
To mark a task or job as "Failed", have your executable exit with a non-zero exit code. If you need details on the actual exception, the best you can do is print it to stdout.
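On the task side, a minimal sketch of that convention could look like this. The names TaskProgram, Run and DoWork are made up for illustration, and the "TASK FAILED" marker line is just one possible convention for making the failure easy to find in the output.

```csharp
using System;
using System.IO;

// Hypothetical task entry point; all names here are illustrative only.
public static class TaskProgram
{
    // Runs the work and turns any exception into a non-zero exit code,
    // writing the full exception text to the supplied writer.
    public static int Run(Action work, TextWriter output)
    {
        try
        {
            work();
            return 0; // zero exit code: task succeeded
        }
        catch (Exception ex)
        {
            // ToString() includes the type, message, stack trace and inner exceptions.
            output.WriteLine("TASK FAILED");
            output.WriteLine(ex.ToString());
            return 1; // non-zero exit code: task should be marked as Failed
        }
    }

    public static int Main(string[] args)
    {
        return Run(() => DoWork(args), Console.Out);
    }

    // Placeholder for the real computation.
    private static void DoWork(string[] args)
    {
        if (args.Length == 0)
        {
            throw new ArgumentException("Expected at least one argument.");
        }
        Console.WriteLine("Processed " + args.Length + " argument(s).");
    }
}
```

Note that the caller still only receives text, not an Exception object; it would have to parse the captured output if it needs the exception type or message.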
If you really need all the exception details, an alternative might be to use the SOA framework for your computations.
Advantages would be:
- your compute requests look like WCF method calls
- you get detailed exceptions back when your code throws
- you can use the SOA debugger extension to Visual Studio to debug your code
Disadvantages would be:
Here are some resources to get you started (a search for "Windows HPC SOA" should get you much more):
MSDN SOA documentation