Steps to error-proof a mission-critical process
I'm writing a program that will continuously process files placed into a hot folder.
This program should have 100% uptime with no admin intervention. In other words, it should not fail on "stupid" errors. For example, if someone deletes the output directory, it should simply recreate it and move on.
My plan is to code the entire program first, then go through it looking for "error points" and add code to handle each one.
What I'm trying to avoid is adding erroneous or unnecessary error handling, or building error handling into the control flow of the program (i.e. the error handling controls the flow of the program). Perhaps it could control the flow to a certain extent, but I think that would be bad design (subjective).
What are some methodologies for "error proofing" a "critical" process?
Comments (4)
If your process must be error-proof and run with no admin intervention, you must handle all possible errors. If you leave any way for the program to stop, it will happen (Murphy's Law), and you will not know.
Even if you handle all possible errors, I think you'll need some logging, and probably a monitor with (mail?) alerts, to be sure your process is always running fine.
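As a rough illustration of "handle the error, log it, keep going", here is a minimal Python sketch of one pass over the hot folder. Everything in it is an assumption made for the example: the names `process_one` and `run_pass`, the idea of a quarantine folder for files that fail, and the "work" being a plain copy. The point is only that a failure on one file is logged and does not stop the loop.

```python
import logging
import shutil
from pathlib import Path

log = logging.getLogger("hotfolder")  # logger name is an assumption


def process_one(src: Path, out_dir: Path) -> Path:
    """Hypothetical unit of work: copy the file into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)  # recreate it if someone deleted it
    dest = out_dir / src.name
    shutil.copy2(src, dest)
    return dest


def run_pass(in_dir: Path, out_dir: Path, failed_dir: Path) -> tuple[int, int]:
    """Process every file currently in in_dir; return (ok, failed) counts."""
    ok = failed = 0
    in_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(p for p in in_dir.iterdir() if p.is_file()):
        try:
            process_one(src, out_dir)
            src.unlink()  # done, remove it from the hot folder
            ok += 1
        except Exception:
            # Log with traceback and carry on with the next file.
            log.exception("failed to process %s", src)
            failed += 1
            try:
                # Move the bad file aside so it is not retried forever.
                failed_dir.mkdir(parents=True, exist_ok=True)
                shutil.move(str(src), str(failed_dir / src.name))
            except OSError:
                log.exception("could not quarantine %s", src)
    return ok, failed
```

The caller would run `run_pass` in a loop with a sleep between passes. For the alerting part, one option is to attach a handler such as `logging.handlers.SMTPHandler` to the logger, but a separate watchdog that notices when the process itself has died is still needed, since a dead process cannot send its own alert.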
The most important thing to do is to document your assumptions in the form of unit tests. You should write a test that violates each assumption, and then prove that your program either recovers successfully or takes action to make the assumption true again.
To use your example: if someone could delete the critical folder, write a test that simulates this and then show that your program handles the case without crashing.
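A test for the deleted-folder case might look something like the sketch below, using Python's `unittest` and a temporary directory. `write_result` is a hypothetical stand-in for the program's real output step; the test deletes the output directory between two writes and checks that the second write still succeeds.

```python
import shutil
import tempfile
import unittest
from pathlib import Path


def write_result(out_dir: Path, name: str, data: str) -> Path:
    """Hypothetical output step: make sure out_dir exists, then write the file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    dest = out_dir / name
    dest.write_text(data)
    return dest


class OutputDirDeletedTest(unittest.TestCase):
    def test_recreates_missing_output_dir(self):
        with tempfile.TemporaryDirectory() as tmp:
            out_dir = Path(tmp) / "out"
            write_result(out_dir, "first.txt", "1")

            # Violate the assumption "the output directory exists".
            shutil.rmtree(out_dir)

            dest = write_result(out_dir, "second.txt", "2")
            self.assertTrue(out_dir.is_dir())
            self.assertEqual(dest.read_text(), "2")
```

Saved as a module, this runs with `python -m unittest`. The same pattern (set up, break one assumption, call the code, assert it recovered) can be repeated for each assumption on your list.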
Unit testing.
One technique for thorough analysis is a HAZOP study, where for each part of the process you consider keywords for that process. For a chemical in a process plant, these might be 'more', 'less', 'missing', 'hotter', 'colder', 'leak', 'pressure' and so on.
When applying HAZOP to software, you would consider keywords appropriate to the objects in your software.
For example, for reading a file you might consider 'more' to be a buffer overrun, 'less' to be missing data, 'missing' to be the file not existing, 'leak' to be running out of file handles, and so on.
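To make the file-reading keywords a bit more concrete, here is a small Python sketch that turns each one into an explicit check. The names `Deviation` and `read_checked` and the `min_bytes`/`max_bytes` limits are made up for the example, and since Python has no raw buffers, 'more' is treated here as "input larger than we agreed to accept" instead of a literal buffer overrun.

```python
from enum import Enum
from pathlib import Path


class Deviation(Enum):
    NONE = "none"
    MISSING = "missing"  # the file does not exist
    MORE = "more"        # more data than the agreed maximum
    LESS = "less"        # less data than the agreed minimum


def read_checked(path: Path, min_bytes: int, max_bytes: int) -> tuple[Deviation, bytes]:
    """Read a file and report which keyword, if any, applies."""
    try:
        # 'leak': the with-block closes the handle even if read() fails.
        with path.open("rb") as fh:
            # 'more': read at most one byte past the limit, never unbounded.
            data = fh.read(max_bytes + 1)
    except FileNotFoundError:
        return Deviation.MISSING, b""
    if len(data) > max_bytes:
        return Deviation.MORE, b""
    if len(data) < min_bytes:
        return Deviation.LESS, data
    return Deviation.NONE, data
```

The value of the exercise is less the code itself than the list: walking each keyword against each object (file, directory, network share, and so on) tends to surface cases that would otherwise only show up in production.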