How do you assess the reliability of software?
We are currently setting up the evaluation criteria for a trade study we will be conducting.
One of the criteria we selected is reliability (and/or robustness - are these the same?).
How do you assess that software is reliable without being able to afford much time evaluating it?
Edit: Along the lines of the response given by KenG, to narrow the focus of the question:
You can choose among 50 existing software solutions. You need to assess how reliable they are, without being able to test them (at least initially). What tangible metrics or other can you use to evaluate said reliability?
Reliability and robustness are two different attributes of a system:
Reliability
Robustness
So a reliable system performs its functions as it was designed to, within constraints; a robust system continues to operate when the unexpected or unanticipated occurs.
If you have access to any history of the software you're evaluating, some idea of reliability can be inferred from reported defects, number of 'patch' releases over time, even churn in the code base.
Does the product have automated test processes? Test coverage can be another indication of confidence.
Some projects using agile methods may not fit these criteria well - frequent releases and a lot of refactoring are expected
Check with current users of the software/product for real world information.
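The history-based signals above (reported defects, patch releases, churn) can be combined into a rough comparison score across candidates. A minimal sketch with entirely made-up numbers and a hypothetical `history_score` weighting - not a standard metric, just one way to make the comparison tangible:

```python
# Hypothetical sketch: rank candidate products by simple history-based
# reliability signals (defects per release, weighted by how often
# emergency patch releases were needed). Lower score = better signal.

def history_score(defects_reported: int, releases: int, patch_releases: int) -> float:
    """Average defects per release, inflated by the fraction of
    releases that were emergency patches."""
    defects_per_release = defects_reported / max(releases, 1)
    patch_ratio = patch_releases / max(releases, 1)
    return defects_per_release * (1 + patch_ratio)

# Made-up histories for two of the 50 candidates:
candidates = {
    "ProductA": history_score(defects_reported=120, releases=30, patch_releases=18),
    "ProductB": history_score(defects_reported=45, releases=30, patch_releases=3),
}

for name, score in sorted(candidates.items(), key=lambda kv: kv[1]):
    print(f"{name}: {score:.2f}")
```

As the answer notes, a score like this penalizes agile projects that release often by design, so it is at best one input among several.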
It depends on what type of software you're evaluating. A website's main (and maybe only) criteria for reliability might be its uptime. NASA will have a whole different definition for reliability of its software. Your definition will probably be somewhere in between.
If you don't have a lot of time to evaluate reliability, it is absolutely critical that you automate your measurement process. You can use continuous integration tools to make sure that you only ever have to manually find a bug once.
I recommend that you or someone in your company read Continuous Integration: Improving Software Quality and Reducing Risk. I think it will help lead you to your own definition of software reliability.
Talk to people already using it. You can test yourself for reliability, but it's difficult, expensive, and can be very unreliable depending on what you're testing, especially if you're short on time. Most companies will be willing to put you in contact with current clients if it will help sell you their software and they will be able to give you a real-world idea of how the software handles.
As with anything, if you don't have the time to assess something yourself, then you have to rely on the judgement of others.
Reliability is one of three aspects of something's effectiveness... The other two are Maintainability and Availability...
An interesting paper... http://www.barringer1.com/pdf/ARMandC.pdf discusses this in more detail, but generally,
Reliability is based on the probability that a system will break: the more likely it is to break, the less reliable it is. In systems other than software it is often measured as Mean Time Between Failures (MTBF), a common metric for things like hard disks (e.g., 10,000 hrs MTBF). In software, I guess you could measure it as the mean time between critical system failures, between application crashes, between unrecoverable errors, or between errors of any kind that impede or adversely affect normal system productivity...
Maintainability is a measure of how long/how expensive (how many man-hours and/or other resources) it takes to fix it when it does break. In software, you could add to this concept how long/how expensive it is to enhance or extend the software (if that is an ongoing requirement)
Availability is a combination of the first two, and indicates to a planner: if I had 100 of these things running for ten years, after figuring in the failures and how long each failed unit was unavailable while it was being fixed or repaired, how many of the 100, on average, would be up and running at any one time? 20%, or 98%?
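A worked sketch of how these three measures relate, using invented incident data; the `MTBF / (MTBF + MTTR)` ratio is the standard steady-state availability approximation:

```python
# Illustrative sketch of the three measures above, using made-up incident
# data: cumulative hours of operation at each failure, and how long each
# repair took.

failure_times = [1000, 2500, 4750, 8000]   # cumulative hours at each failure
repair_hours  = [2.0, 8.0, 4.0, 6.0]       # time to restore after each failure

# Uptime between consecutive failures:
uptime_between = [failure_times[0]] + [
    b - a for a, b in zip(failure_times, failure_times[1:])
]

mtbf = sum(uptime_between) / len(uptime_between)   # Mean Time Between Failures
mttr = sum(repair_hours) / len(repair_hours)       # Mean Time To Repair (maintainability)
availability = mtbf / (mtbf + mttr)                # steady-state availability

print(f"MTBF: {mtbf:.0f} h, MTTR: {mttr:.1f} h, availability: {availability:.3%}")
```

With these numbers the system averages 2000 hours between failures and 5 hours per repair, so roughly 99.75% of units would be up at any one time.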
Well, the keyword 'reliable' can lead to different answers... When thinking of reliability, I think of two aspects: (1) the software does what it is supposed to do, without bugs or imperfections, and (2) it does so consistently, returning the same results every time.
Either way, I think it boils down to some repeatable tests. If the application in question is not built with a strong suite of unit and acceptance tests, you can still come up with a set of manual or automated tests to perform repeatedly.
The fact that the tests always return the same results will show that aspect #2 is taken care of. For aspect #1 it really is up to the test writers: come up with good tests that would expose bugs or imperfections.
I can't be more specific without knowing what the application is about, sorry. For instance, a messaging system would be reliable if messages were always delivered, never lost, never contain errors, etc etc... a calculator's definition of reliability would be much different.
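A minimal sketch of a repeatable consistency check (aspect #2), using the calculator example from above; `evaluate` is a hypothetical stand-in for whatever system is actually under test:

```python
# Minimal sketch of a repeatable consistency check (aspect #2): run the
# same inputs many times and verify the output never varies. `evaluate`
# is a toy stand-in for the system under test.

def evaluate(expression: str) -> float:
    # Stand-in for the real system (here, a trivial calculator).
    return float(eval(expression, {"__builtins__": {}}))

cases = ["2 + 2", "10 / 4", "3 * (1 + 2)"]

for expr in cases:
    results = {evaluate(expr) for _ in range(100)}  # distinct outputs seen
    assert len(results) == 1, f"inconsistent results for {expr!r}: {results}"

print("all cases returned identical results across 100 runs")
```

Catching aspect #1 (correctness) would additionally require asserting the *expected* value for each case, which is where the test writer's domain knowledge comes in.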
My advice is to follow the SRE methodology around SLIs, SLOs and SLAs, best summarized in the free ebooks:
Looking at reliability more from a tooling perspective, you need:
You will have to go into the process by understanding and fully accepting that you will be making a compromise, which could have negative effects if reliability is a key criterion and you don't have (or are unwilling to commit) the resources to appropriately evaluate based on that.
Having said that - determine what the key requirements are that make software reliability critical, then devise tests to evaluate based on those requirements.
Robustness and reliability cross in their relationship to each other, but are not necessarily the same.
If you have a data server that cannot handle more than 10 connections and you expect 100000 connections - it is not robust. It will be unreliable if it dies at > 10 connections. If that same server can handle the number of required connections but intermittently dies, you could say that it is still not robust and not reliable.
My suggestion is that you consult with an experienced QA person who is knowledgeable in the field for the study you will conduct. That person will be able to help you devise tests for key areas - hopefully within your resource constraints. I'd recommend a neutral 3rd party (rather than the software writer or vendor) to help you decide on the key features you'll need to test to make your determination.
If you can't test it, you'll have to rely on the reputation of the developer(s) along with how well they followed the same practices on this application as their other tested apps. Example: Microsoft does not do a very good job with the version 1 of their applications, but 3 & 4 are usually pretty good (Windows ME was version 0.0001).
Depending on the type of service you are evaluating, you might get reliability metrics or SLI - service level indicators - metrics capturing how well the service/product is doing. For example - process 99% of requests under 1sec.
Based on the SLIs you might set up service level agreements - a contract between you and the software provider on what SLOs (service level objectives) you would like, along with the consequences if they fail to deliver them.
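As a sketch of the example SLI above ("process 99% of requests under 1 sec"), with made-up latency samples; real SLI computation would draw on production monitoring data over a defined window:

```python
# Hedged sketch: compute a latency SLI from request samples and check it
# against an SLO. The latencies below are invented for illustration.

latencies_sec = [0.12, 0.30, 0.45, 0.08, 1.40, 0.22, 0.95, 0.60, 0.33, 0.18]

slo_target = 0.99      # SLO: required fraction of "fast" requests
threshold_sec = 1.0    # what counts as "fast"

fast = sum(1 for t in latencies_sec if t < threshold_sec)
sli = fast / len(latencies_sec)   # measured service level indicator

verdict = "meets" if sli >= slo_target else "violates"
print(f"SLI: {sli:.2%} of requests under {threshold_sec}s ({verdict} the {slo_target:.0%} SLO)")
```

Here 9 of 10 requests finish under a second, so the measured SLI of 90% violates the 99% SLO - the kind of gap an SLA would attach consequences to.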