What are acceptable use cases for Python's `assert` statement?
I often use Python's assert statement to check user input and to fail fast if we're in a corrupt state. I'm aware that assert statements are removed when Python is run with the -O (optimize) flag. I personally don't run any of my apps in optimized mode, but it feels like I should stay away from assert just in case.
It feels much cleaner to write
    assert filename.endswith('.jpg')
than
    if not filename.endswith('.jpg'):
        raise RuntimeError
Is this a valid use case for assert? If not, what would a valid use case for Python's assert statement be?
Comments (8)
Assertions should be used for expressing invariants or preconditions.
In your example, you are using them to check for unexpected input, and that is a completely different class of error.
Depending on the requirements, it may be perfectly OK to raise an exception on wrong input and stop the application. However, code should always be written for expressiveness, and raising an AssertionError is not that explicit. It would be much better to raise your own exception, or a ValueError.
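To make the difference concrete, here is a minimal sketch of the explicit-exception route for the .jpg check from the question. The names `UnsupportedImageError`, `require_jpg` and `describe` are made up for illustration; they are not from the question or from any library.

```python
class UnsupportedImageError(ValueError):
    """Hypothetical exception for a file name that is not a .jpg."""


def require_jpg(filename):
    # Input validation must keep working under `python -O`,
    # so it uses an explicit raise instead of assert.
    if not filename.endswith(".jpg"):
        raise UnsupportedImageError(f"expected a .jpg file, got {filename!r}")
    return filename


def describe(filename):
    # The caller can catch the specific exception and report it to the user.
    try:
        require_jpg(filename)
    except UnsupportedImageError as exc:
        return f"rejected: {exc}"
    return "accepted"
```

Because the exception subclasses ValueError, callers that already handle ValueError keep working, while the name says exactly what went wrong.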
If being graceful is impossible, be dramatic.
Here is the correct version of your code:
It is a valid case, IMHO, when:
Otherwise, deal with the eventuality, because you can see it coming.
ASSERT == "This should never occur in reality, and if it does, we give up"
Of course, this isn't the same as
But I always do
That way I get a nice error. In your example, I would use the if version and convert the file to jpg!
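One common way to mark a "this should never occur" branch is sketched below. This is an illustration of the general idea only, not necessarily the snippet the answer had in mind; `traffic_light_action` and its colour values are invented for the example.

```python
def traffic_light_action(colour):
    if colour == "red":
        return "stop"
    if colour == "amber":
        return "slow"
    if colour == "green":
        return "go"
    # Control should never get here if callers stick to the three colours.
    # If it does, give up loudly instead of silently returning None.
    assert False, f"unexpected colour: {colour!r}"
```

Note that under `python -O` the assert is stripped and the function falls through to an implicit `return None`, which is one reason to prefer an explicit `raise` when the value can come from outside the program.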
Totally valid. An assertion is a formal claim about the state of your program.
You should use them for things which can't be proven. However, they're also handy as a check on your logic for things you think you can prove.
Another example.
Yet another. The final assertion is a bit silly, since it should be provable.
Yet another.
There are lots and lots of reasons for making formal assertions.
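As a rough sketch of what a "formal claim about program state" can look like, the function below ends with two postcondition asserts. The helper `split_evenly` is hypothetical and only serves to illustrate the idea.

```python
def split_evenly(items, parts):
    """Split items into `parts` chunks whose sizes differ by at most one."""
    if parts <= 0:
        # Bad caller input is reported with an ordinary exception.
        raise ValueError("parts must be positive")
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    # Postconditions: claims we believe are provable from the loop above.
    # The asserts double-check that reasoning while testing.
    assert sum(len(c) for c in chunks) == len(items), "lost or duplicated items"
    assert max(map(len, chunks)) - min(map(len, chunks)) <= 1, "uneven split"
    return chunks
```

If the arithmetic in the loop is ever changed incorrectly, the asserts fail right where the claim stops being true, instead of somewhere downstream.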
assert is best used for code that should be active during testing, when you are surely going to run without -O.
You may personally never run with -O, but what happens if your code ends up in a larger system and the admin wants to run it with -O?
It's possible that the system will appear to run fine, but there are subtle bugs which have been switched on by running with -O.
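The effect of -O can be seen directly with a small experiment. The sketch below runs the same one-line program in a fresh interpreter, once normally and once with -O; `SNIPPET` and `run_snippet` are names chosen for this illustration, and it assumes the PYTHONOPTIMIZE environment variable is not set.

```python
import subprocess
import sys

# A tiny program whose only safeguard is an assert.
SNIPPET = "assert 1 + 1 == 3, 'arithmetic is broken'; print('reached the end')"


def run_snippet(*flags):
    """Run SNIPPET in a fresh interpreter; return (exit code, stdout)."""
    result = subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


normal = run_snippet()         # assert is active: the program stops with an error
optimized = run_snippet("-O")  # assert is compiled away: the bug goes unnoticed
```

In the normal run the interpreter exits with a non-zero status and prints nothing to stdout; with -O the false assertion is never evaluated and the program carries on to the print.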
Personally, I use assert for unexpected errors, or things you don't expect to happen in real-world usage. Exceptions should be used whenever dealing with input from a user or file, as they can be caught and you can tell the user "Hey, I was expecting a .jpg file!"
The Python Wiki has a great guide on using assertions effectively.
The answers above don't necessarily clarify the -O objection when running Python.
To quote the above page:
S.Lott's answer is the best one, but this was too long to add as a comment on his, so I put it here. Anyway, this is how I think about assert: it is basically a shorthand way of doing #ifdef DEBUG.
There are two schools of thought about input checking. You can do it at the target, or you can do it at the source.
Doing it at the target means checking inside the called code:
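As a rough illustration of checking at the target (a sketch, not the answer's original snippet; `checked_sqrt` is a hypothetical name), the callee validates its own argument on every call:

```python
import math


def checked_sqrt(x):
    # The target checks its own input and rejects invalid values explicitly.
    if x < 0:
        raise ValueError(f"cannot take the square root of {x}")
    return math.sqrt(x)
```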
Now, this has a lot of advantages. The person who wrote sqrt probably knows the most about which values are valid for the algorithm. In the above, I have no idea whether the value I got from the user is valid. Someone has to check it, and it makes sense to do it in the code that knows the most about what's valid: the sqrt algorithm itself.
However, there is a performance penalty. Imagine code like this:
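A sketch of such a hot loop, assuming a hypothetical `checked_sqrt` that validates on every call and an invented caller `sum_of_roots`, might look like this:

```python
import math


def checked_sqrt(x):
    # Hypothetical target-side check, executed on every single call.
    if x < 0:
        raise ValueError(f"cannot take the square root of {x}")
    return math.sqrt(x)


def sum_of_roots(n=100_000):
    total = 0.0
    for i in range(n):
        # i * i can never be negative, so the check inside
        # checked_sqrt is redundant for every one of these calls.
        total += checked_sqrt(i * i)
    return total
```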
Now, this function is going to call sqrt 100k times, and every time sqrt is going to check whether the value is >= 0. But we already know the values are valid because of how we generate them, so those extra validity checks are just wasted execution time. Wouldn't it be nice to get rid of them? And then there's the one call that throws a ValueError; we catch it and realize we made a mistake. I write my program relying on the subfunction to check me, so I only worry about recovering when it doesn't work.
The second school of thought is that, instead of the target function checking its inputs, you add constraints to its definition and require the caller to ensure that it calls with valid data. The function promises that, given good data, it will return what its contract says. This avoids all those checks, because the caller knows a lot more than the target function about the data it's sending: where it came from and what constraints it inherently has. The end result of this approach is code contracts and similar structures, but originally it was done just by convention, i.e. in the design, like the comment below:
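As a rough illustration of such a convention (a sketch, not the answer's original snippet; `contract_sqrt` is a hypothetical name), the constraint lives only in the documentation and the function does no checking of its own:

```python
import math


def contract_sqrt(x):
    """Return the square root of x.

    Contract: the caller guarantees x >= 0. The function itself does
    not verify this; a violation surfaces as whatever math.sqrt raises.
    """
    return math.sqrt(x)
```

With the standard library's `math.sqrt`, breaking the contract with a negative float still raises a ValueError from inside the call, so the failure remains visible in the traceback.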
Of course, bugs happen, and the contract might be violated. But so far the result is about the same: in both cases, if I call sqrt(-1.0), I will get an exception inside the sqrt code itself, can walk up the exception stack, and figure out where my bug is.
However, there are much more insidious cases. Suppose, for instance, that my code generates a list index, stores it, later looks up the list index, extracts a value, and does some processing, and let's say we get a -1 list index by accident. All those steps may actually complete without any errors, but we have wrong data at the end of the test, and we have no idea why.
So why assert? It would be nice to have something that gives us closer-to-the-failure debug information while we are testing and proving our contracts. This is pretty much exactly the same as the first form (after all, it does the exact same comparison), but it is syntactically neater and slightly more specialized towards verifying a contract. A side benefit is that once you are reasonably sure your program works, and you are optimizing and trading debuggability for performance, all these now-redundant checks can be removed.
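The list-index case can be sketched as follows. The helper `value_at` is hypothetical; the point is only that a stray -1 is a perfectly legal Python index, so nothing fails unless the contract is asserted.

```python
def value_at(values, index):
    # Contract: callers pass a non-negative, in-range index.
    # Without this line, index -1 would silently return the last element.
    assert 0 <= index < len(values), f"index {index} violates the contract"
    return values[index]
```

During testing the assert fails at the point where the bad index is used, with the offending value in the message. Once the callers are trusted, running with -O removes the check along with its cost.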
Bottom line: assert and its semantics are a heritage from earlier languages.
The use case that ASSERT statements were originally created for is checks such as whether free() is passed a pointer that was previously obtained from malloc(), internal structure integrity checks after non-trivial operations, and so on. This was a big deal in older days and still is in environments designed for performance. That's the entire reason for its semantics in, say, C++/C#:
Python, however, consciously and intentionally sacrifices code performance for programmer performance (believe it or not, I recently got a 100x speedup by porting some code from Python to Cython, without even disabling boundary checks!). Python code runs in a "safe" environment in that you cannot utterly "break" your process (or an entire system) into an untraceable segfault/BSoD/bricking. The worst you'll get is an unhandled exception with a load of debugging information attached, all gracefully presented to you in a readable form.
In addition, Python's strong emphasis on providing the sources at all times (transparent compilation, source lines in tracebacks, the stock debugger expecting them alongside the .pyc files to be of any use) very much blurs the line between "development" and "use". That's one reason why setuptools' self-contained .eggs created a backlash, and why pip always installs them unpacked: if one is installed packed, the source is no longer readily available and problems with it are no longer readily diagnosable.
Combined, these properties all but destroy any use cases for "debug-only" code.
And, you guessed it, an idea about repurposing assert as a general-purpose check eventually surfaced.