Meaningful diagnostic messages
Looking at several posts, I get the feeling that many of the questions arise because compilers/implementations often (though not always) fail to emit a meaningful message. This is especially true of templates, where error messages can be daunting, to say the least. A case in point could be the discussion topic
Therefore, I would like to understand a few things:
a) Why is it that compilers are sometimes unable to give more meaningful/helpful error messages? Is the reason purely practical or technical, or is there something else? (I don't have a compiler background)
b) Why can't they give a reference to the most relevant C++ Standard verse/section, so that the developer community can learn C++ better?
EDIT:
Refer to the thread here for another example.
Comments (6)
The fundamental problem is that compiler diagnostics deal with things you haven't written.
In order to give you a meaningful error message, the compiler has to guess what you meant, and then tell you how your code differs from that.
If you're missing a semicolon, the compiler obviously can't see that semicolon anywhere. Of course, one of the things it can do is to guess "maybe the user is missing a semicolon. That's a common mistake, after all". But where should that semicolon have been? Because you made an error, the code can't be parsed into a syntax tree, so there's no clear indicator that "this node is missing from the tree". And there might be more than one place where a semicolon could be inserted so that the surrounding code would parse correctly. And moreover, how much code are you going to try to parse/recompile once you've found what might be the error? The compiler could insert the semicolon, but then at the very least it has to restart parsing of that block of code. But maybe it introduced errors further down in the code. So maybe the entire program should be recompiled, just to make sure the fix the compiler came up with was actually the right one. But that's hardly an option either. It takes too long.
Say you have some code like this: a class definition for foo with no semicolon after its closing brace, followed directly by a declaration that begins with void.
What is the error here? Looking at it, you and I would say "you're missing the semicolon after the class definition". But how can the compiler tell? The void could be a typo. Perhaps you actually intended to write the name of an instance of type foo. Then the real error would be that it is followed by what now looks like a function call. So the compiler has to guess. "This looks like it could have been a class definition, and what comes after it looks like the name of a type. If that is true, the user is missing a semicolon to separate them".
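The two readings described above can be sketched roughly as follows. This is only an illustration: the names bar, foo2, instance and instance_value are made up here, and the ill-formed input is kept in a comment so that the file still compiles.

```cpp
// The ill-formed input, kept in a comment so this file still compiles:
//
//     class foo { }        // no semicolon after the closing brace
//     void bar() { }
//
// At least two different small repairs turn it into valid C++.

// Repair 1: insert the missing semicolon. bar is then an ordinary function.
class foo { };
void bar() { }

// Repair 2: assume "void" was a mistyped variable name and that the
// parameter list and body do not belong there. A class definition may
// legally be followed by the name of an instance of that class.
// (foo2 is a second, hypothetical class so both repairs fit in one file.)
class foo2 { public: int value = 42; } instance;

// Hypothetical helper so the second reading can be checked from a test.
int instance_value() { return instance.value; }
```

Both repairs compile, which is why the compiler can't be certain which one you meant when it sees the broken version.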
And guessing isn't a very precise science. And matters are further complicated because every time the compiler tries to be clever and makes a guess, it's only going to add confusion if the guess is wrong.
So sometimes, it might be better to output a short, terse message saying only what we're sure of (say, that a class definition cannot be followed by a type name). That's not as helpful as saying "you're missing a semicolon after the class definition", but it's less harmful if the compiler guesses wrong.
If it tells you you're missing a semicolon, and the error was actually something else, it's just misleading you. So maybe a terse and less helpful error message is better in the worst case, even if it isn't as nice in the best case.
Writing good compiler errors isn't easy, especially not in a messy language like C++.
That said, some compilers (including MSVC and GCC) could be a lot better. I believe that better compiler diagnostics are one of the primary goals of Clang.
I'll try to explain some rationale behind diagnostics (as the standard calls them):
Compilers are bound to obey the standard. The standard defines more or less everything that the compiler needs to diagnose (e.g. syntax errors), because these are invariants. Beyond that, it defines stuff that the vendor needs to document (called implementation-defined, as the vendor has some leeway as to how to document it), stuff it calls unspecified (the vendor can get away without documenting it), and then undefined behavior (if the standard can't define it, what error message can the compiler possibly spit out?).
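To make the three categories a bit more concrete, here is a small sketch. The helper names (note, add, argument_order) are invented for the illustration, and the undefined case is only described in a comment rather than executed.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Implementation-defined: the vendor must document the choice.
// The number of bits in a byte and the size of int are two such values.
constexpr int bits_per_byte = CHAR_BIT;
constexpr std::size_t int_size = sizeof(int);

// Unspecified: the order in which function arguments are evaluated.
// The vendor may pick either order and does not have to document it.
static std::string trace;
int note(char c) { trace += c; return 0; }
int add(int a, int b) { return a + b; }

// Returns "ab" or "ba", depending on which argument was evaluated first.
std::string argument_order() {
    trace.clear();
    add(note('a'), note('b'));
    return trace;
}

// Undefined: signed integer overflow, e.g. INT_MAX + 1. The standard places
// no requirements on the result and no diagnostic is required, so the
// expression is only named here and never evaluated.
```

None of these three cases obliges the compiler to print anything, which is part of why there is often no obvious section of the standard to point you at.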
- Not everyone has a copy of the standard.
- Instead, what the compiler tries to do is group errors by category and then settle on a human-understandable error message that is generic enough to handle all sorts of errors in that category while still being meaningful.
- Also, not all compilers are standards compliant. Sad, but true.
- Some compilers implement more than one standard. Do you really expect them to quote C&V of 3 standards texts for a simple "missing ;" error?
- Finally, the standard is terse and less human readable than the committee would like to think (okay, this is a tongue-in-cheek remark, but it reflects the state of affairs pretty accurately!)
And read the quote at the top once more ;)
PS: As far as template error messages are concerned, I have to offer the following:
Some compilers are better than others. I've heard that the compiler from Comeau gives significantly nicer errors. You can try it out at http://www.comeaucomputing.com/tryitout/
Compiler authors aren't chosen for their English abilities, and don't choose their work for the writing opportunities.
That said, I think error messages have consistently improved over the last decade. With GCC, the problem is usually sifting through too much information. The discussion you linked was about a "no matching function" message. That's a common error which is usually followed by a torrent of candidate functions.
Being referred to the standard's rules on overload resolution would possibly even be counterproductive in this case. To resolve the issue, I'll find the candidate I want and compare it to the call site. 99% of the time, I want a simple no-frills match, and 99% of the sophisticated resolution machinery won't apply. Having to review the resolution rules in the standard often indicates you're getting into deep doo-doo.
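As a rough illustration of that workflow, here is a made-up overload set (area and fixed_call are hypothetical names). The failing call is left in a comment so the file compiles.

```cpp
#include <string>

// A small overload set, standing in for the "torrent of candidates".
int area(int width, int height) { return width * height; }
int area(int side) { return side * side; }
int area(const std::string& label) { return static_cast<int>(label.size()); }

// A call such as
//
//     area(3, "4");
//
// matches none of the three overloads: the two-argument one cannot turn a
// string literal into an int, and the other two take a single argument.
// GCC and Clang report this as a "no matching function" error followed by
// notes for the candidates.
//
// The fix is found by picking the intended candidate, area(int, int),
// and comparing it with the call site: the second argument has the wrong type.
int fixed_call() { return area(3, 4); }
```

No overload resolution rules had to be consulted here. Comparing one candidate's parameter list against the arguments was enough.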
I think only a minority of programmers are really inclined or fully able to navigate and interpret the ISO standard, anyway.
On the bright side, there are always avenues to contact the authors of any actively-maintained compiler. If you have any kind of suggestion for improved wording, send it in!
IMHO, often times what matters is not the text of the message, but the ability to relate it to the source. The C++ compiler in VS2005 seems to show error messages indicating the file where the error occurred, but not the file it was included from. That can be a real pain when e.g. a mistake in one header file causes compilation errors in the next one. It can also be difficult to ascertain what's going on with preprocessor macros.
A factor not mentioned in the other answers I've read: C++ compilers have a very complicated job as is, and don't further complicate it by classifying the code they're compiling into "expected" stuff and "unexpected". For example, we as programmers understand that std::string is a particular instantiation of std::basic_string with a specific character type, traits, allocator - whatever. So, when there's an error we just want to know it involves a string and not see all that other stuff. But, say we're asked to debug an error message a client encountered when using our library. We may need to see exactly how a template has been instantiated in order to see where the problem is, and simply seeing some typedef that's inside their code - that we may not even have access to - would make the error messages useless. So, programmers at different levels in the software stack want to see different things, and most compilers don't want to buy into guessing about this or allowing customisations. They just spit everything out and trust that the programmer will quickly learn to focus in on the stuff at the level they need. Most of the time, programmers quickly learn to do that, but sometimes it's harder than others.
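For reference, the "other stuff" hidden behind std::string can be spelled out like this. The alias expanded_string and the helper same_type are names invented for this sketch.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// The fully spelled-out type that diagnostics often print in place of
// the short name std::string.
using expanded_string =
    std::basic_string<char, std::char_traits<char>, std::allocator<char>>;

// Checked at compile time: both names refer to the same type.
static_assert(std::is_same<std::string, expanded_string>::value,
              "std::string names the expanded basic_string instantiation");

// Hypothetical helper so the same fact can be checked at run time.
bool same_type() {
    return std::is_same<std::string, expanded_string>::value;
}
```

A compiler that prints the long form is being precise about the instantiation, which is exactly what the library author debugging a client's error wants, and exactly what the client doesn't.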
Another factor is that sometimes there may be many small variations on the erroneous code that would all be valid, so it's impractical for the compiler to know what the programmer intended and display a message about that delta. Programmers however are often unaware of the other ways the code might almost have made sense, and just think the compiler is dumb for not seeing it from their perspective.
Cheers,
Tony