What syntax problems do angle brackets cause in C++ templates?

Posted on 2024-12-02 23:29:00

In C++, templates are instantiated with angle brackets, as in vector<int>, and the Java and C# languages have adopted the same syntax for their generics.

The creators of D, however, have been quite vocal about the problems that angle brackets bring, and they introduced a new syntax, foo!(int). But I've never seen many details about exactly what problems angle brackets bring.

One of them arose when instantiating a template with another template, as in vector<vector<int>>, which would cause some (older?) compilers to confuse the trailing >> with the bit-shift or streaming operators. The solution was to insert a space between the two angle brackets, but haven't compilers become able to parse that syntax nowadays?

Another problem arose when using the greater-than operator inside a template argument, as in foo<3 > 2>. The parser would think that the operator actually closes the template instantiation; the fix was to introduce parentheses: foo<(3 > 2)>. But I don't think there are that many cases where you need to do this and, at any rate, I'd rather have to type the extra parentheses when they are needed, instead of introducing new syntax and always having to type the exclamation mark.
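The two cases described above can be sketched in a few lines. This is only an illustration: the foo template and the helper function names are hypothetical, and the behavior of >> assumes a compiler in C++11 mode or later.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical template, used only to illustrate a non-type argument.
template <bool B>
struct foo {
    static constexpr bool value = B;
};

// Before C++11 this had to be spelled std::vector<std::vector<int> >.
// Since C++11, ">>" may close two template argument lists at once.
inline std::size_t nested_rows() {
    std::vector<std::vector<int>> grid(2, std::vector<int>(3, 0));
    return grid.size();
}

// foo<3 > 2> would end the argument list at the first '>';
// the parentheses keep the comparison inside the argument.
inline bool greater_than_argument() {
    return foo<(3 > 2)>::value;
}
```

With these definitions, nested_rows() returns 2 and greater_than_argument() returns true.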

What other problems are there with angle brackets that made the D developers create a new syntax?

Comments (8)

冰魂雪魄 2024-12-09 23:29:00

Personally, the most hideous problem I have seen is the invocation of template functions in dependent context:

template <typename T>
void foo(T t) {
  t.bar<3>();
}

This looks admittedly simple, but is in fact incorrect. The C++ Standard requires the template keyword here to disambiguate the comparison t.bar < 3 from a call to a member function template, yielding:

t.template bar<3>(); // iirk
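A minimal compilable version of this, as a sketch: Widget and call_bar are hypothetical names introduced here, and bar simply returns its template argument so the call can be checked.

```cpp
#include <cassert>

// Hypothetical type with a member function template.
struct Widget {
    template <int N>
    int bar() const { return N; }
};

template <typename T>
int call_bar(const T& t) {
    // The type of 't' depends on T, so the parser cannot know that 'bar'
    // names a template. The 'template' keyword says that '<' opens a
    // template argument list rather than a less-than comparison.
    return t.template bar<3>();
}
```

Calling call_bar(Widget{}) then returns 3.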

litb made some very interesting posts regarding the possible interpretations a compiler could come up with.

Regarding the >> issue, it's fixed in C++0x (published as C++11), but requires more clever compilers.

空气里的味道 2024-12-09 23:29:00

"but haven't compilers become able to parse that syntax, nowadays?"

Of course. But it's far from trivial. In particular, it prevents you from implementing a clean separation between a context-unaware lexer and the parser. This is particularly irksome for syntax highlighters and other support tools that need to parse C++ but don't want to, or can't, implement a fully-fledged syntactic analyser.

It makes C++ so much harder to parse that a lot of tools simply won’t bother. This is a net loss for the ecosystem. Put differently: it makes developing a parsing tool much more expensive.

For instance, ctags fails for some template definitions, which makes it unusable with our current C++ project. Very annoying.

"But I don't think there are that many cases where you need to [distinguish between angle brackets and less-than]"

It doesn’t matter how often you need to do this. Your parser still needs to handle this.

D's decision to drop angle brackets was a no-brainer. Any one reason would have sufficed, given that it's a net benefit.

冰雪梦之恋 2024-12-09 23:29:00

The issue is making the language grammar context-free. When a program is tokenized by the lexer, it uses a technique called maximal munch, which means that it always takes the longest possible string that could designate a token. That means that >> is treated as the right bit-shift operator. So, if you have something like vector<pair<int, int>>, the >> at the end is treated as the right bit-shift operator instead of as part of a template instantiation. For the lexer to treat >> differently in this context, it must be context-sensitive instead of context-free; that is, it has to actually care about the context of the tokens being parsed. This complicates the lexer and parser considerably. The more complicated the lexer and parser are, the higher the risk of bugs and, more importantly, the harder it is for tools to implement them, which means fewer tools. When something like syntax highlighting in an IDE or code editor becomes complicated to implement, it's a problem.
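Maximal munch can be sketched with a toy lexer. This is not how any real C++ compiler is written; tokenize is a hypothetical function that knows only identifiers, the two-character operator >>, and single-character punctuation, which is enough to show why a context-unaware lexer hands the parser one >> token.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy lexer: identifiers, the operator ">>", and single-character tokens.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace only separates tokens
        } else if (std::isalnum(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else if (src.compare(i, 2, ">>") == 0) {
            // Maximal munch: prefer the longest token, so ">>" wins over ">".
            tokens.push_back(">>");
            i += 2;
        } else {
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}
```

For the input vector<pair<int, int>> the last token is the single operator >>, whereas writing > > with a space yields two separate > tokens. A C++11 parser has to split that >> token itself when it is inside a template argument list.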

By using !() - which would result in vector!(pair!(int, int)) for the same declaration - D avoids the context sensitivity issue. D has made a number of such choices in its grammar explicitly with the idea of making it easier for tools to implement lexing or parsing when they need to in order to do what they do. And since there's really no downside to using !() for templates other than the fact that it's a bit alien to programmers who have used templates or generics in other languages which use <>, it's a sound language design choice.

And how often you do or don't use templates which would create ambiguities when using the angle bracket syntax - e.g. vector<pair<int, int>> - isn't really relevant to the language. The tools must implement it regardless. The decision to use !() rather than <> is entirely a matter of simplifying the language for tools, not for the programmer. And while you may or may not particularly like the !() syntax, it's quite easy to use, so it ultimately doesn't cause programmers any problems beyond learning it and the fact that it may go against their personal preference.

柒七 2024-12-09 23:29:00

In C++ another problem is that the preprocessor doesn't understand angle brackets, so this fails:

#define FOO(X) typename something<X>::type

FOO(std::map<int, int>)

The problem is that the preprocessor thinks FOO is being called with two arguments: std::map<int and int>. This is an example of the wider problem, that it's often ambiguous whether the symbol is an operator or a bracket.
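One common workaround, which is not part of the comment above, is to make the macro variadic so that the comma-separated pieces are pasted back together. In this sketch, something is a hypothetical stand-in for the template in the example, and Holder exists only to give the macro a dependent context.

```cpp
#include <cassert>
#include <map>
#include <type_traits>

// Hypothetical stand-in for the 'something' template in the example.
template <typename T>
struct something {
    using type = T;
};

// Variadic version: the preprocessor still splits at the comma,
// but __VA_ARGS__ re-inserts it between the pieces.
#define FOO(...) typename something<__VA_ARGS__>::type

template <typename K>
struct Holder {
    using map_type = FOO(std::map<K, int>);
};
```

Holder<int>::map_type is then std::map<int, int>. Note that checking this with the assert macro needs an extra pair of parentheses around the expression, because assert is a macro and hits the same comma problem.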

梦幻之岛 2024-12-09 23:29:00

Have fun figuring out what this does:

bool b = A< B>::C == D<E >::F();
bool b = A<B>::C == D<E>::F();

Last time I checked, you could make it parse either way by changing what's in scope.
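As a sketch of how scope changes the parse, the same statement can be placed in two namespaces. All declarations here are assumptions chosen to make both readings compile: in as_templates the names A and D are templates, while in as_values they are plain integers and ::C and ::F() refer to the globals.

```cpp
#include <cassert>

// Globals reached through "::C" and "::F()" in the second reading.
int C = 0;
int F() { return 0; }

namespace as_templates {
struct B {};
struct E {};
template <typename T> struct A { static constexpr int C = 1; };
template <typename T> struct D { static int F() { return 1; } };

// Parsed as: (A<B>::C) == (D<E>::F())
inline bool eval() {
    bool b = A< B>::C == D<E >::F();
    return b;
}
}  // namespace as_templates

namespace as_values {
int A = 1, B = 2, D = 5, E = 4;

// Parsed as: ((A < B) > ::C) == ((D < E) > ::F())
inline bool eval() {
    bool b = A< B>::C == D<E >::F();
    return b;
}
}  // namespace as_values
```

The statement text is identical in both functions, yet the first compares two members of template specializations and the second chains relational operators. A compiler may warn about the chained comparisons in the second reading.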

Using < and > as both matching and non matching tokens is a disaster. As to the !() making the D usage longer: for the common case of having a single argument, the () are optional, e.g. this is legal:

Set!int foo;

獨角戲 2024-12-09 23:29:00

I believe those were the only cases.

However, it's not so much a user problem as it is an implementer problem. This seemingly trivial difference makes it much harder to build a correct parser for C++ (as compared to D). D was also designed to be implementer-friendly, and as such they tried their best to avoid making ambiguous code possible.

(Side note: I do find the shift-exclamation point combination to be somewhat awkward... one advantage of angle brackets is definitely ease of typing!)

段念尘 2024-12-09 23:29:00

The >= (greater-than-or-equal) ambiguity is another case that wasn't mentioned:

Fails:

template <int>
using A = int;
void f(A<0>=0);

Works:

void f(A<0> =0);

I think this, unlike >>, was not changed in C++11.
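A compilable sketch of the working form follows. The default value 7 and the separate definition of f are additions made here so the result can be observed; only the spacing between > and = matters.

```cpp
#include <cassert>

template <int>
using A = int;

// Written without the space, "A<0>=7" would lex as A < 0 >= 7,
// because maximal munch produces the single token ">=".
int f(A<0> = 7);

// Definition; the default argument comes from the declaration above.
int f(A<0> x) { return x; }
```

Calling f() returns the default 7, and f(2) returns 2.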

See this question for more details: Why does the template-id in "A<0>=0" not compile without space because of the greater-or-equal-than operator ">="?

み格子的夏天 2024-12-09 23:29:00

Ultimately, what any compiler has to do is translate your semi-English source code, in whatever language, into the real machine code a computer can actually operate on. This is ultimately a series of incredibly complex mathematical transforms.

Well, mathematics tells us that the mapping we need for compilation is "onto", or "surjective". All that means is that every legal program can be mapped unambiguously to assembly. This is what language keywords and punctuation like ";" exist for, and why every language has them. However, languages like C++ use the same symbols, like "{}" and "<>", for multiple things, so the compiler has to add extra steps to produce the overall onto transform it needs (this is what you're doing in linear algebra when you multiply matrices). That adds to compile times, introduces significant complexity that itself can harbor bugs, and can limit the compiler's ability to optimize the output.

For example, Stroustrup could've used '@' for template arguments; it was an unused character that would've been perfect for letting compilers know that "this is, and only ever will be, some kind of template". That is actually a 1-to-1 transform, which is perfect for analytic tools. But he didn't; he used symbols that already mapped to greater-than and less-than. That alone immediately introduces ambiguity, and it only gets worse from there.

It sounds like "D" decided to make the sequence '!()' a special symbol, used only for templates, like my '@' example above. I'm willing to guess that its highly templated code compiles faster and with fewer bugs as a result.
