Are digraphs and trigraphs still in use today?

Posted 2024-12-05 10:13:55

Comments (4)

入画浅相思 2024-12-12 10:13:55

I don't know for sure, but you're most likely to find digraphs and trigraphs being used in IBM mainframe environments. The EBCDIC character set doesn't include some characters that are required for C.

The other justification for digraphs and trigraphs, 7-bit ASCII-ish character sets that replace some punctuation characters with accented letters, is probably less relevant today.

Outside such environments, I suspect that trigraphs are more commonly used by mistake than deliberately, as in:

puts("What happened??!");

For reference, trigraphs were introduced in the 1989 ANSI C standard (which essentially became the 1990 ISO C standard). They are:

??= #     ??) ]     ??! |
??( [     ??' ^     ??> }
??/ \     ??< {     ??- ~

The replacements occur anywhere in source code, including comments and string literals.
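To make the replacement rule concrete, here is a minimal sketch of trigraph translation applied to raw source text, using the table above. The helper names `trigraph_replacement` and `replace_trigraphs` are hypothetical and only illustrate the rule; this is not how any particular compiler implements it. Question marks in the sketch's own literals and tests are written as `?\?` so that the code means the same thing whether or not the compiler has trigraphs enabled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: given the third character of a "??x" sequence,
// return the character it stands for, or '\0' if it is not a trigraph.
char trigraph_replacement(char third) {
    switch (third) {
        case '=':  return '#';
        case '(':  return '[';
        case '/':  return '\\';
        case ')':  return ']';
        case '\'': return '^';
        case '<':  return '{';
        case '!':  return '|';
        case '>':  return '}';
        case '-':  return '~';
        default:   return '\0';
    }
}

// Hypothetical helper: scan left to right and replace every trigraph.
// The scan knows nothing about comments or string literals, which is
// why replacement also happens inside them.
std::string replace_trigraphs(const std::string& source) {
    std::string out;
    out.reserve(source.size());
    std::size_t i = 0;
    while (i < source.size()) {
        if (i + 2 < source.size() && source[i] == '?' && source[i + 1] == '?') {
            const char replacement = trigraph_replacement(source[i + 2]);
            if (replacement != '\0') {
                out.push_back(replacement);
                i += 3;  // consume all three characters of the trigraph
                continue;
            }
        }
        out.push_back(source[i]);
        ++i;
    }
    return out;
}
```

Feeding the text of the `puts` call above through `replace_trigraphs` turns the end of the string literal into `What happened|`, which is the accidental replacement described earlier.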

Digraphs are alternate spellings of certain tokens, and do not affect comments or literals:

<: [      :>   ]
<% {      %>   }
%: #      %:%: ##

Digraphs were introduced by the 1995 amendment to the 1990 ISO C standard.
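As an illustration, the sketch below writes two small functions with digraphs in place of `#`, `[`, `]`, `{` and `}`. The function names `sum_first` and `literal_digraph` are made up for this example. It assumes a compiler that accepts digraphs in its default mode, which current GCC and Clang do.

```cpp
%:include <cstddef>
%:include <string>

// Sums the first n elements of values. The digraphs are ordinary
// tokens here, spelled differently.
int sum_first(const int* values, std::size_t n) <%
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) <%
        total += values<:i:>;
    %>
    return total;
%>

// Inside a string literal a digraph is left alone: this returns the
// two characters '<' and ':', not "[".
std::string literal_digraph() <%
    return "<:";
%>
```

Because digraphs are recognized as tokens rather than replaced as text, the literal in `literal_digraph` keeps its two characters, unlike a trigraph in the same position.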

紫轩蝶泪 2024-12-12 10:13:55

There is a proposal pending for C++1z (the next standard after C++1y, which will hopefully be standardized as C++14) that aims to remove trigraphs from the Standard. The authors did a case study on an otherwise undisclosed large codebase:

Case study

The uses of trigraph-like constructs in one large codebase were examined. We discovered:

923 instances of an escaped ? in a string literal to avoid trigraph replacement: string pattern() const { return "foo-????\?-of-?????"; }

4 instances of trigraphs being used deliberately in test code: two in the test suite for a compiler, the other two in a test suite for boost's preprocessor library.

0 instances of trigraphs being deliberately used in production code.

Trigraphs continue to pose a burden on users of C++.
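The escaping idiom counted in the case study can be sketched as follows. The function `pattern` mirrors the quoted example without the `const` member qualifier. `pattern_split` is a hypothetical alternative that relies on adjacent string literals being concatenated only after trigraph replacement has already happened.

```cpp
#include <string>

// Escaped form from the case study: the backslash breaks up the run of
// question marks, so no two of them are directly followed by '-'.
std::string pattern() { return "foo-????\?-of-?????"; }

// Hypothetical alternative: end the first literal before the '-'.
// Adjacent literals are joined later, so no trigraph is ever formed.
std::string pattern_split() { return "foo-?????" "-of-?????"; }
```

Both functions return the same 18-character string. Without the escape or the split, a compiler with trigraphs enabled would turn the two question marks before the first `-` into `~`.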

The proposal notes (bold emphasis from the original proposal):

If trigraphs are removed from the language entirely, an
implementation that wishes to support them can continue to do so: its
implementation-defined mapping from physical source file characters to
the basic source character set can include trigraph translation (and
can even avoid doing so within raw string literals). We do not need
trigraphs in the standard for backwards compatibility.

深陷 2024-12-12 10:13:55

Trigraphs and digraphs are not written today; they exist only in very old code that was created in a very limited environment. If you attempt to compile code that contains trigraphs on a modern compiler such as Visual Studio's, it will usually not compile unless you specify a compiler option. For Visual Studio, that option is "/Zc:trigraphs".

They still exist because the C++ committee avoids issuing changes that would 'break' legacy code, for better or for worse. There is an anecdote that their removal was proposed and supported, but was stopped by a lone IBM representative.

往日情怀 2024-12-12 10:13:55

I know this is an old question, but there is arguably a legitimate use these days: touch screens without an actual keyboard. For example, the typical US keyboard layout isn't necessarily available in full if you do any coding on a tablet or a similar device, which admittedly is hopefully rare given how cumbersome it can be (three taps on mine for an assignment operator). I personally don't use them if I can avoid it, but they are useful in the absence of the actual tokens they're meant to represent.

Again, I really hope people avoid this where possible, but it is one reason to know and use them.
