What is the magic behind the escape (\) character?

Posted on 2024-07-09 08:27:23

How does a C/C++ compiler handle the escape character (\) in source code? How is the compiler's grammar written to process that character? What does the compiler do after encountering it?

Comments (4)

走野 2024-07-16 08:27:23

Most compilers are divided into stages. The first stage of the front end is the lexical analyzer, or scanner, which reads the raw characters and produces tokens. The scanner is a state machine: when it sees a backslash, its current state decides whether the backslash stands for itself or modifies the character that follows (as in \t or \n inside a string or character literal). In the second case the backslash and the following character are consumed together, and the value they denote (such as a tab or a newline) becomes part of the token passed to the next stage of the compiler, the parser. This is how the state machine groups several source characters into one token.
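
The two-state behaviour described above can be sketched in C. This is a minimal illustration, not how any particular compiler is written: the function name `decode_literal`, its return convention, and the small set of supported escapes are all assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper: decode the simple escape sequences found in the body
 * of a string literal (the text between the quotes) into out.
 * Returns the number of characters written (excluding the terminating '\0'),
 * or -1 on an unknown escape, a dangling backslash, or a too-small buffer.
 * Only \n, \t, \\ and \" are handled in this sketch. */
long decode_literal(const char *src, char *out, size_t cap)
{
    enum { NORMAL, AFTER_BACKSLASH } state = NORMAL;
    size_t n = 0;

    if (cap == 0)
        return -1;

    for (; *src != '\0'; ++src) {
        char c = *src;

        if (state == NORMAL) {
            if (c == '\\') {
                /* Backslash: change state and emit nothing yet. */
                state = AFTER_BACKSLASH;
                continue;
            }
        } else {
            /* The previous character was a backslash: translate the pair. */
            switch (c) {
            case 'n':  c = '\n'; break;
            case 't':  c = '\t'; break;
            case '\\': c = '\\'; break;
            case '"':  c = '"';  break;
            default:   return -1;   /* escape not known to this sketch */
            }
            state = NORMAL;
        }

        if (n + 1 >= cap)           /* keep room for the terminating '\0' */
            return -1;
        out[n++] = c;
    }

    if (state == AFTER_BACKSLASH)   /* input ended right after a backslash */
        return -1;

    out[n] = '\0';
    return (long)n;
}
```

Given the four source characters a, \, t, b, the function writes three characters: a, a tab, and b. The parser never sees the backslash at all.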

暗喜 2024-07-16 08:27:23

An interesting note on this subject is Ken Thompson's "Reflections on Trusting Trust" [PDF link].

The paper describes one way a compiler can handle exactly this problem. It shows that a C compiler written in C needs no explicit table translating escape codes into ASCII values: its source can simply say that the escape \n yields '\n', and the numeric value is inherited from the compiler binary that compiles that source. It also shows how to bootstrap a new escape code (\v in the paper) into the compiler so that the ASCII value for the new code becomes implicit as well.
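
The idea can be sketched roughly as follows. This is an illustration in the spirit of the paper, not its actual listing; the function name `escape_value` and the -1 error value are assumptions.

```c
/* Illustrative only: map the character that follows a backslash to the
 * value of the escape sequence. No numeric character codes appear here:
 * the '\n' on the right-hand side gets its value from whichever compiler
 * compiles this function. */
int escape_value(int c)
{
    switch (c) {
    case 'n':  return '\n';
    case 't':  return '\t';
    case '\\': return '\\';
    /* Bootstrapping a new escape such as \v: a compiler that does not yet
     * know \v cannot compile the line below, so the numeric code has to be
     * written out once, for example
     *     case 'v': return 11;
     * Once a compiler built from that source is installed, the line can be
     * rewritten in the self-referential form and the number disappears
     * from the source again. */
    case 'v':  return '\v';
    default:   return -1;   /* not an escape known to this sketch */
    }
}
```

After the second step, nothing in the source states what value \v has, yet every compiler built from it still knows.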

奢华的一滴泪 2024-07-16 08:27:23

The backslash generally escapes the character that follows it:

  • In a string literal or character literal, it starts an escape sequence. For example, \a means 'alert' (flashing the terminal, beeping, or whatever the device does), \n means 'newline', and \xNUM denotes the character whose code is the hexadecimal number NUM.
  • If it is the last character on a line, immediately before the newline, it acts as a line continuation, whether or not it is inside a string (and even inside a // line comment!): the backslash and the newline are both removed, and the next line is merged with the current one. This splicing happens before the source is split into tokens.
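
The line-continuation rule can be seen in a small C sketch. The names `spliced` and `counter` are made up for the example, and compilers typically warn about the multi-line comment in `counter`.

```c
/* The backslash-newline pair below is removed before tokenization, so the
 * two physical lines form one string literal. The second line starts in
 * column 0 on purpose: any leading spaces would become part of the string. */
const char *spliced = "abc\
def";

int counter(void)
{
    int n = 0;
    // the trailing backslash extends this comment onto the next line \
    n = 100;
    return n;   /* the assignment above is part of the comment */
}
```

Here `spliced` holds the six characters abcdef, and `counter` returns 0 because the line `n = 100;` was merged into the comment.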
沒落の蓅哖 2024-07-16 08:27:23

An escape character together with the character that follows it (like \n) is a single character to the C compiler. The scanner presents it to the parser as part of a character or string token, so the parser needs no special syntax rules for the escape character.
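
A quick way to observe this, as a minimal sketch with made-up constant names, is to let the compiler report the size of the finished literals:

```c
/* sizeof applied to a string literal counts the characters of the finished
 * literal plus the terminating '\0'. */
enum {
    NEWLINE_LITERAL_SIZE   = sizeof "\n",   /* newline + '\0' */
    BACKSLASH_N_LITERAL_SIZE = sizeof "\\n" /* backslash + 'n' + '\0' */
};

/* A character literal containing an escape sequence is one char. */
const char newline_char = '\n';
```

The first constant is 2 and the second is 3: the two source characters \n collapse into one character, while \\n yields a backslash followed by the letter n.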
