Carriage return as the line ending in C++ source files

Posted on 2024-12-14 00:35:19

I have been reading ISO 14882:2003. It says:

s-char:
any member of the source character set except the double-quote ", backslash \, or new-line character
escape-sequence
universal-character-name

Now, regarding the new-line character, I see a problem when the line ending is '\r'.
I wrote a small C++ program that generates a test file:

#include <fstream>
#include <string>
int main()
{
    const char* program=""
        "#include <string>\n"
        "int main()\n"
        "{\n"
        "  std::string s;\n"
        "  //s=\"\r"
        "  //\r"
        "  //\r"
        "  //\r"
        "  //\";\n"
        "  s=\"\\xAE\\xfffactory\\xAE\\xffaction\";\n"
        "  return 0;\n"
        "}\n"
        ;
    std::ofstream file("file.cpp", std::ios_base::trunc);
    file << program;
    file.close();
    return 0;
}

On Windows, file.cpp (as displayed in the VS editor) is:

#include <string>
int main()
{
  std::string s;
  //s="
  //
  //
  //
  //";
  s="\xAE\xfffactory\xAE\xffaction";
  return 0;
}

When compiling file.cpp, VS reports an error on line 6 instead of line 10.

On Linux, file.cpp (as displayed in emacs) is:

#include <string>
int main()
{
  std::string s;
  //s="^M  //^M  //^M  //^M  //";
  s="\xAE\xfffactory\xAE\xffaction";
  return 0;
}

Compiling file.cpp with gcc, I get an error on line 10, not on line 6.

What should I conclude from this?


Comments (4)

泼猴你往哪里跑 2024-12-21 00:35:19

You should conclude that:

  1. The VS editor understands any line ending, so it displays the file as multiple lines (this is a known feature).
  2. The MSVC compiler doesn't treat a lone \r as a line ending, so the whole commented block up to "; counts as line 5 and the statement after it is line 6.
  3. emacs doesn't treat a lone \r as a line ending (at least by default), so it shows that part of the source on a single line.
  4. GCC understands any line ending, so it doesn't lose count.
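
The two line numbers can be reproduced with a small sketch that counts lines under each convention. The helper names `lineNumberLfOnly` and `lineNumberAnyEol` are made up for illustration, and the code only mimics the counting behaviour described above; it is not how either compiler is actually implemented.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: 1-based line number of byte offset `pos` when only
// '\n' ends a line (a lone '\r' is treated as an ordinary character).
std::size_t lineNumberLfOnly(const std::string& text, std::size_t pos)
{
    std::size_t line = 1;
    for (std::size_t i = 0; i < pos && i < text.size(); ++i)
        if (text[i] == '\n')
            ++line;
    return line;
}

// Hypothetical helper: same, but "\r\n", a lone '\r' and a lone '\n'
// each end exactly one line.
std::size_t lineNumberAnyEol(const std::string& text, std::size_t pos)
{
    std::size_t line = 1;
    for (std::size_t i = 0; i < pos && i < text.size(); ++i) {
        if (text[i] == '\n')
            ++line;
        else if (text[i] == '\r' && !(i + 1 < text.size() && text[i + 1] == '\n'))
            ++line; // lone CR; the CR of a CRLF pair is counted via its LF
    }
    return line;
}
```

For the contents of file.cpp from the question, passing the offset of the `s="\xAE...` statement gives 6 from the first function and 10 from the second, which matches the line numbers reported by VS and gcc respectively.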

Also, the quote you provided from the standard is unrelated. The new-line there refers to an actual new-line character in the source itself (a member of the source character set), not to the \r and \n escape sequences inside strings. The grammar rule you quoted just excludes string literals such as:

const char* s = "some text, here comes 'new-line'
    ha ha ";
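
For contrast, here is a minimal sketch of two spellings that the grammar does accept when a string should contain a line break. The variable names are illustrative only.

```cpp
#include <string>

// Escape sequence: the literal stays on one source line, but the resulting
// string contains a new-line character.
const std::string viaEscape = "some text, here comes 'new-line'\n    ha ha ";

// Adjacent string literals are concatenated by the compiler, so the literal
// can be split across source lines without a raw line break inside it.
const std::string viaConcatenation =
    "some text, here comes 'new-line'\n"
    "    ha ha ";
```

Both variables hold the same text; the generator program in the question uses the second form.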

绝對不後悔。 2024-12-21 00:35:19

Section 2.1 [lex.phases]. The first phase of translation is:

Physical source file characters are mapped, in an implementation-defined manner, to the basic source character set (introducing new-line characters for end-of-line indicators) if necessary. ...

In other words, the implementation is free to use whatever line ending convention it wants, and turn that into newline characters during the first phase of translation.

Practically speaking, you should be safe using the newline character for line endings on any modern compiler.
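
As a rough illustration of what such a phase-1 mapping could look like, the sketch below turns "\r\n", a lone '\r' and a lone '\n' into a single '\n'. The function name `normalizeLineEndings` is hypothetical, and because the mapping is implementation-defined, a real compiler may well do something different (for example, not treat a lone '\r' as an end-of-line indicator at all).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: one possible mapping of physical line endings to
// new-line characters. Not taken from any particular compiler.
std::string normalizeLineEndings(const std::string& raw)
{
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r') {
            if (i + 1 < raw.size() && raw[i + 1] == '\n')
                ++i; // consume the LF of a CRLF pair
            out += '\n';
        } else {
            out += raw[i];
        }
    }
    return out;
}
```

Running generated source through a step like this before handing it to a compiler sidesteps the question of which conventions that compiler recognizes.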

一紙繁鸢 2024-12-21 00:35:19

Windows and Linux use different line-ending conventions. On Linux, the end of line is 0x0A, and on Windows it is 0x0D, 0x0A. C/C++ programs are themselves text files and are often interoperable across platforms, so long as you conform to the text conventions of the platform.

The dos2unix(1) tool is purpose-built for just this task.

Alternatively, since you're producing this code dynamically in your own tool, you could provide an option that tells it to use one line-ending style or the other.
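
A minimal sketch of such an option, assuming the generator builds the source text with '\n' separators and only picks the physical line ending when writing. The names `withLineEnding` and `writeExact` are invented for this example.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: replaces every '\n' in `text` with `eol`
// (for example "\n" or "\r\n").
std::string withLineEnding(const std::string& text, const std::string& eol)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\n')
            out += eol;
        else
            out += text[i];
    }
    return out;
}

// Hypothetical helper: writes the bytes exactly as given. Binary mode keeps
// the stream from translating '\n' on its own, as text mode does on Windows.
bool writeExact(const std::string& path, const std::string& bytes)
{
    std::ofstream file(path.c_str(), std::ios_base::binary | std::ios_base::trunc);
    file << bytes;
    return file.good();
}
```

With these, a call such as `writeExact("file.cpp", withLineEnding(program, "\r\n"))` would produce Windows-style line endings on any platform, and passing "\n" would produce Unix-style ones.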

野鹿林 2024-12-21 00:35:19

"Now, about new-line character I see a problem when the line ending is '\r'..."

'\r' is a carriage return and not a newline, so I'm not sure what the problem is.

Windows chose to do some magic and represent \r as a newline, but that does not mean it actually is a newline.
