Not sure about other implementations, but GCC employs an (apparently?) random use of spaces and tabs.
See for example this file: you'll see that it uses tab characters for indentation, but not everywhere. The reason is that the files use two spaces of indentation per level, but a tab character is inserted in place of each chunk of 8 spaces.
This means that the indentation an editor shows depends on how many columns it displays for each tab, which obviously forces users to set up their editor in such a way that tabs are consistent with spaces. By inspecting the linked file you'll see that, for good formatting, a tab has to be 8 characters wide.
Is there any reason why nobody ever runs s/\t/        /g on the whole codebase?
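As a side note on that substitution, here is a minimal sketch (not something taken from GCC's tooling) of what it would do. Replacing every tab with a fixed run of 8 spaces matches what an 8-column editor shows only when the tab starts at a column that is a multiple of 8. That holds for leading tabs, but not for a tab in the middle of a line. Python's `str.expandtabs` pads to the next tab stop instead. The two sample lines are made up for illustration.

```python
def naive_replace(line: str, tab_width: int = 8) -> str:
    """Replace every tab with a fixed run of spaces, like s/\\t/<8 spaces>/g."""
    return line.replace("\t", " " * tab_width)


def expand_to_tab_stops(line: str, tab_width: int = 8) -> str:
    """Pad each tab up to the next multiple of tab_width, as an editor would."""
    return line.expandtabs(tab_width)


leading = "\t  return x;"        # hypothetical line: tab at column 0, then two spaces
inline = "int x;\t// comment"    # hypothetical line: tab after 6 characters of text

# For a leading tab the two approaches agree.
same_for_leading = naive_replace(leading) == expand_to_tab_stops(leading)

# For a mid-line tab they differ: the naive version always adds 8 spaces,
# while the tab-stop version adds only enough to reach column 8.
naive_inline = naive_replace(inline)
stops_inline = expand_to_tab_stops(inline)
```

So a blanket substitution is safe for indentation, but it could shift anything that was aligned with tabs after other text.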
Since I didn't really expect that the use of __ was required, which it is, I'm asking this question just in case I'm missing something crucial, hoping that the answer is not "because not everybody agrees that spaces are better than tabs".
Let me clarify one point: given a file generated like this
echo -en 'first line\n  second line\n\tthird line\n'
which has this content
first line
  second line
	third line # there's a tab and no spaces at the beginning of this line
no editor in the world, ever, knows what the correct way of showing this file is, because that depends on the convention. Stack Overflow seems to assume a tab is 4 spaces, but the GCC codebase assumes a tab is 8 spaces.
It is a convention and, as such, it can be inconsistent between different codebases, and no editor is able to deduce our convention in a deterministic way. Given the file above, no editor knows if it has to show the third line indented with respect to the second line (thus guessing a tab is more than 2 spaces) or not (thus guessing a tab is exactly 2 spaces), unless the editor user communicates that information via options.
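To make this concrete, here is a small sketch that computes the displayed indentation of each line of that three-line file for a few tab widths. It assumes the second line starts with two spaces, as the "exactly 2 spaces" remark implies. The helper name `indent_columns` is made up for this example.

```python
sample = "first line\n  second line\n\tthird line\n"


def indent_columns(text: str, tab_width: int) -> list[int]:
    """Return the displayed indentation (in columns) of each line."""
    columns = []
    for line in text.splitlines():
        shown = line.expandtabs(tab_width)  # what an editor would draw
        columns.append(len(shown) - len(shown.lstrip(" ")))
    return columns


for width in (2, 4, 8):
    print(width, indent_columns(sample, width))
# 2 [0, 2, 2]
# 4 [0, 2, 4]
# 8 [0, 2, 8]
```

The bytes in the file are identical in all three cases. Only the tab width setting decides whether the third line looks nested under the second.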
Clearly, each editor can apply some heuristic; for instance, if a file is a C++ source file and it contains these two lines
  if (true) // <space><space>
	std::cout << "bye"; // <tab>
the editor can be fairly confident that each tab is at least 3 spaces wide, so that the second line is indented at least minimally with respect to the first; it could also deduce that the tab is at least 4 characters wide, applying the heuristic that "nobody uses 1-space indenting". But can it do more? Can it conclude that the tab is 4, 6, or 8 spaces? No, it can't. Full stop.
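The heuristic above can be sketched as a search for the smallest tab width that makes the tab-indented line appear deeper than the space-indented one. This is only an illustration of the reasoning, not how any real editor detects indentation. `min_tab_width` and its parameters are hypothetical names.

```python
def min_tab_width(outer_indent: str, inner_indent: str, min_step: int = 1,
                  max_width: int = 64) -> int:
    """Smallest tab width at which inner_indent is displayed at least
    min_step columns deeper than outer_indent."""
    for width in range(1, max_width + 1):
        outer = len(outer_indent.expandtabs(width))
        inner = len(inner_indent.expandtabs(width))
        if inner - outer >= min_step:
            return width
    raise ValueError("no tab width up to max_width satisfies the constraint")


# "  if (true)" is indented with two spaces, the statement under it with a tab.
at_least_some_indent = min_tab_width("  ", "\t")               # 3
at_least_two_columns = min_tab_width("  ", "\t", min_step=2)   # 4

# Every larger width also satisfies the constraint, so 4, 6 and 8 stay candidates.
still_consistent = [w for w in (4, 6, 8)
                    if len("\t".expandtabs(w)) - 2 >= 2]
```

The search yields a lower bound only, which is the point of the question: nothing in the two lines separates 4 from 6 or 8.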
I believe this originated as an "optimization" to reduce the size of the source file. By substituting a sequence of spaces with a tab character, the file displays just the same (with 8-column tabs), but is up to 7 bytes shorter for each tab. Since source code tends to contain a lot of whitespace, this can add up to a substantial reduction.
Well, anyway, a reduction that would have been substantial in the early days of GCC (circa 1987), when 30 MB was a good-sized hard drive and RAM was over $100 per megabyte.
In particular, to this day, GNU Emacs formats files this way by default. And it's a good bet that much of GCC was written using GNU Emacs, given their common authorship...
Although the savings in file size are no longer very meaningful, the current GCC maintainers probably think it's fine and don't see a need to change it. And if you complain that your editor doesn't handle it properly, I'm sure they'll happily suggest another editor you could use :)
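The size argument above can be sketched in a few lines. This is an illustration of the idea only, not the code Emacs or GCC uses. `tabify_leading` is a made-up helper, and the sample line is hypothetical.

```python
def tabify_leading(line: str, tab_width: int = 8) -> str:
    """Replace each full run of tab_width leading spaces with one tab."""
    stripped = line.lstrip(" ")
    n_spaces = len(line) - len(stripped)
    tabs, rest = divmod(n_spaces, tab_width)
    return "\t" * tabs + " " * rest + stripped


deep_line = " " * 10 + "return x;"     # five levels of 2-space indentation
compact = tabify_leading(deep_line)    # one tab + two spaces + code
saved = len(deep_line) - len(compact)  # bytes saved on this line: 7

# With 8-column tabs the compact line displays exactly like the original.
round_trip = compact.expandtabs(8) == deep_line
```

Note that the round trip only holds for a tab width of 8, which is why the files look wrong under any other setting.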
This will happen with clang-format when IndentWidth differs from TabWidth. Internally clang-format uses spaces; only at the end are spaces tabified. There are settings that control which spaces, if any, are converted: see UseTab in the docs. IndentWidth=2, TabWidth=8 will produce the observed formatting.
I am not familiar with gcc's git repository policy on formatting though.
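As a toy model of the behaviour described here (explicitly not clang-format's actual implementation), the indentation for a nesting level can be computed in columns first and then split into tabs and leftover spaces. `indent_prefix` and its parameter names are assumptions chosen to mirror the option names.

```python
def indent_prefix(level: int, indent_width: int = 2, tab_width: int = 8) -> str:
    """Indentation for a nesting level: as many tabs as fit, then spaces."""
    columns = level * indent_width
    tabs, spaces = divmod(columns, tab_width)
    return "\t" * tabs + " " * spaces


# With indent_width=2 and tab_width=8, tabs first appear at level 4.
prefixes = [indent_prefix(level) for level in range(6)]
# ['', '  ', '    ', '      ', '\t', '\t  ']

# When the two widths are equal, every level is made of tabs only.
uniform = indent_prefix(2, indent_width=4, tab_width=4)  # '\t\t'
```

Under this model the mix of tabs and spaces is a direct consequence of the two widths differing, which matches the pattern seen in the GCC sources.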