Is there a well-known "profile" of the C standard?

Posted 2024-12-01 02:21:00

I write C code that makes certain assumptions about the implementation, such as:

* char is 8 bits.
* signed integral types are two's complement.
* >> on signed integers sign-extends.
* integer division rounds negative quotients towards zero.
* double is IEEE-754 double precision and can be type-punned to and from uint64_t with the expected result.
* comparisons involving NaN always evaluate to false.
* a null pointer is all zero bits.
* all data pointers have the same representation, and can be converted to size_t and back again without information loss.
* pointer arithmetic on char* is the same as ordinary arithmetic on size_t.
* function pointers can be cast to void* and back again without information loss.

Now, all of these are things that the C standard doesn't guarantee, so strictly speaking my code is non-portable. However, they happen to be true on the architectures and ABIs I'm currently targeting, and after careful consideration I've decided that the risk they will fail to hold on some architecture that I'll need to target in the future is acceptably low compared to the pragmatic benefits I derive from making the assumptions now.

The question is: how do I best document this decision? Many of my assumptions are made by practically everyone (non-octet chars? or sign-magnitude integers? on a future, commercially successful, architecture?). Others are more arguable -- the most risky probably being the one about function pointers. But if I just list everything I assume beyond what the standard gives me, the reader's eyes are just going to glaze over, and he may not notice the ones that actually matter.

So, is there some well-known set of assumptions about being a "somewhat orthodox" architecture that I can incorporate by reference, and then only document explicitly where I go beyond even that? (Effectively such a "profile" would define a new language that is a superset of C, but it might not acknowledge that in so many words -- and it may not be a pragmatically useful way to think of it either).

Clarification: I'm looking for a shorthand way to document my choices, not for a way to test automatically whether a given compiler matches my expectations. The latter is obviously useful too, but does not solve everything. For example, if a business partner contacts us saying, "we're making a device based on Google's new G2015 chip; will your software run on it?" -- then it would be nice to be able to answer "we haven't worked with that arch yet, but it shouldn't be a problem if it has a C compiler that satisfies such-and-such".

To clarify further, since somebody has voted to close as "not constructive": I'm not looking for discussion here, just for pointers to actual, existing, formal documents that can simplify my documentation by being incorporated by reference.

送你一个梦 2024-12-08 02:21:00

I would introduce a STATIC_ASSERT macro and put all your assumptions in such assertions.
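A minimal sketch of what such a macro and a few assertions might look like. The macro definition, the assertion names, and the choice of which assumptions to check are all illustrative rather than taken from the answer. Only properties visible in constant expressions can be checked this way; the null pointer representation, NaN comparisons, and pointer round trips would still need a run-time test or prose documentation.

```c
#include <float.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical STATIC_ASSERT: uses C11 _Static_assert when available,
 * otherwise falls back to the negative-array-size trick. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define STATIC_ASSERT(cond, name) _Static_assert(cond, #name)
#else
#define STATIC_ASSERT(cond, name) typedef char static_assert_##name[(cond) ? 1 : -1]
#endif

/* char is 8 bits. */
STATIC_ASSERT(CHAR_BIT == 8, char_is_8_bits);

/* Signed integers are two's complement: only then is -1 all one bits. */
STATIC_ASSERT((-1 & 3) == 3, twos_complement);

/* >> on a negative signed integer sign-extends. */
STATIC_ASSERT((-1 >> 1) == -1, arithmetic_right_shift);

/* Integer division rounds negative quotients towards zero. */
STATIC_ASSERT((-7 / 2) == -3, division_truncates_toward_zero);

/* double has the size and parameters of IEEE-754 binary64
 * (a necessary condition, not a proof of full conformance). */
STATIC_ASSERT(sizeof(double) == sizeof(uint64_t), double_is_64_bits);
STATIC_ASSERT(DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024, double_looks_like_binary64);

/* Data and function pointers are no wider than size_t / void*. */
STATIC_ASSERT(sizeof(void *) <= sizeof(size_t), data_pointer_fits_size_t);
STATIC_ASSERT(sizeof(void (*)(void)) == sizeof(void *), function_pointer_fits_void_pointer);
```

A translation unit that includes this header fails to compile on an implementation that violates one of the checked assumptions, and the assertion name appears in the diagnostic.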

z祗昰~ 2024-12-08 02:21:00

Unfortunately, not only is there no standard for a dialect of C that combines the extensions which emerged as de facto standards during the 1990s (two's-complement, universally-ranked pointers, etc.), but compiler trends are moving in the opposite direction. Given the following requirements for a function:

/* Accept int parameters x,y,z:
 * Return 0 if x-y is computable as "int" and is less than z
 * Return 1 if x-y is computable as "int" and is not less than z
 * Return 0 or 1 if x-y is not computable */

The vast majority of compilers in the 1990s would have allowed:

int diffCompare(int x, int y, int z)
{ return (x-y) >= z; }

On some platforms, in cases where x-y was not computable as int, it would be faster to compute a "wrapped" two's-complement value of x-y and compare that, while on others it would be faster to perform the calculation using a type larger than int and compare that. By the late 1990s, however, nearly every C compiler would implement the above code using whichever of those approaches was more efficient on its hardware platform.

Since 2010, however, compiler writers seem to have taken the attitude that if computations overflow, compilers shouldn't perform the calculations in whatever fashion is normal for their platform and let whatever happens happen, nor should they recognizably trap (which would break some code, but could prevent certain kinds of errant program behavior), but instead they should treat overflow as an excuse to negate the laws of time and causality. Consequently, even if a programmer would have been perfectly happy with any behavior a 1990s compiler would have produced, the programmer must replace the code with something like:

{ return ((long)x-y) >= z; }

which would greatly reduce efficiency on many platforms, or

{ return x+(INT_MAX+1U)-y >= z+(INT_MAX+1U); }

which requires specifying a bunch of calculations the programmer doesn't actually want in the hopes that the optimizer will omit them (using signed comparison to make them unnecessary), and would reduce efficiency on a number of platforms (especially DSPs) where the form using (long) would have been more efficient.

It would be helpful if there were standard profiles which would allow programmers to avoid the need for nasty horrible kludges like the above using INT_MAX+1U, but if trends continue they will become more and more necessary.
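For reference, the two workarounds above can be written out as complete functions. This is only a sketch: the function names are hypothetical, the widening variant uses long long rather than long because long is not wider than int on every platform, and the biased variant assumes unsigned int has one more value bit than int.

```c
#include <limits.h>

/* Widening variant: do the subtraction in a type wider than int.
 * Assumes long long is strictly wider than int. */
int diff_compare_wide(int x, int y, int z)
{
    return ((long long)x - y) >= z;
}

/* Biased unsigned variant: all arithmetic is unsigned, so it wraps
 * instead of overflowing. Adding INT_MAX+1U to both sides maps the
 * signed order onto the unsigned order. */
int diff_compare_wrapped(int x, int y, int z)
{
    return x + (INT_MAX + 1U) - y >= z + (INT_MAX + 1U);
}
```

When x-y is representable as int, both functions return the same result. When it is not, they may disagree (for example with x = INT_MIN, y = 1, z = 0), which the stated requirements permit.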

情场扛把子 2024-12-08 02:21:00

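Most compiler documentation includes a section that describes the specific behavior of implementation-dependent features. Can you point to that section of the gcc or msvc docs to describe your assumptions?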

波浪屿的海角声 2024-12-08 02:21:00

You can write a header file "document.h" where you collect all your assumptions.
Then, in every file where you know non-standard assumptions are made, you can #include that file.
Perhaps "document.h" would contain no real statements at all, only commented text and some macros.

   // [T] DOCUMENT.H
   //

   #ifndef DOCUMENT_H
   #define DOCUMENT_H 
   // [S] 1. Basic assumptions.
   // 
   // If this file is included in a compilation unit it means that
   // the following assumptions are made:
   //      [1] A char has 8 bits.
   // [#]

   #define MY_CHARBITSIZE 8

   //      [2] IEEE 754 doubles are adopted for type: double.
   // ........
   // [S] 2. Detailed information
   //
   #endif

The tags in brackets: [T] [S] [#] [1] [2] stand for:

* [T]: Document Title
* [S]: Section
* [#]: Print the following (non-commented) lines as a code-block.
* [1], [2]: Numbered items of a list.

Now, the idea here is to use the file "document.h" in a different way:

To parse the file in order to convert the comments in "document.h" to some printable document, or some basic HTML.

Thus, the tags [T] [S] [#] etc., are intended to be interpreted by a parser that converts any comment into an HTML line of text (for example), and generates <h1></h1>, <b></b> (or whatever you want), when a tag appears.

If you keep the parser as a simple and small program, this can give you a shorthand way to handle this kind of documentation.
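As a rough illustration of how small such a parser could be, here is a sketch of a function that renders one line of "document.h" as HTML. The function names and the tag-to-element mapping ([T] to <h1>, [S] to <b>, other comments to <p>) are assumptions made for this example. It skips non-comment lines, does not implement the [#] code-block tag, and does no HTML escaping.

```c
#include <stdio.h>
#include <string.h>

/* Skip leading spaces and tabs. */
static const char *skip_blanks(const char *s)
{
    while (*s == ' ' || *s == '\t')
        s++;
    return s;
}

/* Write <tag>text</tag> into out, dropping any trailing newline. */
static int emit(char *out, size_t n, const char *tag, const char *text)
{
    text = skip_blanks(text);
    snprintf(out, n, "<%s>%.*s</%s>", tag, (int)strcspn(text, "\r\n"), text, tag);
    return 1;
}

/* Render one source line into out (capacity n).
 * Returns 1 if HTML was written, 0 if the line produces no output. */
int render_line(const char *line, char *out, size_t n)
{
    const char *s = skip_blanks(line);

    if (strncmp(s, "//", 2) != 0)
        return 0;                       /* not a comment line: skipped here */
    s = skip_blanks(s + 2);
    if (*s == '\0' || *s == '\r' || *s == '\n')
        return 0;                       /* empty comment */
    if (strncmp(s, "[T]", 3) == 0)
        return emit(out, n, "h1", s + 3);
    if (strncmp(s, "[S]", 3) == 0)
        return emit(out, n, "b", s + 3);
    if (strncmp(s, "[#]", 3) == 0)
        return 0;                       /* code-block tag: not handled in this sketch */
    return emit(out, n, "p", s);
}
```

A driver would read "document.h" line by line with fgets, call render_line on each line, and print the buffer whenever the function returns 1.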
