Calculating an 8-bit CRC with the C preprocessor?

Posted 2025-01-07 11:39:40

I'm writing code for a tiny 8-bit microcontroller with only a few bytes of RAM. It has a simple job, which is to transmit seven 16-bit words, then the CRC of those words. The values of the words are chosen at compile time. The CRC is specified as the "remainder of division of word 0 to word 6 as unsigned number divided by the polynomial x^8+x^2+x+1 (initial value 0xFF)."

Is it possible to calculate the CRC of those bytes at compile time using the C preprocessor?

#define CALC_CRC(a,b,c,d,e,f,g)    /* what goes here? */

#define W0    0x6301
#define W1    0x12AF
#define W2    0x7753
#define W3    0x0007
#define W4    0x0007
#define W5    0x5621
#define W6    0x5422
#define CRC   CALC_CRC(W0, W1, W2, W3, W4, W5, W6)

等风来 2025-01-14 11:39:41

It is possible to design a macro which will perform a CRC calculation at compile time. Something like

 // Names chosen to be short and, hopefully, unique.
 // cZ0..cZ7 are not defined here; presumably each is the CRC of a byte
 // in which only bit 0..7 is set.
 #define cZX(n,b,v) (((n) & (1 << (b))) ? (v) : 0)
 #define cZY(n,b,w,x,y,z) (cZX((n),(b),w)^cZX((n),(b)+1,x)^cZX((n),(b)+2,y)^cZX((n),(b)+3,z))
 #define CRC(n) (cZY((n),0,cZ0,cZ1,cZ2,cZ3)^cZY((n),4,cZ4,cZ5,cZ6,cZ7))

should probably work, and will be very efficient if (n) can be evaluated as a compile-time constant; it will simply evaluate to a constant itself. On the other hand, if n is an expression, that expression will end up getting recomputed eight times. Even if n is a simple variable, the resulting code will likely be significantly larger than the fastest non-table-based way of writing it, and may be slower than the most compact way of writing it.
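As a rough sketch of how the per-bit idea could be applied to the seven words from the question, the block below chains one byte step per enum constant. Several things here are assumptions rather than anything the question confirms: bytes are fed high byte first, bits are processed MSB first, and there is no bit reflection or final XOR. The names `CRC8_T0`..`CRC8_T7`, `CRC8_PICK`, `CRC8_BYTE`, `CRC8_STEP`, `CRC_S0`..`CRC_S13` and `MSG_CRC` are made up for illustration. The running value is held in enum constants instead of nested macro calls, because nesting would replicate the argument text eight times per byte.

```c
#include <assert.h>

/* CRC register after eight shifts when only bit b of the byte is set,
 * for the polynomial x^8+x^2+x+1 (0x07), MSB first. */
#define CRC8_T0 0x07
#define CRC8_T1 0x0E
#define CRC8_T2 0x1C
#define CRC8_T3 0x38
#define CRC8_T4 0x70
#define CRC8_T5 0xE0
#define CRC8_T6 0xC7
#define CRC8_T7 0x89

/* v if bit b of n is set, otherwise 0. */
#define CRC8_PICK(n, b, v) (((n) & (1u << (b))) ? (v) : 0u)

/* The CRC is linear over XOR, so the result for a byte is the XOR of
 * the contributions of its set bits. */
#define CRC8_BYTE(n) \
    (CRC8_PICK(n, 0, CRC8_T0) ^ CRC8_PICK(n, 1, CRC8_T1) ^ \
     CRC8_PICK(n, 2, CRC8_T2) ^ CRC8_PICK(n, 3, CRC8_T3) ^ \
     CRC8_PICK(n, 4, CRC8_T4) ^ CRC8_PICK(n, 5, CRC8_T5) ^ \
     CRC8_PICK(n, 6, CRC8_T6) ^ CRC8_PICK(n, 7, CRC8_T7))

/* Clock one data byte into the running CRC. */
#define CRC8_STEP(crc, byte) CRC8_BYTE(((crc) ^ (byte)) & 0xFFu)

/* Words from the question. */
#define W0 0x6301
#define W1 0x12AF
#define W2 0x7753
#define W3 0x0007
#define W4 0x0007
#define W5 0x5621
#define W6 0x5422

/* One constant per byte; assumed order: high byte first, initial value 0xFF. */
enum {
    CRC_S0  = CRC8_STEP(0xFFu,   W0 >> 8), CRC_S1  = CRC8_STEP(CRC_S0,  W0 & 0xFF),
    CRC_S2  = CRC8_STEP(CRC_S1,  W1 >> 8), CRC_S3  = CRC8_STEP(CRC_S2,  W1 & 0xFF),
    CRC_S4  = CRC8_STEP(CRC_S3,  W2 >> 8), CRC_S5  = CRC8_STEP(CRC_S4,  W2 & 0xFF),
    CRC_S6  = CRC8_STEP(CRC_S5,  W3 >> 8), CRC_S7  = CRC8_STEP(CRC_S6,  W3 & 0xFF),
    CRC_S8  = CRC8_STEP(CRC_S7,  W4 >> 8), CRC_S9  = CRC8_STEP(CRC_S8,  W4 & 0xFF),
    CRC_S10 = CRC8_STEP(CRC_S9,  W5 >> 8), CRC_S11 = CRC8_STEP(CRC_S10, W5 & 0xFF),
    CRC_S12 = CRC8_STEP(CRC_S11, W6 >> 8), CRC_S13 = CRC8_STEP(CRC_S12, W6 & 0xFF),
    MSG_CRC = CRC_S13
};

/* Compile-time sanity check of one table entry. */
static_assert(CRC8_BYTE(0x01) == 0x07, "bit 0 should contribute the polynomial");
```

Strictly speaking the compiler's constant evaluator does the work here, not the preprocessor, so `MSG_CRC` cannot be tested in an `#if`. It costs no RAM or code at run time, though, and can be used anywhere an integer constant is allowed.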

BTW, one thing I'd really like to see in the C and C++ standards would be a means of specifying overloads which would be used for functions declared inline only if particular parameters could be evaluated as compile-time constants. The semantics would be such that there would be no 'guarantee' that any such overload would be used in every case where a compiler might be able to determine a value, but there would be a guarantee that (1) no such overload would be used in any case where a "compile-time-const" parameter would have to be evaluated at runtime, and (2) any parameter which is considered a constant in one such overload will be considered a constant in any functions invoked from it. There are a lot of cases where a function could be written to evaluate to a compile-time constant if its parameter is constant, but where run-time evaluation would be absolutely horrible. For example:

#define bit_reverse_byte(n) ( (((n) & 128)>>7)|(((n) & 64)>>5)|(((n) & 32)>>3)|(((n) & 16)>>1)| \
  (((n) & 8)<<1)|(((n) & 4)<<3)|(((n) & 2)<<5)|(((n) & 1)<<7) )
#define bit_reverse_word(n) (bit_reverse_byte((n) >> 8) | (bit_reverse_byte(n) << 8))

A simple rendering of a non-looped single-byte bit-reverse function in C on the PIC would be about 17-19 single-cycle instructions; a word bit-reverse would be 34, or about 10 plus a byte-reverse function (which would execute twice). Optimal assembly code would be about 15 single-cycle instructions for byte reverse or 17 for word reverse. Computing bit_reverse_byte(b) for some byte variable b would take many dozens of instructions totalling many dozens of cycles. Computing bit_reverse_word(w) for some 16-bit word w would probably take hundreds of instructions taking hundreds or thousands of cycles to execute. It would be really nice if one could mark a function to be expanded inline using something like the above formulation in the scenario where it would expand to a total of four instructions (basically just loading the result) but use a function call in scenarios where inline expansion would be heinous.
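Standard C has no such overload mechanism, but GCC and Clang offer something close through the non-standard builtin `__builtin_constant_p`. The sketch below assumes one of those compilers; the names `BIT_REVERSE_BYTE_CONST` and `bit_reverse_byte_rt` are hypothetical. The constant formula is used only when the argument is known at compile time, and an ordinary function is called otherwise.

```c
#include <assert.h>
#include <stdint.h>

/* Constant-friendly formula from the macro above (repeats n eight times). */
#define BIT_REVERSE_BYTE_CONST(n) \
    ( (((n) & 128) >> 7) | (((n) & 64) >> 5) | (((n) & 32) >> 3) | (((n) & 16) >> 1) | \
      (((n) & 8) << 1)   | (((n) & 4) << 3)  | (((n) & 2) << 5)  | (((n) & 1) << 7) )

/* Hypothetical out-of-line fallback for values only known at run time. */
static uint8_t bit_reverse_byte_rt(uint8_t b)
{
    uint8_t r = 0;
    for (uint8_t i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (b & 1));  /* shift the lowest bit of b into r */
        b >>= 1;
    }
    return r;
}

/* GCC/Clang extension: __builtin_constant_p(n) is 1 only if the compiler
 * can prove n is a compile-time constant. */
#define bit_reverse_byte(n) \
    (__builtin_constant_p(n) ? (uint8_t)BIT_REVERSE_BYTE_CONST(n) \
                             : bit_reverse_byte_rt(n))

/* The formula itself is an integer constant expression. */
static_assert(BIT_REVERSE_BYTE_CONST(0x01) == 0x80, "bit 0 should map to bit 7");
```

As with the wished-for language feature, there is no guarantee that the builtin reports every value the optimizer could have worked out; it only guarantees that the formula branch is never chosen for a value that is not constant.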

听风吹 2025-01-14 11:39:41

The simplest checksum algorithm is the so-called longitudinal parity check, which breaks the data into "words" with a fixed number n of bits, and then computes the exclusive or of all those words. The result is appended to the message as an extra word.

To check the integrity of a message, the receiver computes the exclusive or of all its words, including the checksum; if the result is not a word with n zeros, the receiver knows that a transmission error occurred.

(source: Wikipedia)

In your example:

#define CALC_LRC(a,b,c,d,e,f,g) ((a)^(b)^(c)^(d)^(e)^(f)^(g))
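To make the receiver-side check concrete, here is a small sketch using the seven words from the question. The names `MSG_LRC` and `lrc_ok` are hypothetical, and the macro takes seven arguments to match the seven words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CALC_LRC(a,b,c,d,e,f,g) ((a)^(b)^(c)^(d)^(e)^(f)^(g))

/* Words from the question. */
#define W0 0x6301
#define W1 0x12AF
#define W2 0x7753
#define W3 0x0007
#define W4 0x0007
#define W5 0x5621
#define W6 0x5422

/* Checksum word, folded to a constant at compile time. */
#define MSG_LRC CALC_LRC(W0, W1, W2, W3, W4, W5, W6)

/* XOR of the message and its checksum must be all zeros. */
static_assert((CALC_LRC(W0, W1, W2, W3, W4, W5, W6) ^ MSG_LRC) == 0,
              "message plus checksum should XOR to zero");

/* Receiver side: XOR every received word, checksum included.
 * Returns 1 if no error was detected, 0 otherwise. */
static int lrc_ok(const uint16_t *words, size_t count)
{
    uint16_t acc = 0;
    for (size_t i = 0; i < count; i++)
        acc ^= words[i];
    return acc == 0;
}
```

A single flipped bit always makes the XOR non-zero, but two errors in the same bit position of different words cancel out and go undetected, which is one reason a CRC is usually preferred on noisy links.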

短叹 2025-01-14 11:39:41

Disclaimer: this is not really a direct answer, but rather a series of questions and suggestions that are too long for a comment.

First Question: Do you have control over both ends of the protocol, e.g. can you choose the checksum algorithm by means of either yourself or a coworker controlling the code on the other end?

If YES to question #1:

You need to evaluate why you need the checksum, what checksum is appropriate, and the consequences of receiving a corrupt message with a valid checksum (which factors into both the what & why).

What is your transmission medium, protocol, bitrate, etc.? Are you expecting/observing bit errors? For example, with SPI or I2C from one chip to another on the same board, if you have bit errors, it's probably the HW engineer's fault, or you need to slow the clock rate, or both. A checksum can't hurt, but shouldn't really be necessary. On the other hand, with an infrared signal in a noisy environment, you'll have a much higher probability of error.

The consequences of a bad message are always the most important question here. So if you're writing the controller for a digital room thermometer and sending a message to update the display 10x a second, one bad value every 1000 messages does very little, if any, real harm. No checksum or a weak checksum should be fine.

If these 6 bytes fire a missile, set the position of a robotic scalpel, or cause the transfer of money, you'd better be damn sure you have the right checksum, and may even want to look into a cryptographic hash (which may require more RAM than you have).

For in-between stuff, with noticeable detriment to performance/satisfaction with the product, but no real harm, it's your call. For example, a TV that occasionally changes the volume instead of the channel could annoy the hell out of customers, more so than simply dropping the command when a good CRC detects an error. But if you're in the business of making cheap/knock-off TVs, that might be OK if it gets the product to market faster.

So what checksum do you need?

If either or both ends have HW support for a checksum built into the peripheral (fairly common in SPI for example), that might be a wise choice. Then it becomes more or less free to calculate.

An LRC, as suggested by vulkanino's answer, is the simplest algorithm.

Wikipedia has some decent info on how/why to choose a polynomial if you really need a CRC:
http://en.wikipedia.org/wiki/Cyclic_redundancy_check

If NO to question #1:

What CRC algorithm/polynomial does the other end require? That's what you're stuck with, but telling us might get you a better/more complete answer.

Thoughts on implementation:

Most of the algorithms are pretty light-weight in terms of RAM/registers, requiring only a couple extra bytes. In general, a function will result in better, cleaner, more readable, debugger-friendly code.
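For comparison with the macro approach, a plain bitwise function might look like the sketch below. It assumes the polynomial from the question (x^8+x^2+x+1, i.e. 0x07), MSB-first processing, and no reflection or final XOR; the function names `crc8_update` and `crc8` are made up. The only state it needs is the one-byte running CRC plus a loop counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clock one byte into the running CRC, one bit at a time. */
uint8_t crc8_update(uint8_t crc, uint8_t byte)
{
    crc ^= byte;
    for (uint8_t bit = 0; bit < 8; bit++) {
        /* If the top bit is set, shift and subtract (XOR) the polynomial. */
        crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                           : (uint8_t)(crc << 1);
    }
    return crc;
}

/* CRC over a buffer, starting from the given initial value. */
uint8_t crc8(uint8_t init, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint8_t crc = init;
    while (len--)
        crc = crc8_update(crc, *data++);
    return crc;
}
```

For the message in the question this would be called with an initial value of 0xFF, for example `crc8(0xFF, buf, 14)` over the fourteen message bytes in whatever order the other end expects.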

You should think of the macro solution as an optimization trick, and like all optimization tricks, jumping to it too early can be a waste of development time and a cause of more problems than it's worth.

Using a macro also has some strange implications you may not have considered yet:
You are aware that the preprocessor can only do the calculation if all the bytes in a message are fixed at compile time, right? If you have a variable in there, the compiler has to generate code. Without a function, that code will be inlined every time it's used (yes, that could mean lots of ROM usage). If all the bytes are variable, that code might be worse than just writing the function in C. Or with a good compiler, it might be better. Tough to say for certain. On the other hand, if a different number of bytes are variable depending on the message being sent, you might end up with several versions of the code, each optimized for that particular usage.
