Writing values of different data types sequentially into memory? Or, an array with multiple data types?

Posted on 2024-12-01 04:08:11

I am relatively new to writing C. I taught myself using the resources I found online and in print. This is my first real project in C programming. Gotta love on-the-job training.

I am writing some code in C that is being used on a Texas Instruments C6701 Digital Signal Processor. Specifically, I am writing a set of communication functions to interface through a serial port.

The project I'm on has an existing packet protocol for sending data through the serial port. This works by handing over a pointer to the data to be transmitted and its length in bytes. All I have to do is write the bytes to be transmitted into an "array" in memory (the transmitter copies that sequence of bytes into a buffer and transmits that).

My question pertains to how best to format the data to be transmitted. The data I have to send is composed of several different data types (unsigned char, unsigned int, float, etc.). I can't expand everything up to float (or int) because I have constrained communication bandwidth and need to keep packets as small as possible.

I originally wanted to use arrays to format the data,

unsigned char dataTx[10];
dataTx[0]=char1;
dataTx[1]=char2;
etc...

This would work, except that not all my data is char; some is unsigned int or unsigned short.

To handle short and int I used bit shifting (let's ignore little-endian vs big-endian for now).

unsigned char dataTx[10];
dataTx[0]=short1>>8;
dataTx[1]=short1;
dataTx[2]=int1>>24;
dataTx[3]=int1>>16;
etc...

However, I believe another (and better?) way to do this is to use pointers and pointer arithmetic.

unsigned char dataTx[10];
*(dataTx+0) = int1;
*(dataTx+4) = short1;
*(dataTx+6) = char1;
etc...

My question (finally) is: which method (bit shifting or pointer arithmetic) is the more acceptable one? Also, is one faster to run? (I also have run-time constraints.)

My requirement: the data must be located in memory serially, without gaps, breaks or padding.

I don't know enough about structures yet to know if a structure would work as a solution. Specifically, I don't know if a structure always allocates memory locations serially and without breaks. I read something that indicates they are allocated in 8-byte blocks and may introduce padding bytes.

Right now I'm leaning towards the pointer method. Thanks for reading this far into what seems to be a long post.

习ぎ惯性依靠 2024-12-08 04:08:11

Usually you would use the bit shifting approach, because many chips do not allow you to copy, for example, a 4-byte integer to an odd byte address (or, more accurately, to a set of 4 bytes starting at an odd byte address). This is called alignment. If portability is an issue, or if your DSP does not allow misaligned access, then shifting is necessary. Even if your DSP allows misaligned access, it may incur a significant performance hit for it, in which case shifting is still worth considering.

However, I would not write the code with the shifts for the different types done longhand as shown. I would expect to use functions (possibly inline) or macros to handle both the serialization and deserialization of the data. For example:

unsigned char dataTx[1024];
unsigned char *dst = dataTx;

dst += st_int2(short1, dst);
dst += st_int4(int1, dst);
dst += st_char(str, len, dst);
...

In function form, these functions might be:

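#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint16_t, uint32_t */
#include <string.h>  /* memmove */

/* Store a 16-bit value in big-endian order; returns the bytes written. */
size_t st_int2(uint16_t value, unsigned char *dst)
{
    *dst++ = (value >> 8) & 0xFF;
    *dst   = value & 0xFF;
    return 2;
}

/* Store a 32-bit value in big-endian order; returns the bytes written. */
size_t st_int4(uint32_t value, unsigned char *dst)
{
    *dst++ = (value >> 24) & 0xFF;
    *dst++ = (value >> 16) & 0xFF;
    *dst++ = (value >>  8) & 0xFF;
    *dst   = value & 0xFF;
    return 4;
}

/* Copy len raw bytes; returns the bytes written. */
size_t st_char(const unsigned char *str, size_t len, unsigned char *dst)
{
    memmove(dst, str, len);
    return len;
}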

Granted, such functions make the code boring; on the other hand, they reduce the chance for mistakes too. You can decide whether the names should be st_uint2() instead of st_int2() -- and, indeed, you can decide whether the lengths should be in bytes (as here) or in bits (as in the parameter types). As long as you're consistent and boring, you can do as you will. You can also combine these functions into bigger ones that package entire data structures.
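As a rough sketch of that last point, the per-field helpers can be combined into one function that packs a whole record. The `struct sample` layout and the names `st_sample` and `SAMPLE_WIRE_SIZE` are hypothetical, invented here for illustration; the two helpers are repeated so the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Helpers repeated from the answer (big-endian byte order). */
static size_t st_int2(uint16_t value, unsigned char *dst)
{
    dst[0] = (unsigned char)((value >> 8) & 0xFF);
    dst[1] = (unsigned char)(value & 0xFF);
    return 2;
}

static size_t st_int4(uint32_t value, unsigned char *dst)
{
    dst[0] = (unsigned char)((value >> 24) & 0xFF);
    dst[1] = (unsigned char)((value >> 16) & 0xFF);
    dst[2] = (unsigned char)((value >> 8) & 0xFF);
    dst[3] = (unsigned char)(value & 0xFF);
    return 4;
}

/* Hypothetical record. Its in-memory layout (padding included) does not
   matter, because every field is written out individually. */
struct sample {
    uint8_t  id;
    uint32_t count;
    uint16_t flags;
};

/* 1 + 4 + 2 bytes on the wire, with no padding. */
#define SAMPLE_WIRE_SIZE 7u

/* Packs *s into dst, which must have room for SAMPLE_WIRE_SIZE bytes.
   Returns the number of bytes written. */
static size_t st_sample(const struct sample *s, unsigned char *dst)
{
    unsigned char *p = dst;

    assert(s != NULL && dst != NULL);
    *p++ = s->id;
    p += st_int4(s->count, p);
    p += st_int2(s->flags, p);
    return (size_t)(p - dst);
}
```

The struct itself may well contain padding in memory; the packed form only exists in the byte buffer handed to the transmitter.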

The masking operations (& 0xFF) may not be necessary with modern compilers. Once upon a very long time ago, I seem to remember that they were necessary to avoid occasional problems with some compilers on some platforms (so, I have code dating back to the 1980s that includes such masking operations). Said platforms have probably gone to rest in peace, so it may be pure paranoia on my part that they're (still) there.

Note that these functions are passing the data in big-endian order. The functions can be used 'as is' on both big-endian and little-endian machines, and the data will be interpreted correctly on both types, so you can have diverse hardware talking over the wire, using this code, and there will be no miscommunication. If you have floating point values to convey, you have to worry a bit more about the representations over the wire. Nevertheless, you should probably aim to have the data transferred in a platform-neutral format so that interworking between chip types is as simple as possible. (This is also why I used the type sizes with numbers in them; 'int' and 'long' in particular can mean different things on different platforms, but 4-byte signed integer remains a 4-byte signed integer, even if you are unlucky - or lucky - enough to have a machine with 8-byte integers.)
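For the receiving side and for floating point, a minimal sketch might look like the following. The names `ld_int2`, `ld_int4`, `st_float4` and `ld_float4` are hypothetical, and the float handling assumes that `float` is a 32-bit IEEE 754 value on both ends of the link and that floats use the same byte order as integers on each machine; neither assumption comes from the original answer, so check both for your DSP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Serializer repeated from the answer (big-endian byte order). */
static size_t st_int4(uint32_t value, unsigned char *dst)
{
    dst[0] = (unsigned char)((value >> 24) & 0xFF);
    dst[1] = (unsigned char)((value >> 16) & 0xFF);
    dst[2] = (unsigned char)((value >> 8) & 0xFF);
    dst[3] = (unsigned char)(value & 0xFF);
    return 4;
}

/* Read a big-endian 16-bit value from two bytes. */
static uint16_t ld_int2(const unsigned char *src)
{
    return (uint16_t)(((uint16_t)src[0] << 8) | src[1]);
}

/* Read a big-endian 32-bit value from four bytes. */
static uint32_t ld_int4(const unsigned char *src)
{
    return ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
           ((uint32_t)src[2] << 8)  |  (uint32_t)src[3];
}

/* Send a float as its 32-bit pattern (assumes a 32-bit IEEE 754 float). */
static size_t st_float4(float value, unsigned char *dst)
{
    uint32_t bits;

    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);  /* copy the bits, no numeric conversion */
    return st_int4(bits, dst);
}

/* Rebuild a float from four big-endian bytes (same assumption). */
static float ld_float4(const unsigned char *src)
{
    float value;
    uint32_t bits = ld_int4(src);

    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Using `memcpy` to move the bit pattern avoids both the numeric conversion a cast would perform and the aliasing problems of reading a `float` through an integer pointer.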

素年丶 2024-12-08 04:08:11

You may want to use an array of unions.

溇涏 2024-12-08 04:08:11

The easiest and most traditional way to handle your problem is to set up the data you want to send, and then pass a pointer to your data on to the transmission routine. The most common example would be the POSIX send() routine:

ssize_t send(int socket, const void *buffer, size_t length, int flags);

For your case, you can simplify this to:

ssize_t send(const void *buffer, size_t length);

And then use something like:

send(&int1, sizeof int1);
send(&short1, sizeof short1);

to send it out. An example (but pretty naive) implementation for your situation might be:

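ssize_t send(const void *buffer, size_t length)
{
  size_t i;
  const unsigned char *data = buffer;  /* keep the const qualifier */

  /* Naive: every call copies to the start of dataTx. */
  for (i = 0; i < length; i++)
  {
     dataTx[i] = data[i];
  }
  return (ssize_t)length;
}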

In other words, use the automatic conversion to void * and then back to char * to get byte-wise access to your data, and then send it out appropriately.
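Because the naive version always writes at the start of `dataTx`, a second call overwrites the first. A slightly less naive sketch keeps a running offset and appends instead. The names `tx_append`, `txLen` and `TX_CAPACITY` are hypothetical, and the bytes are copied in the machine's native byte order, so this only works if both ends of the link agree on byte order and type sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TX_CAPACITY 64u  /* hypothetical buffer size */

static unsigned char dataTx[TX_CAPACITY];
static size_t txLen = 0;  /* number of bytes currently queued in dataTx */

/* Appends length bytes to dataTx.
   Returns 0 on success, or -1 if the data would not fit. */
static int tx_append(const void *buffer, size_t length)
{
    assert(buffer != NULL);
    if (length > TX_CAPACITY - txLen) {
        return -1;
    }
    memcpy(dataTx + txLen, buffer, length);  /* byte copy, safe at any offset */
    txLen += length;
    return 0;
}
```

Calling `tx_append(&int1, sizeof int1)` followed by `tx_append(&short1, sizeof short1)` leaves the two values back to back in `dataTx`, with `txLen` holding the packet length to hand to the transmitter.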

一瞬间的火花 2024-12-08 04:08:11

Long question; I'll try to give a shorter answer.

Don't use *(dataTx+4) = short1; and the like, because many chips can only read and write at suitably aligned addresses. Typically you can access 16 bits at addresses aligned to 2 and 32 bits at addresses aligned to 4. But take the layout "int32 char8 int32": the second int32 sits at (dataTx+5), which is not 4-byte aligned, so you will probably get a "bus error" or something similar (depending on the CPU you use). I hope that makes the issue clear.
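To make the "int32 char8 int32" case concrete, here is a minimal sketch that writes the second int32 at offset 5 by copying bytes rather than storing through a cast pointer. The function name `pack_example` and the `PACKED_SIZE` macro are hypothetical, and the values are copied in native byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout: int32 at offset 0, char8 at offset 4, int32 at offset 5. */
#define PACKED_SIZE 9u

/* Writes the three fields into dataTx, which must hold PACKED_SIZE bytes. */
static void pack_example(unsigned char *dataTx,
                         int32_t first, unsigned char c, int32_t second)
{
    assert(dataTx != NULL);

    /* Risky on strict-alignment CPUs, because dataTx + 5 is not 4-byte aligned:
       *(int32_t *)(dataTx + 5) = second; */

    memcpy(dataTx + 0, &first, sizeof first);
    dataTx[4] = c;
    memcpy(dataTx + 5, &second, sizeof second);  /* byte copy, any offset is fine */
}
```

The receiver would read the fields back the same way, with `memcpy` from the buffer into properly aligned variables.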

First way: you can try a struct. If you declare:

struct
{
    char a;
    int b;
    char c;
    short d;
};

you are out of trouble as far as alignment goes, because the compiler itself takes care of aligning the struct members, inserting padding bytes where needed. Of course, read about the alignment-related options in your compiler (in gcc this is simply called alignment), because there is probably a setting that forces a particular alignment or packing of struct fields. GCC can even define alignment per struct.
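To see how much padding the compiler actually adds to the struct above, a small sketch like this can help. The tag `record`, the macro `RECORD_PAYLOAD` and the function `record_padding` are hypothetical names added for illustration. The result depends on the compiler and target: with a 4-byte int aligned to 4 bytes and a 2-byte short aligned to 2 bytes, the struct occupies 12 bytes for 8 bytes of payload.

```c
#include <assert.h>
#include <stddef.h>

/* Same members as the struct above, with a tag so it can be referred to. */
struct record {
    char a;
    int b;
    char c;
    short d;
};

/* Sum of the member sizes: the fewest bytes the fields could occupy. */
#define RECORD_PAYLOAD (sizeof(char) + sizeof(int) + sizeof(char) + sizeof(short))

/* Returns how many padding bytes the compiler added to struct record. */
static size_t record_padding(void)
{
    assert(sizeof(struct record) >= RECORD_PAYLOAD);
    return sizeof(struct record) - RECORD_PAYLOAD;
}
```

If `record_padding()` is non-zero, sending the struct's raw bytes would also send the padding, which conflicts with the "no gaps" requirement in the question unless a packing option is used. `offsetof(struct record, b)` and friends show where each gap sits.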

The other way is to use a buffer-like approach, something like the one in Carl Norum's answer (I won't duplicate that answer), but also consider using memcpy() calls when more data is copied (e.g. a long long or a string), as this may be faster than copying byte by byte.
