Strategy for estimating/calculating buffer space needed by writer functions on an embedded system



This isn't a show-stopping programming problem as such, but perhaps more of a design pattern issue. I'd have thought it'd be a common design issue on embedded resource-limited systems, but none of the questions I found so far on SO seem relevant (but please point out anything relevant that I could have missed).

Essentially, I'm trying to work out the best strategy of estimating the largest buffer size required by some writer function, when that writer function's output isn't fixed, particularly because some of the data are text strings of variable length.

This is a C application that runs on a small ARM micro. The application needs to send various message types via TCP socket. When I want to send a TCP packet, the TCP stack (Keil RL) provides me with a buffer (which the library allocates from its own pool) into which I may write the packet data payload. That buffer size depends of course on the MSS; so let's assume it's 1460 at most, but it could be smaller.

Once I have this buffer, I pass it and its length to a writer function, which in turn may call various nested writer functions in order to build the complete message. The reason for this structure is that I'm actually generating a small XML document, where each writer function typically generates a specific XML element. Each writer function wants to write a number of bytes to my allocated TCP packet buffer. I only know exactly how many bytes a given writer function writes at run time, because some of the encapsulated content depends on user-defined text strings of variable length.
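A minimal sketch of what such a writer function might look like (illustrative only; write_name_element and its snprintf-based body are assumptions, not the actual code):

    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical signature: write into buf, never exceeding len bytes,
       and return the number of bytes actually written. */
    typedef size_t (*writer_fn)(char *buf, size_t len);

    /* Example element writer: emits <name>[string]</name> for a
       user-defined, variable-length string. */
    size_t write_name_element(char *buf, size_t len)
    {
        const char *user_name = "example";   /* variable-length user data */
        int n = snprintf(buf, len, "<name>%s</name>", user_name);
        if (n < 0 || (size_t)n >= len)
            return 0;                        /* signal "didn't fit" to the caller */
        return (size_t)n;
    }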

Some messages need to be around (say) 2K in size, meaning they're likely to be split across at least two TCP packet send operations. Those messages will be constructed by calling a series of writer functions that produce, say, a hundred bytes at a time.

Prior to making a call to each writer function, or perhaps within the writer function itself, I initially need to compare the buffer space available with how much that writer function requires; and if there isn't enough space available, then transmit that packet and continue writing into a fresh packet later.

Possible solutions I am considering are:

  1. Use another much larger buffer to write everything into initially. This isn't preferred because of resource constraints. Furthermore, I would still wish for a means of algorithmically working out how much space my message writer functions need.

  2. At compile time, produce a 'worst case size' constant for each writer function. Each writer function typically generates an XML element such as <START_TAG>[string]</START_TAG>, so I could have something like: #define SPACE_NEEDED ( START_TAG_LENGTH + END_TAG_LENGTH + MAX_STRING_LENGTH + SOME_MARGIN ). All of my content writer functions are picked out of a table of function pointers anyway, so the worst-case size estimate constant for each writer function could exist as a new column in that table. At run time, I check the buffer room against that estimate constant (see the sketch after this list). This is probably my favourite solution at the moment. The only downside is that it relies on correct maintenance to keep working.

  3. My writer functions provide a special 'dummy run' mode in which they run through and calculate how many bytes they want to write, but don't write anything. This could be achieved by simply passing NULL in place of the buffer pointer, in which case the function's return value (which usually states the amount written to the buffer) states only how much it wants to write. The only thing I don't like about this is that, between the 'dummy' and 'real' calls, the underlying data could, at least in theory, change. A possible solution for that could be to take a static snapshot of the underlying data.
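For what it's worth, here is a rough sketch of option 2 under assumed names (writer_entry, the specific constants, and append_element are all illustrative). Option 3's dummy run could be layered on top by having each writer accept NULL as the buffer pointer and return only the size it would need, snprintf-style:

    #include <stddef.h>

    typedef size_t (*writer_fn)(char *buf, size_t len);

    /* Illustrative worst-case constants for a <name>[string]</name> element. */
    #define START_TAG_LENGTH   6          /* "<name>"  */
    #define END_TAG_LENGTH     7          /* "</name>" */
    #define MAX_STRING_LENGTH 64
    #define SOME_MARGIN        4
    #define NAME_SPACE_NEEDED (START_TAG_LENGTH + END_TAG_LENGTH + \
                               MAX_STRING_LENGTH + SOME_MARGIN)

    struct writer_entry {
        writer_fn write;       /* content writer                        */
        size_t    worst_case;  /* compile-time worst-case size estimate */
    };

    size_t write_name_element(char *buf, size_t len);  /* as sketched above */

    static const struct writer_entry writers[] = {
        { write_name_element, NAME_SPACE_NEEDED },
        /* ...one row per element writer... */
    };

    /* Refuse to call a writer unless its worst case fits; a return of 0
       tells the caller to transmit the packet and retry in a fresh one. */
    size_t append_element(char *buf, size_t remaining, size_t i)
    {
        if (remaining < writers[i].worst_case)
            return 0;
        return writers[i].write(buf, remaining);
    }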

Thanks in advance for any thoughts and comments.

Solution

Something I had actually already started doing since posting the question was to make each content writer function accept a state, or 'iteration' parameter, which allows the writer to be called many times over by the TCP send function. The writer is called until it flags that it has no more to write. If the TCP send function decides after a certain iteration that the buffer is now nearing full, it sends the packet and then the process continues later with a new packet buffer. This technique is very similar I think to Max's answer, which I've therefore accepted.

A key thing is that, on each iteration, a content writer must be designed so that it won't write more than LENGTH bytes to the buffer; after each call to the writer, the TCP send function checks that at least LENGTH bytes remain in the packet buffer before calling the writer again. If not, it sends the packet and continues in a new one.
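A sketch of that send loop, under assumptions: the tcp_get_buf/tcp_send stand-ins below are simplifications of the real Keil RL calls, LENGTH is the per-iteration bound described above, and the packet buffer is assumed to be at least LENGTH bytes:

    #include <stdbool.h>
    #include <stddef.h>

    #define LENGTH 128   /* assumed per-iteration write bound */

    /* Iterative writer: resumes from *state on each call, returns the bytes
       written this iteration, and sets *done once the message is complete. */
    typedef size_t (*iter_writer_fn)(char *buf, unsigned *state, bool *done);

    /* Stand-ins for the TCP stack's buffer/send calls (signatures simplified). */
    extern char *tcp_get_buf(size_t size);
    extern void  tcp_send(char *buf, size_t used);

    void send_message(iter_writer_fn writer, size_t buf_size)
    {
        unsigned state = 0;
        bool     done  = false;
        char    *buf   = tcp_get_buf(buf_size);
        size_t   used  = 0;

        while (!done) {
            if (buf_size - used < LENGTH) {    /* no room for another iteration */
                tcp_send(buf, used);           /* transmit this packet          */
                buf  = tcp_get_buf(buf_size);  /* carry on in a fresh one       */
                used = 0;
            }
            used += writer(buf + used, &state, &done);
        }
        if (used > 0)
            tcp_send(buf, used);               /* flush the final partial packet */
    }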

Another step I took was to have a serious think about how I structure my message headers. It became apparent that, as with almost all protocols that use TCP, it is essential to implement in the application protocol some means of indicating the total message length, because TCP is a stream-based protocol, not a packet-based one. This is again where it got to be a bit of a headache, because I needed some upfront means of knowing the total message length to insert into the start header. The simple solution was to insert a message header at the start of every sent TCP packet, rather than only at the start of the application protocol message (which may of course span several TCP packets), and basically implement fragmentation. So, in the header, I implemented two flags: a fragment flag and a last-fragment flag. The length field in each header therefore only needs to state the size of the payload in that particular packet. At the receiving end, individual header+payload chunks are read out of the stream and then reassembled into a complete protocol message.
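As a sketch of such a header (field widths and names are assumptions; it is serialised byte-by-byte so the wire format doesn't depend on struct padding or host endianness):

    #include <stddef.h>
    #include <stdint.h>

    #define FLAG_FRAGMENT      0x01u  /* payload belongs to a larger message */
    #define FLAG_LAST_FRAGMENT 0x02u  /* final fragment of that message      */

    /* Write a 4-byte header: flags, message type, 16-bit payload length
       (big-endian). payload_len counts only the payload in THIS packet. */
    size_t header_write(uint8_t *buf, uint8_t flags,
                        uint8_t msg_type, uint16_t payload_len)
    {
        buf[0] = flags;
        buf[1] = msg_type;
        buf[2] = (uint8_t)(payload_len >> 8);
        buf[3] = (uint8_t)(payload_len & 0xFFu);
        return 4;
    }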

This is no doubt, very simplistically, how HTTP and so many other protocols work over TCP. It's quite interesting that only once I attempted to write a robust protocol over TCP did I start to realise the importance of really thinking about message structure in terms of headers, framing, and so forth, so that it works over a stream protocol.


Answer


I had a related problem in a much smaller embedded system, running on a PIC 16 micro-controller (and written in assembly language rather than C). My 'buffer size' was always going to be the two-byte UART transmit queue, and I had only one 'writer' function, which was walking a DOM and emitting its XML serialisation.

The solution I came up with was to turn the problem 'inside out'. The writer function becomes a task: each time it is called, it writes as many bytes as it can (which may be >2, depending on the serial data transmission rate) until the transmit buffer is full, then returns. However, it remembers, in a state variable, how far it had got through the DOM, and the next time it is called, it carries on from the point previously reached. The writer task is called repeatedly from an infinite loop, which acts as a round-robin scheduler for this task and the others in the system; if there is no free buffer space, the task returns immediately without changing its state. Each time round the loop there is a delay that waits for the TMR0 timer to overflow, so each task gets called exactly once per fixed time slice.
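The original is PIC assembly, but the pattern might be rendered in C roughly as follows (all names are hypothetical; dom_next_byte stands in for the DOM walk):

    #include <stdbool.h>

    struct dom_node;   /* opaque DOM node type */

    /* Serialisation progress, persisted between calls: the state that the
       program counter would otherwise hold. */
    struct emit_state {
        const struct dom_node *node;   /* next node to visit               */
        unsigned               phase;  /* e.g. open tag / text / close tag */
    };

    /* Hypothetical hardware/DOM hooks. */
    extern bool uart_tx_full(void);
    extern void uart_tx_put(char c);
    extern bool dom_next_byte(struct emit_state *s, char *out);  /* false = done */

    /* Called once per scheduler tick: emit bytes until the transmit queue
       fills (or the walk finishes), then return with progress saved in *s. */
    void emit_task(struct emit_state *s)
    {
        char c;
        while (!uart_tx_full() && dom_next_byte(s, &c))
            uart_tx_put(c);
    }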

In my implementation, the data is transmitted by a TxEmpty interrupt routine, but it could also be sent by another task.

I guess the 'pattern' here is that one role of the program counter is to hold the current state of the flow of control, and that this role can be abstracted away from the PC to another data structure.

Obviously, this isn't immediately applicable to your larger, higher-level system. But it is a different way of looking at the problem, which may spark your own particular insight.

Good luck!
