Character arrays in C
I'm new to C and have a question about character arrays (strings): when I create a character array in C, do I have to give its size at the same time?
We may not know the size we actually need. Take a client-server program: if the server declares a character array to receive a message from the client, but we don't know the size of the message, we could do it like this:
char buffer[1000];
recv(fd, buffer, 1000, 0);
But what if the actual message is only 10 bytes long? Will that waste a lot of memory?
Comments (6)
Yes, you have to decide the size in advance, even if you use malloc.
When you read from a socket, as in the example, you usually use a reasonably sized buffer and dispatch the data into other structures as soon as you consume it. In any case, 1000 bytes is not much wasted memory, and it is certainly faster than asking a memory manager for one byte at a time :)
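As a rough sketch of that pattern: the helper name `receive_copy` is made up for illustration, and `fd` is assumed to be a connected socket. The code receives into a fixed 1000-byte scratch buffer, then copies only the bytes that actually arrived into a right-sized heap block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: receive one chunk into a fixed scratch buffer,
 * then hand back a heap copy that is exactly as large as the data.
 * Returns the number of bytes received (0 if the peer closed the
 * connection) or -1 on error. When the result is > 0, *out points to a
 * NUL-terminated copy that the caller must free(). */
ssize_t receive_copy(int fd, char **out)
{
    assert(out != NULL);

    char buffer[1000];                    /* scratch space, lives on the stack */
    ssize_t n = recv(fd, buffer, sizeof buffer, 0);

    *out = NULL;
    if (n <= 0)
        return n;                         /* error or orderly shutdown */

    char *copy = malloc((size_t)n + 1);   /* only what was received, plus NUL */
    if (copy == NULL)
        return -1;

    memcpy(copy, buffer, (size_t)n);
    copy[n] = '\0';
    *out = copy;
    return n;
}
```

The 1000-byte array only exists for the duration of the call, so a 10-byte message ends up costing an 11-byte allocation once the function returns. The return value of recv() is what tells you how much of the buffer is meaningful.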
Yes, you have to give the size unless you initialize the char array in its declaration. A better approach for your problem is to determine a suitable buffer size at run time and allocate the memory dynamically.
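To illustrate the two declaration forms, here is a small sketch. The names `scratch` and `greeting` and the two accessor functions are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Size given explicitly: no initializer is required. */
char scratch[1000];

/* Size omitted: the compiler counts the initializer, including the
 * terminating NUL, so this array has 6 elements. */
char greeting[] = "hello";

/* Inside a function, `char unknown[];` is rejected by the compiler:
 * it has neither a size nor an initializer. */

/* sizeof applied to an array (not a pointer) yields its size in bytes. */
size_t scratch_size(void)
{
    return sizeof scratch;
}

size_t greeting_size(void)
{
    /* The last element is the NUL that the string literal added. */
    assert(greeting[sizeof greeting - 1] == '\0');
    return sizeof greeting;
}
```

Either way the size is fixed at compile time. An array whose size is only known while the program runs needs dynamic allocation.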
What you're asking about is how to size a buffer dynamically. This is done with dynamic allocation, for example with malloc(), a memory allocator. Using it gives you an important responsibility, though: when you're done with the buffer, you must return it to the system yourself. If you allocated it with malloc() [or calloc()], you return it with free().
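An illustrative sketch of the malloc()/free() pairing: the helper name `format_greeting` and the greeting text are invented here. The function asks snprintf() how much space the text needs, allocates exactly that much, and leaves the free() to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build "Hello, <name>!" in a buffer whose size is
 * only known at run time. Returns NULL on failure. The caller owns the
 * result and must release it with free(). */
char *format_greeting(const char *name)
{
    assert(name != NULL);

    /* With a NULL buffer and size 0, snprintf only reports how many
     * characters the text needs (not counting the terminating NUL). */
    int needed = snprintf(NULL, 0, "Hello, %s!", name);
    if (needed < 0)
        return NULL;

    char *buffer = malloc((size_t)needed + 1);
    if (buffer == NULL)
        return NULL;                      /* always check malloc's result */

    snprintf(buffer, (size_t)needed + 1, "Hello, %s!", name);
    return buffer;
}
```

A caller would use the returned string and then call free() on it exactly once. Forgetting the free() leaks the memory, and calling it twice is undefined behavior.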
The only problem left is how to determine the size you'll need. If the data you recv() hints at its own size, break the communication into two recv() calls: first receive the minimum size that every packet has, then allocate the full buffer, then receive the rest.
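The two-phase receive could look roughly like this. The wire format is an assumption made for the sketch, not something the post specifies: a 4-byte big-endian length followed by that many payload bytes. The names `recv_all`, `recv_message` and `MAX_MESSAGE_SIZE` are likewise hypothetical.

```c
#include <assert.h>
#include <arpa/inet.h>    /* ntohl */
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>

#define MAX_MESSAGE_SIZE 65536u   /* hypothetical cap on what a peer may send */

/* recv() may return fewer bytes than requested, so loop until `len`
 * bytes have arrived. Returns 0 on success, -1 on error or early close. */
static int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Assumed format: 4-byte big-endian length, then the payload.
 * On success returns a malloc'ed, NUL-terminated payload and stores its
 * length in *len_out; the caller must free() it. Returns NULL on failure. */
char *recv_message(int fd, size_t *len_out)
{
    assert(len_out != NULL);

    uint32_t header;
    if (recv_all(fd, &header, sizeof header) != 0)   /* 1: fixed-size part */
        return NULL;

    size_t len = ntohl(header);
    if (len > MAX_MESSAGE_SIZE)                      /* don't trust the peer */
        return NULL;

    char *payload = malloc(len + 1);                 /* 2: allocate to fit */
    if (payload == NULL)
        return NULL;

    if (recv_all(fd, payload, len) != 0) {           /* 3: receive the rest */
        free(payload);
        return NULL;
    }
    payload[len] = '\0';
    *len_out = len;
    return payload;
}
```

The sketch leaves out retrying after EINTR and any timeout handling, which a real server would probably want.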
When you don't know the exact amount of input data, do as follows:
4. Copy the data from the buffer to the storage.
4.1 If there is not enough room in the storage, reallocate the memory (e.g. to twice its current size).
5. Repeat steps 3 and 4 until "END OF STREAM".
Your storage now contains the data.
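A minimal sketch of that loop, assuming the input comes from a file descriptor and is fetched with read(). The function name `read_all` and the 256-byte sizes are arbitrary choices for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>   /* read */

/* Read everything from `fd` until end of stream.
 * Returns a malloc'ed block holding the data (the caller frees it) and
 * stores the byte count in *len_out. Returns NULL on error. */
char *read_all(int fd, size_t *len_out)
{
    assert(len_out != NULL);

    char buffer[256];               /* small fixed buffer for each fetch */
    size_t capacity = 256;          /* current size of the storage */
    size_t length = 0;              /* bytes stored so far */
    char *storage = malloc(capacity);
    if (storage == NULL)
        return NULL;

    for (;;) {
        ssize_t n = read(fd, buffer, sizeof buffer);   /* fetch into buffer */
        if (n < 0) {
            free(storage);
            return NULL;
        }
        if (n == 0)
            break;                                     /* end of stream */

        if (length + (size_t)n > capacity) {
            /* Not enough room: double the storage. One doubling is always
             * enough here because n <= sizeof buffer <= capacity. */
            size_t new_capacity = capacity * 2;
            char *grown = realloc(storage, new_capacity);
            if (grown == NULL) {
                free(storage);      /* the old block is still valid on failure */
                return NULL;
            }
            storage = grown;
            capacity = new_capacity;
        }

        memcpy(storage + length, buffer, (size_t)n);   /* copy to storage */
        length += (size_t)n;
    }

    *len_out = length;
    return storage;
}
```

Doubling keeps the number of realloc() calls logarithmic in the input size, at the cost of up to half the final block being unused.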
If you don't know the size a priori, then you have no choice but to create the buffer dynamically using malloc (or whatever equivalent mechanism your language of choice provides).
Creating a buffer of size m, but only receiving an input string of size n with n < m, is not a waste of memory but an engineering compromise. If you create your buffer with a size close to the intended input, you risk having to refill the buffer many, many times in those cases where the input turns out to be much larger than the buffer (n >> m). Typically, iterations over the buffer are tied to I/O operations, so you might save some bytes now (which is really nothing on today's hardware) at the expense of potentially increasing problems elsewhere, especially in client-server apps. If we were talking about resource-constrained embedded systems, that would be another matter. You should first worry about getting your algorithms right and solid. Then you worry, if you can, about shaving off a few bytes here and there.
For me, I'd rather create a buffer that is 2 to 10 times larger than the average input (not the smallest input, as in your case, but the average), assuming my input sizes tend to have a low standard deviation. Otherwise, I'd go 20 times the size or more (especially if memory is cheap and doing so minimizes hitting the disk or the NIC).
In the most basic setup, one typically gets the buffer size as a configuration item read from a file (or passed as an argument), falling back to a compile-time default if none is provided. Then you can adjust the size of your buffers according to the observed input sizes. More elaborate algorithms (say TCP) adjust the size of their buffers at run time to better accommodate input whose size might/will change over time.
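The "configured size with a compile-time fallback" idea might be sketched like this. The macro names, the 4096-byte default and the 1 MiB upper bound are all made-up values for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_BUFFER_SIZE 4096UL            /* hypothetical compile-time default */
#define MAX_BUFFER_SIZE     (1024UL * 1024UL) /* hypothetical upper bound */

/* Pick the buffer size: use `arg` (for instance argv[1] or a value read
 * from a config file) when it is a valid positive decimal number within
 * bounds, otherwise fall back to the compile-time default. */
size_t choose_buffer_size(const char *arg)
{
    assert(DEFAULT_BUFFER_SIZE <= MAX_BUFFER_SIZE);

    if (arg == NULL || *arg == '\0')
        return DEFAULT_BUFFER_SIZE;

    char *end;
    errno = 0;
    unsigned long value = strtoul(arg, &end, 10);

    /* Reject trailing junk, overflow, zero, and oversized requests. */
    if (errno != 0 || *end != '\0' || value == 0 || value > MAX_BUFFER_SIZE)
        return DEFAULT_BUFFER_SIZE;

    return (size_t)value;
}
```

The result would then be passed to malloc() once at start-up, so tuning the buffer size does not require recompiling.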
Even if you use malloc, you still have to specify the size first! So you pick a number large enough to hold the message.
If the message turns out smaller or larger, you can reallocate the buffer to release the unused space or to gain more.
Note: make sure to include the stdlib.h header.
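A sketch of what that could look like, with the caveat that the helper name `store_message` and the 1000-byte starting size (borrowed from the question) are only illustrative:

```c
#include <assert.h>
#include <stdlib.h>   /* malloc, realloc, free */
#include <string.h>

#define INITIAL_SIZE 1000   /* generous first guess, as in the question */

/* Hypothetical helper: start with INITIAL_SIZE bytes, then resize the
 * block to exactly fit `len` bytes of `msg` plus a terminating NUL.
 * Returns NULL on failure; the caller must free() the result. */
char *store_message(const char *msg, size_t len)
{
    assert(msg != NULL);

    char *buffer = malloc(INITIAL_SIZE);
    if (buffer == NULL)
        return NULL;

    if (len + 1 != INITIAL_SIZE) {
        /* Shrink to release unused space, or grow to make room. */
        char *resized = realloc(buffer, len + 1);
        if (resized == NULL) {
            free(buffer);   /* on failure the old block is still valid */
            return NULL;
        }
        buffer = resized;   /* realloc may have moved the block */
    }

    memcpy(buffer, msg, len);
    buffer[len] = '\0';
    return buffer;
}
```

Assigning the result of realloc() to a temporary first matters: writing `buffer = realloc(buffer, ...)` directly would lose the only pointer to the old block if the call fails.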