Variable-sized struct in C++
Is this the best way to make a variable sized struct in C++? I don't want to use vector because the length doesn't change after initialization.
struct Packet
{
unsigned int bytelength;
unsigned int data[];
};
Packet* CreatePacket(unsigned int length)
{
Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
output->bytelength = length;
return output;
}
Edit: renamed variable names and changed code to be more correct.
You probably want something lighter than a vector for high performance. You also want to be very specific about the size of your packet so that it is cross-platform. But you don't want to worry about memory leaks either.
Fortunately, the Boost library does most of the hard work:
There's nothing whatsoever wrong with using vector for arrays of unknown size that will be fixed after initialization. IMHO, that's exactly what vectors are for. Once you have it initialized, you can pretend the thing is an array, and it should behave the same (including time behavior).
Disclaimer: I wrote a small library to explore this concept: https://github.com/ppetr/refcounted-var-sized-class
We want to allocate a single block of memory for a data structure of type T and an array of elements of type A. In most cases A will be just char.
For this, let's define a RAII class to allocate and deallocate such a memory block. This poses several difficulties:
[...] chars and place the structure in the block ourselves. For this, std::aligned_storage will be helpful.
[...] alignof(T) - 1 bytes and then use std::align.
Once we've dealt with the problem of memory allocation, we can define a wrapper class that initializes T and an array of A in an allocated memory block.
This type is moveable, but obviously not copyable. We could provide a function to convert it into a shared_ptr with a custom deleter. But this will need to internally allocate another small block of memory for a reference counter (see also How is the std::tr1::shared_ptr implemented?).
This can be solved by introducing a specialized data type that will hold our Placement, a reference counter and a field with the actual data type in a single structure. For more details see my refcount_struct.h.
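The over-allocate-and-align step described in this answer can be sketched roughly as follows. This is not the code from the linked library; `Header`, `Block`, `AllocateBlock` and `FreeBlock` are hypothetical names invented for illustration, and the array element type is assumed to be `char`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical header type standing in for T.
struct Header {
    std::size_t count;
    double weight;  // gives the header a stricter alignment than char
};

// One raw block holding a T followed by `n` chars.
template <typename T>
struct Block {
    void* raw;    // pointer that must be handed back to ::operator delete
    T* object;    // aligned T constructed inside the block
    char* array;  // `n` chars stored right after the T
};

template <typename T>
Block<T> AllocateBlock(std::size_t n) {
    // Over-allocate by alignof(T) - 1 bytes so an aligned T always fits.
    std::size_t space = sizeof(T) + (alignof(T) - 1) + n;
    void* raw = ::operator new(space);
    void* cursor = raw;
    // std::align bumps `cursor` to the next address suitable for T.
    void* aligned = std::align(alignof(T), sizeof(T), cursor, space);
    assert(aligned != nullptr);  // cannot fail given the over-allocation
    T* object = new (aligned) T();  // value-initialize the header in place
    char* array = static_cast<char*>(aligned) + sizeof(T);
    return Block<T>{raw, object, array};
}

template <typename T>
void FreeBlock(Block<T>& block) {
    block.object->~T();
    ::operator delete(block.raw);
    block = Block<T>{nullptr, nullptr, nullptr};
}
```

Since `::operator new` already returns storage aligned for ordinary types, the extra bytes only matter when the raw block comes from a source with weaker alignment guarantees; the sketch keeps them to mirror the description above.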
You should declare a pointer, not an array with an unspecified length.
Some thoughts on what you're doing:
Using the C-style variable length struct idiom allows you to perform one free store allocation per packet, which is half as many as would be required if struct Packet contained a std::vector. If you are allocating a very large number of packets, then performing half as many free store allocations/deallocations may very well be significant. If you are also doing network accesses, then the time spent waiting for the network will probably be more significant.
This structure represents a packet. Are you planning to read/write from a socket directly into a struct Packet? If so, you probably need to consider byte order. Are you going to have to convert from host to network byte order when sending packets, and vice versa when receiving packets? If so, then you could byte-swap the data in place in your variable length struct. If you converted this to use a vector, it would make sense to write methods for serializing / deserializing the packet. These methods would transfer it to/from a contiguous buffer, taking byte order into account.
Likewise, you may need to take alignment and packing into account.
You can never subclass Packet. If you did, then the subclass's member variables would overlap with the array.
Instead of malloc and free, you could use Packet* p = static_cast<Packet*>(::operator new(size)) and ::operator delete(p), since struct Packet is a POD type and does not currently benefit from having its default constructor and its destructor called. The (potential) benefit of doing so is that the global operator new handles errors using the global new-handler and/or exceptions, if that matters to you.
It is possible to make the variable length struct idiom work with the new and delete operators, but not well. You could create a custom operator new that takes an array length by implementing static void* operator new(size_t size, unsigned int bitlength), but you would still have to set the bitlength member variable. If you did this with a constructor, you could use the slightly redundant expression Packet* p = new(len) Packet(len) to allocate a packet. The only benefit I see compared to using global operator new and operator delete would be that clients of your code could just call delete p instead of ::operator delete(p). Wrapping the allocation/deallocation in separate functions (instead of calling delete p directly) is fine as long as they get called correctly.
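A minimal sketch of the class-specific operator new idea from this answer, under a few assumptions: the member is called `length` here rather than bitlength, and the elements are reached through a hypothetical `data()` accessor that points just past the header, so the non-standard `data[]` declaration is not needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

struct Packet {
    unsigned int length;  // number of unsigned int elements after the header

    explicit Packet(unsigned int len) : length(len) {}

    // Elements live directly after the header in the same allocation.
    unsigned int* data() { return reinterpret_cast<unsigned int*>(this + 1); }

    // Placement form: receives the element count from `new (len) Packet(len)`.
    static void* operator new(std::size_t size, unsigned int len) {
        assert(len <= (SIZE_MAX - size) / sizeof(unsigned int));  // no overflow
        return ::operator new(size + len * sizeof(unsigned int));
    }
    // Matching placement delete, called only if the constructor throws.
    static void operator delete(void* p, unsigned int) { ::operator delete(p); }
    // Ordinary delete, called by `delete p`.
    static void operator delete(void* p) { ::operator delete(p); }
};

// The trailing elements start at sizeof(Packet), which must suit unsigned int.
static_assert(sizeof(Packet) % alignof(unsigned int) == 0,
              "elements after the header must be aligned");
```

With this in place, `Packet* p = new (4u) Packet(4u);` allocates the header and four elements in one block, and a plain `delete p;` releases it. The length has to be passed twice, which is the redundancy the answer points out.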
If you never add a constructor/destructor, assignment operators or virtual functions to your structure, using malloc/free for allocation is safe.
It's frowned upon in C++ circles, but I consider the usage of it okay if you document it in the code.
Some comments to your code:
If I remember right, declaring an array without a length is non-standard. It works on most compilers but may give you a warning. If you want to be compliant, declare your array with a length of 1.
This works, but you don't take the size of the structure into account. The code will break once you add new members to your structure. Better do it this way:
And write a comment into your packet structure definition that data must be the last member.
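The code that originally accompanied this answer is not preserved here. As a rough sketch of the same idea (size the allocation from sizeof(Packet) so that adding members does not break it), assuming a hypothetical `data()` accessor in place of the trailing array member and a hypothetical `DestroyPacket` helper:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

struct Packet {
    unsigned int bytelength;
    // New members can be added above this line. The elements themselves are
    // stored right after the struct, so they always come last in memory.
    unsigned int* data() { return reinterpret_cast<unsigned int*>(this + 1); }
};

// The elements start at offset sizeof(Packet), which must suit unsigned int.
static_assert(sizeof(Packet) % alignof(unsigned int) == 0,
              "elements after the header must be aligned");

Packet* CreatePacket(unsigned int length) {
    assert(length <= (SIZE_MAX - sizeof(Packet)) / sizeof(unsigned int));
    // Header size plus payload size, instead of (length + 1) * sizeof(unsigned int).
    void* memory = std::malloc(sizeof(Packet) + length * sizeof(unsigned int));
    if (memory == nullptr) return nullptr;
    Packet* output = static_cast<Packet*>(memory);
    output->bytelength = length;
    return output;
}

void DestroyPacket(Packet* packet) { std::free(packet); }
```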
Btw - allocating the structure and the data with a single allocation is a good thing. You halve the number of allocations that way, and you improve the locality of data as well. This can improve the performance quite a bit if you allocate lots of packets.
Unfortunately, C++ does not provide a good mechanism to do this, so you often end up with such malloc/free hacks in real-world applications.
This is OK (and was standard practice for C).
But this is not a good idea for C++.
This is because the compiler generates a whole set of other methods automatically for you around the class. These methods do not understand that you have cheated.
For example:
Use std::vector<>; it is much safer and works correctly.
I would also bet it is just as efficient as your implementation after the optimizer kicks in.
Alternatively boost contains a fixed size array:
http://www.boost.org/doc/libs/1_38_0/doc/html/array.html
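To illustrate the point about compiler-generated methods, here is a small sketch. `RawPacket`, `VectorPacket` and `CopyAndModify` are hypothetical names, not taken from the answer. The implicit copy of the C-style struct only knows about the header, while the vector-backed struct copies every element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C-style header: the payload would live outside what the compiler knows about,
// so `RawPacket b = a;` copies sizeof(RawPacket) bytes and none of the payload.
struct RawPacket {
    unsigned int bytelength;
};

// Vector-backed packet: the generated copy constructor, assignment operator
// and destructor all handle the payload correctly.
struct VectorPacket {
    std::vector<unsigned int> data;
    explicit VectorPacket(std::size_t length) : data(length) {}
};

// Copies a packet and changes the copy; the original stays untouched.
VectorPacket CopyAndModify(const VectorPacket& original) {
    VectorPacket copy = original;  // deep copy of all elements
    assert(!copy.data.empty());
    copy.data[0] = 99u;
    return copy;
}
```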
You can use the "C" method if you want but for safety make it so the compiler won't try to copy it:
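The answer's own code is missing here. One way this could look, as a sketch only: copying is disabled with deleted members (C++11; older code would declare them private instead), and the hypothetical `Create`/`Destroy` functions are the only way to obtain or release a packet.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <type_traits>

class Packet {
public:
    // Allocates the header and `length` elements in a single malloc block.
    static Packet* Create(unsigned int length) {
        assert(length <= (SIZE_MAX - sizeof(Packet)) / sizeof(unsigned int));
        void* memory = std::malloc(sizeof(Packet) + length * sizeof(unsigned int));
        if (memory == nullptr) return nullptr;
        return new (memory) Packet(length);  // construct the header in place
    }
    static void Destroy(Packet* packet) { std::free(packet); }

    // The compiler must not copy the header without its trailing elements.
    Packet(const Packet&) = delete;
    Packet& operator=(const Packet&) = delete;

    unsigned int bytelength() const { return bytelength_; }
    unsigned int* data() { return reinterpret_cast<unsigned int*>(this + 1); }

private:
    explicit Packet(unsigned int length) : bytelength_(length) {}
    unsigned int bytelength_;
};

static_assert(!std::is_copy_constructible<Packet>::value, "copying is disabled");
static_assert(!std::is_copy_assignable<Packet>::value, "assignment is disabled");
static_assert(sizeof(Packet) % alignof(unsigned int) == 0,
              "elements after the header must be aligned");
```

Callers still have to pair `Create` with `Destroy`; calling `delete` on the pointer would be a mismatch with malloc.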
I'd probably just stick with using a vector<> unless the minimal extra overhead (probably a single extra word or pointer over your implementation) is really posing a problem. There's nothing that says you have to resize() a vector once it's been constructed.
However, there are several advantages to going with vector<>:
[...]
If you really want to prevent the array from growing once constructed, you might want to consider having your own class that inherits from vector<> privately or has a vector<> member, and that exposes only those bits of vector that you want clients to be able to use, via methods that just thunk to the vector methods. That should help get you going quickly, with pretty good assurance that leaks and the like are not there. If you do this and find that the small overhead of vector is not working for you, you can reimplement that class without the help of vector and your client code shouldn't need to change.
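A minimal sketch of such a wrapper, using a vector<> member. `FixedPacket` is a hypothetical name, and the set of forwarded methods is only an example of what a client might need.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class FixedPacket {
public:
    explicit FixedPacket(std::size_t length) : data_(length) {}

    // Assignment from a packet of another size would change the length,
    // so this sketch disables it.
    FixedPacket(const FixedPacket&) = default;
    FixedPacket& operator=(const FixedPacket&) = delete;

    // Only size and element access are forwarded; there is no resize or push_back.
    std::size_t size() const { return data_.size(); }
    unsigned int& operator[](std::size_t index) {
        assert(index < data_.size());
        return data_[index];
    }
    const unsigned int& operator[](std::size_t index) const {
        assert(index < data_.size());
        return data_[index];
    }

private:
    std::vector<unsigned int> data_;  // could later be swapped for a raw block
};
```

Because clients only see `size()` and `operator[]`, the storage behind them can be replaced later without touching client code.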
There are already many good thoughts mentioned here, but one is missing. Flexible arrays are part of C99 and thus aren't part of C++. Although some C++ compilers may provide this functionality, there is no guarantee of that. If you find a way to use them in C++ in an acceptable way, but you have a compiler that doesn't support them, you can perhaps fall back to the "classical" way.
If you are truly doing C++, there is no practical difference between a class and a struct except the default member visibility - classes have private visibility by default while structs have public visibility by default. The following are equivalent:
The point is, you don't need the CreatePacket(). You can simply initialize the struct object with a constructor.
A few things to note. In C++, use new instead of malloc. I've taken some liberty and changed bitlength to bytelength. If this class represents a network packet, you'll be much better off dealing with bytes instead of bits (in my opinion). The data array is an array of unsigned char, not unsigned int. Again, this is based on my assumption that this class represents a network packet. The constructor allows you to create a Packet like this:
The destructor is called automatically when the Packet instance goes out of scope and prevents a memory leak.
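The code from this answer is not preserved here, so the following is only a guess at its general shape: a struct with a constructor that allocates an unsigned char array using new[] and a destructor that releases it. The `at` accessor and the deleted copy operations are additions for this sketch, not something the answer mentions.

```cpp
#include <cassert>
#include <cstddef>

struct Packet {
    std::size_t bytelength;
    unsigned char* data;

    // Allocates `length` zero-initialized bytes.
    explicit Packet(std::size_t length)
        : bytelength(length), data(new unsigned char[length]()) {}

    // Runs automatically when the Packet goes out of scope.
    ~Packet() { delete[] data; }

    // A copied Packet would delete `data` twice, so copying is disabled here.
    Packet(const Packet&) = delete;
    Packet& operator=(const Packet&) = delete;

    // Bounds-checked element access (checked only when assertions are enabled).
    unsigned char& at(std::size_t index) {
        assert(index < bytelength);
        return data[index];
    }
};
```

A packet is then created with `Packet p(8);` and needs no explicit cleanup. Note that this version uses two allocations when the Packet itself lives on the free store, unlike the single-block idiom in the question.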