Using read() directly into a C++ std::vector
I'm wrapping up user-space Linux socket functionality in some C++ for an embedded system (yes, this is probably reinventing the wheel again).
I want to offer a read and write implementation using a vector.
Doing the write is pretty easy: I can just pass &myvec[0] and avoid unnecessary copying. I'd like to do the same and read directly into a vector, rather than reading into a char buffer and then copying all of that into a newly created vector.
Now, I know how much data I want to read, and I can allocate appropriately (vec.reserve()). I can also read into &myvec[0], though this is probably a VERY BAD IDEA. Obviously, doing this doesn't allow myvec.size() to return anything sensible. Is there any way of doing this that:
- Doesn't completely feel yucky from a safety/C++ perspective
- Doesn't involve two copies of the data block: once from kernel to user space, and once from a C char * style buffer into a C++ vector.

Comments (4)
Use resize() instead of reserve(). This will set the vector's size correctly, and after that, &myvec[0] is, as usual, guaranteed to point to a contiguous block of memory.
Edit: Using &myvec[0] as a pointer to the underlying array for both reading and writing is safe and guaranteed to work by the C++ standard. Herb Sutter has written about this as well.
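
For what it's worth, here is a minimal sketch of the resize() approach for the "I know how much data I want to read" case. The helper name read_exact and its bool return convention are made up for illustration, and it assumes a blocking POSIX descriptor. It uses buf.data() (C++11), which is equivalent to &buf[0] for a non-empty vector. The loop is there because read() is allowed to return fewer bytes than requested.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <vector>
#include <unistd.h>

// Hypothetical helper: tries to fill `buf` with exactly `count` bytes from `fd`.
// Returns true if all `count` bytes arrived, false on EOF or error.
// On return, buf.size() is the number of bytes actually received.
bool read_exact(int fd, std::vector<unsigned char>& buf, std::size_t count) {
    assert(fd >= 0);                 // precondition: an open descriptor
    buf.resize(count);               // size() == count, so the elements are accessible
    std::size_t got = 0;
    while (got < count) {
        // buf.data() + got stays inside the vector because got < buf.size().
        ssize_t n = ::read(fd, buf.data() + got, count - got);
        if (n > 0) {
            got += static_cast<std::size_t>(n);
        } else if (n == 0) {
            break;                   // end of stream before `count` bytes
        } else if (errno != EINTR) {
            break;                   // real error; errno describes it
        }
        // n < 0 with errno == EINTR: interrupted by a signal, just retry
    }
    buf.resize(got);                 // make size() reflect what was received
    return got == count;
}
```

There is no intermediate char buffer here: the kernel copies straight into the vector's storage.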

I'll just add a short clarification, because the answer was already given. resize() with an argument greater than the current size adds elements to the collection and value-initializes them. If you create a vector of unsigned char and then resize it, all the unsigned chars are initialized to 0. By the way, you can do the same with the constructor that takes a size.
So in theory it may be a little slower than a raw array, but if the alternative is to copy the array anyway, it's better.
reserve() only prepares the memory, so that no reallocation is needed when new elements are added to the collection, but you can't access that memory.
You have to find out for yourself how many elements were written to your vector. The vector won't know anything about it.
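
To make the difference concrete, here is a small sketch comparing the three ways of getting n bytes' worth of vector. The function names are made up for illustration; the asserts just document what the standard guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Create an empty vector, then resize: n elements, each value-initialized to 0.
std::vector<unsigned char> make_by_resize(std::size_t n) {
    std::vector<unsigned char> v;
    v.resize(n);
    assert(v.size() == n);
    return v;
}

// The sizing constructor gives the same result in one step.
std::vector<unsigned char> make_by_constructor(std::size_t n) {
    std::vector<unsigned char> v(n);
    assert(v.size() == n);
    return v;
}

// reserve() only allocates: capacity grows, but size() stays 0,
// so v[0] through v[n-1] are not valid elements yet.
std::vector<unsigned char> make_by_reserve(std::size_t n) {
    std::vector<unsigned char> v;
    v.reserve(n);
    assert(v.capacity() >= n && v.empty());
    return v;
}
```

Only the first two are suitable as a target for read(); the third has the memory but no elements.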

Assuming it's a POD struct, call resize rather than reserve. You can define an empty default constructor if you really don't want the data zeroed out before you fill the vector.
It's somewhat low level, but the semantics of construction of POD structs are purposely murky. If memmove is allowed to copy-construct them, I don't see why a socket read shouldn't.
EDIT: ah, bytes, not a struct. Well, you can use the same trick and define a struct with just a char and a default constructor which neglects to initialize it... if I'm guessing correctly that you care, and that's why you wanted to call reserve instead of resize in the first place.
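
A rough sketch of that trick, assuming C++11 or later (where resize(n) constructs each new element with T()). RawByte and read_raw are hypothetical names, not from any library. Whether the compiler really drops the initialization loop is up to the optimizer, so treat this as something to measure rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <unistd.h>

// Hypothetical one-byte wrapper. Because the default constructor is
// user-provided and empty, constructing a RawByte leaves `value`
// uninitialized instead of zeroing it.
struct RawByte {
    char value;
    RawByte() {}
};
static_assert(sizeof(RawByte) == 1, "RawByte must be exactly one byte");

// Hypothetical helper: reads up to max_bytes from fd into buf.
// Returns what read() returned; buf.size() ends up as the bytes read.
ssize_t read_raw(int fd, std::vector<RawByte>& buf, std::size_t max_bytes) {
    assert(fd >= 0);                      // precondition: an open descriptor
    buf.resize(max_bytes);                // runs RawByte(), which writes nothing
    ssize_t n = ::read(fd, buf.data(), buf.size());
    buf.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
    return n;
}
```

The cost is that callers now deal with a vector<RawByte> rather than a vector<char>, which may or may not be worth it for the saved memset.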

If you want the vector to reflect the amount of data read, call resize() twice. Once before the read, to give yourself space to read into. Once again after the read, to set the size of the vector to the number of bytes actually read. reserve() is no good, since calling reserve doesn't give you permission to access the memory allocated for the capacity.
The first resize() will zero the elements of the vector, but this is unlikely to create much of a performance overhead. If it does, then you could try Potatoswatter's suggestion, or you could give up on the size of the vector reflecting the size of the data read, and instead just resize() it once, then re-use it exactly as you would an allocated buffer in C.
Performance-wise, if you're reading from a socket in user mode, most likely you can easily handle data as fast as it comes in. Maybe not if you're connecting to another machine on a gigabit LAN, or if your machine is frequently running at 100% CPU or 100% memory bandwidth. A bit of extra copying or memsetting is no big deal if you are eventually going to block on a read call anyway.
Like you, I'd want to avoid the extra copy in user space, but not for performance reasons, just because if I don't do it, I don't have to write the code for it...
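
The two-resize idea might look something like this. It's only a sketch: read_into is a made-up name, and it does a single read() with no retry on EINTR or short reads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#include <unistd.h>

// Hypothetical helper: one read() of at most max_bytes into buf.
// Returns read()'s result (-1 on error, 0 on EOF, else bytes read).
ssize_t read_into(int fd, std::vector<unsigned char>& buf, std::size_t max_bytes) {
    assert(fd >= 0);                 // precondition: an open descriptor
    buf.resize(max_bytes);           // first resize: make room to read into
    ssize_t n = ::read(fd, buf.data(), buf.size());
    // Second resize: shrink to what actually arrived (0 on error or EOF).
    buf.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
    return n;
}
```

Shrinking with resize() doesn't release the allocation, so passing the same vector in again for the next read normally avoids a new allocation, though the first resize() will zero the regrown part each time.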