Using istringstream to process a variable-length block of memory

Posted 2024-08-21 19:23:49

I'm trying to use istringstream to recreate an encoded wstring from some memory. The memory is laid out as follows:

1 byte to indicate the start of the wstring encoding. Arbitrarily this is '!'.
n bytes to store the character length of the string in text format, e.g. 0x31, 0x32, 0x33 would be "123", i.e. a 123-character string
1 byte separator (the space character)
n bytes holding the wchar_t values that make up the string, where each wchar_t is 2 bytes.

For example, the byte sequence:

21 36 20 66 00 6f 00 6f 00

is "!6 f.o.o." (using dots to represent char 0)
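To make the layout concrete, here is a minimal sketch of an encoder that produces it. The name `encode` is hypothetical, and two assumptions are baked in: the count is a byte count (the "!6 f.o.o." example implies that, since "foo" is 3 characters but 6 bytes), and each 2-byte unit is stored low byte first.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encodes 16-bit code units as
//   '!' + decimal byte count + ' ' + little-endian 2-byte units.
// Assumption: the count is in bytes, as the "!6 f.o.o." example implies.
std::string encode(const std::u16string& text) {
    std::string out = "!";
    out += std::to_string(text.size() * 2);  // byte count as decimal text
    out += ' ';                              // separator
    for (char16_t unit : text) {
        out += static_cast<char>(unit & 0xFF);         // low byte first
        out += static_cast<char>((unit >> 8) & 0xFF);  // then high byte
    }
    return out;
}
```

Under those assumptions, `encode(u"foo")` yields the nine bytes 21 36 20 66 00 6f 00 6f 00 shown above.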

All I've got is a char* pointer (let's call it pData) to the start of the memory block with this encoded data in it. What's the 'best' way to consume the data to reconstruct the wstring ("foo"), and also move the pointer to the next byte past the end of the encoded data?

I was toying with using an istringstream to allow me to consume the prefix byte, the length of the string, and the separator. After that I can calculate how many bytes to read and use the stream's read() function to insert into a suitably-resized wstring. The problem is, how do I get this memory into the istringstream in the first place? I could try constructing a string first and then pass that into the istringstream, e.g.

std::string s((const char*)pData);

but that doesn't work because the string is truncated at the first null byte. Or, I could use the string's other constructor to explicitly state how many bytes to use:

std::string s((const char*)pData, len);

which works, but only if I know what len is beforehand. That's tricky given that the data is variable length.

This seems like a really solvable problem. Does my rookie status with strings and streams mean I'm overlooking an easy solution? Or am I barking up the wrong tree with the whole string approach?

梦醒灬来后我 2024-08-28 19:23:49

Try setting your stringstream's rdbuf:

char* buffer = something;
std::stringstream ss;

// rdbuf() returns the stream's underlying std::stringbuf
std::stringbuf* pbuf = ss.rdbuf();
pbuf->sputn(buffer, bufferlength); // copies bufferlength bytes into the stream
// use your ss

Edit: I see that this solution will have a similar problem to your string(char*, len) situation. Can you tell us more about your buffer object? If you don't know the length, and it isn't null terminated, it's going to be very hard to deal with.
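One way around the unknown total length is to note that the header itself ('!', the digits, the space) is plain text with no null bytes and a bounded size, so only the header needs to go through the stream. The sketch below is not part of the original answer; `parse_header` and its parameters are hypothetical names, and it assumes the count is a byte count as in the question's example.

```cpp
#include <cstddef>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: parses the textual header at pData.
// On success, stores the payload size and the header length and returns true.
bool parse_header(const char* pData, std::size_t& payloadBytes,
                  std::size_t& headerBytes) {
    // A std::size_t never needs more than digits10 + 1 decimal digits.
    const std::size_t maxDigits = std::numeric_limits<std::size_t>::digits10 + 1;
    if (pData[0] != '!') return false;

    // Walk over the digits; stopping at the first non-digit means we never
    // look past the separator of a well-formed header.
    std::size_t end = 1;
    while (end <= maxDigits && pData[end] >= '0' && pData[end] <= '9') ++end;
    if (end == 1 || pData[end] != ' ') return false;

    // Only the header goes into the stream, so its length is now known.
    std::istringstream in(std::string(pData, end + 1));
    char bang = 0;
    in.get(bang);
    in >> payloadBytes;       // fails (sets failbit) if the value overflows
    if (!in) return false;
    headerBytes = end + 1;    // '!' + digits + ' '
    return true;
}
```

For the bytes 21 36 20 66 00 6f 00 6f 00 this reports a 6-byte payload after a 3-byte header, so the payload starts at `pData + headerBytes`.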

凉城凉梦凉人心 2024-08-28 19:23:49

Is it possible to modify how you encode the length, and make that a fixed size?

unsigned long size = 6; // known string length
char* buffer = new char[1 + sizeof(unsigned long) + 1 + size];
buffer[0] = '!';
memcpy(buffer+1, &size, sizeof(unsigned long));

buffer should hold the start indicator (1 byte), the actual size (size of unsigned long), the delimiter (1 byte) and the text itself (size).
This way, you can get the size fairly easily, then set the pointer just past the overhead and pass the len variable to the string constructor.

unsigned long len;
memcpy(&len, pData+1, sizeof(unsigned long)); // +1 to skip the start indicator
// len now contains 6
char* actualData = pData + 1 + sizeof(unsigned long) + 1;
std::string s(actualData, len);

It's low level and error prone :) (for instance if you read anything that isn't encoded the way that you expect it to be, the len can get pretty big), but you avoid dynamically reading the length of the string.
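The "len can get pretty big" risk can be reduced if the caller knows how many bytes are readable. The following is only a sketch of that idea for the fixed-width layout described above; `read_fixed` and the `available` parameter are hypothetical additions, not part of the original answer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper for the fixed-width layout:
//   '!' + unsigned long length + ' ' + text.
// `available` is the number of readable bytes at pData (an added assumption).
bool read_fixed(const char* pData, std::size_t available, std::string& out) {
    const std::size_t overhead = 1 + sizeof(unsigned long) + 1;
    if (available < overhead || pData[0] != '!') return false;

    unsigned long len = 0;
    std::memcpy(&len, pData + 1, sizeof len);  // memcpy avoids unaligned reads
    if (pData[1 + sizeof len] != ' ') return false;

    // Reject lengths that would run past the end of the block.
    if (len > available - overhead) return false;

    out.assign(pData + overhead, len);
    return true;
}
```

A corrupted length field is then rejected instead of causing a read past the end of the block.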

铁轨上的流浪者 2024-08-28 19:23:49

It seems like something on this order should work:

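#include <cstring>
#include <string>
std::wstring make_string(char const *input) {
    if (*input != '!')
        return std::wstring();
    std::size_t length = *++input - '0'; // single-digit byte count only, e.g. '6' -> 6
    input += 2;                          // step past the digit and the space
    // Assumes 2-byte wchar_t, as in the question.
    std::wstring result(length / sizeof(wchar_t), L'\0');
    std::memcpy(&result[0], input, result.size() * sizeof(wchar_t));
    return result;
}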

The difficult part is dealing with the variable length of the size. Without something to specify the length it's hard to guess when to stop treating the data as specifying the length of the string.

As for moving the pointer, if you're going to do it inside a function, you'll need to pass a reference to the pointer, but otherwise it's a simple matter of adding the size you found to the pointer you received.
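As a rough illustration of the pointer-by-reference idea, the sketch below reads a multi-digit count with `std::strtoul`, decodes the payload, and advances the caller's pointer. `read_wstring` is a hypothetical name, and the code assumes the count is in bytes and the payload is little-endian 2-byte units, as in the question's example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: decodes one "!<bytes> <payload>" record starting at p
// and advances p to the first byte after it. Leaves p untouched on failure.
bool read_wstring(const char*& p, std::wstring& out) {
    if (p[0] != '!' || p[1] < '0' || p[1] > '9') return false;

    // strtoul stops at the first non-digit, which should be the separator.
    char* afterDigits = nullptr;
    const unsigned long bytes = std::strtoul(p + 1, &afterDigits, 10);
    if (*afterDigits != ' ') return false;

    const unsigned char* payload =
        reinterpret_cast<const unsigned char*>(afterDigits + 1);
    out.clear();
    for (unsigned long i = 0; i + 1 < bytes; i += 2) {
        // Combine low and high byte explicitly, so the result does not depend
        // on sizeof(wchar_t) or on the alignment of the buffer.
        out += static_cast<wchar_t>(payload[i] | (payload[i + 1] << 8));
    }

    p = afterDigits + 1 + bytes;  // move past header and payload
    return true;
}
```

Because `p` is taken by reference, several records stored back to back can be read by calling the function in a loop until it returns false.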

两相知 2024-08-28 19:23:49

It's tempting to (ab)use the (deprecated but nevertheless standard) std::istrstream here:

// Maximum size to read is 
// 1 for the exclamation mark
// Digits for the character count (digits10 + 1)
// 1 for the space
const std::streamsize max_size = 3 + std::numeric_limits<std::size_t>::digits10;

std::istrstream s(buf, max_size);

if (std::istream::traits_type::to_char_type(s.get()) != '!'){
    throw "missing exclamation";
}

std::size_t size;
s >> size;

if (std::istream::traits_type::to_char_type(s.get()) != ' '){
    throw "missing space";
}

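// The payload starts at the stream's current read position, not at the start of buf.
std::wstring result(reinterpret_cast<const wchar_t*>(buf + static_cast<std::streamoff>(s.tellg())), size/sizeof(wchar_t));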
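If the deprecated std::istrstream is unwelcome, a similar effect can be had with a small custom stream buffer that points its get area at the existing memory. This is a sketch rather than a drop-in replacement: `membuf`, `consumed` and `read_header` are hypothetical names, and `available` is an upper bound the caller supplies (for example the `max_size` computed above).

```cpp
#include <cstddef>
#include <istream>
#include <streambuf>

// Hypothetical read-only stream buffer over existing memory (no copy is made).
class membuf : public std::streambuf {
public:
    membuf(const char* data, std::size_t size) {
        // setg takes non-const pointers; this class only ever reads through them.
        char* begin = const_cast<char*>(data);
        setg(begin, begin, begin + size);
    }
    // Number of bytes consumed from the block so far.
    std::size_t consumed() const {
        return static_cast<std::size_t>(gptr() - eback());
    }
};

// Parses "!<count> " through an istream and reports where the payload begins.
bool read_header(const char* data, std::size_t available,
                 std::size_t& payloadBytes, std::size_t& headerBytes) {
    membuf buf(data, available);
    std::istream in(&buf);
    if (in.get() != '!') return false;        // start marker
    if (!(in >> payloadBytes)) return false;  // textual count
    if (in.get() != ' ') return false;        // separator
    headerBytes = buf.consumed();
    return true;
}
```

After a successful call the payload begins at `data + headerBytes`, and the next record, if any, begins at `data + headerBytes + payloadBytes`, assuming the count is a byte count.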
