Is it safe to remove the const-ness of a stringstream result when performing binary serialization?

Posted 2024-11-29 07:07:14 · 855 words · 5 views · 0 comments

I have a situation in which I'm performing binary serialization of some items and writing them to an opaque byte buffer:

int SerializeToBuffer(unsigned char* buffer)
{
    stringstream ss;
    vector<Serializable> items = GetSerializables();
    string serializedItem("");
    short len = 0;
    for(int i = 0; i < items.size(); ++i)
    {
        serializedItem = items[i].Serialize();
        len = serializedItem.length();

        // Write the bytes to the stream
        ss.write((char*)&len, 2);
        ss.write(serializedItem.c_str(), len);

    }
    buffer = reinterpret_cast<unsigned char*>(
                const_cast<char*>(ss.str().c_str()));
    return items.size();
}

Is it safe to remove the const-ness from ss.str().c_str(), reinterpret_cast the result as unsigned char*, and then assign it to buffer?

Note: the code is just to give you an idea of what I'm doing, it doesn't necessarily compile.

Comments (5)

微凉 2024-12-06 07:07:14

No. Casting away the const-ness of an inherently constant string and then writing through the resulting pointer results in undefined behavior.

const char* c_str ( ) const;
Get C string equivalent

Generates a null-terminated sequence of characters (c-string) with the same content as the string object and returns it as a pointer to an array of characters.
A terminating null character is automatically appended.
The returned array points to an internal location with the required storage space for this sequence of characters plus its terminating null-character, but the values in this array should not be modified in the program and are only guaranteed to remain unchanged until the next call to a non-constant member function of the string object.

一梦浮鱼 2024-12-06 07:07:14

Short answer: No

Long answer: No. You really can't do that. The internal buffers of those objects belong to the objects. Taking a reference to an internal structure is definitely a no-no and breaks encapsulation. In any case, those objects (with their internal buffers) will be destroyed at the end of the function, and your buffer variable will point at memory that has already been released.

Use of const_cast<> is usually a sign that something in your design is wrong.
Use of reinterpret_cast<> usually means you are doing it wrong (or you are doing some very low level stuff).

You probably want to write something like this:

std::ostream& operator<<(std::ostream& stream, Data const& serializable)
{
    // Text output:
    return stream << serializable.internalData;

    // Or, to write binary data instead, replace the line above with:
    //
    // stream.write(reinterpret_cast<char const*>(&serializable.internalData),
    //              sizeof(serializable.internalData));
    // return stream;
}

小嗷兮 2024-12-06 07:07:14

This is unsafe, partially because you're stripping off const, but more importantly because you're returning a pointer to an array that will be reclaimed when the function returns.

When you write

ss.str().c_str()

the return value of c_str() is only valid as long as the string object you invoked it on still exists. The signature of stringstream::str() is

string stringstream::str() const;

which means that it returns a temporary string object. Consequently, as soon as the line

ss.str().c_str()

finishes executing, the temporary string object is reclaimed. This means that the outstanding pointer you received via c_str() is no longer valid, and any use of it leads to undefined behavior.

To fix this, if you really must return an unsigned char*, you'll need to manually copy the C-style string into its own buffer:

/* Get a copy of the string that won't be automatically destroyed at the end of a statement. */
string value = ss.str();

/* Extract the C-style string. */
const char* cStr = value.c_str();

/* Allocate a buffer and copy the contents of cStr into it. */
unsigned char* result = new unsigned char[value.length() + 1];
copy(cStr, cStr + value.length() + 1, result);

/* Hand back the result. */
return result;

Additionally, as @Als has pointed out, the stripping-off of const is a Bad Idea if you're planning on modifying the contents. If you aren't modifying the contents, it should be fine, but then you ought to be returning a const unsigned char* instead of an unsigned char*.
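If the caller does not strictly need a raw pointer, one way to sidestep both the lifetime problem and the const_cast is to return an owning container. The following is only a sketch under assumptions: SerializeItems is a hypothetical name, the items are taken as already-serialized std::string values, and the 2-byte length prefix is written in host byte order, as in the question.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): writes each item as a
// 2-byte length prefix in host byte order, followed by the item's raw bytes.
// Items longer than 65535 bytes would have their length truncated by the cast.
std::vector<unsigned char> SerializeItems(const std::vector<std::string>& items)
{
    std::ostringstream ss;
    for (const std::string& item : items)
    {
        const std::uint16_t len = static_cast<std::uint16_t>(item.size());
        ss.write(reinterpret_cast<const char*>(&len), sizeof len);
        ss.write(item.data(), len);
    }

    // Bind the result of str() to a named object so it outlives this statement.
    const std::string bytes = ss.str();

    // Copy into an owning container; the caller never sees internal storage.
    return std::vector<unsigned char>(bytes.begin(), bytes.end());
}
```

The returned vector owns its bytes, so nothing dangles after the function returns, and callers that need a pointer can use data() for as long as they keep the vector alive.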

Hope this helps!

七禾 2024-12-06 07:07:14

Since it appears that your primary consumer of this function is a C# application, making the signature more C#-friendly is a good start. Here's what I'd do if I were really crunched for time and didn't have time to do things "The Right Way" ;-]

using System::Runtime::InteropServices::OutAttribute;

void SerializeToBuffer([Out] array<unsigned char>^% buffer)
{
    using System::Runtime::InteropServices::Marshal;

    vector<Serializable> const& items = GetSerializables();
    // or, if Serializable::Serialize() is non-const (which it shouldn't be)
    //vector<Serializable> items = GetSerializables();

    ostringstream ss(ios_base::binary);
    for (size_t i = 0u; i != items.size(); ++i)
    {
        string const& serializedItem = items[i].Serialize();
        unsigned short const len =
            static_cast<unsigned short>(serializedItem.size());

        ss.write(reinterpret_cast<char const*>(&len), sizeof(unsigned short));
        ss.write(serializedItem.data(), len);
    }

    string const& s = ss.str();
    buffer = gcnew array<unsigned char>(static_cast<int>(s.size()));
    Marshal::Copy(
        IntPtr(const_cast<char*>(s.data())),
        buffer,
        0,
        buffer->Length
    );
}

To C# code, this will have the signature:

void SerializeToBuffer(out byte[] buffer);

ㄖ落Θ余辉 2024-12-06 07:07:14

Here is the underlying problem:

buffer = ... ;
return items.size();

In the second-to-last line you're assigning a new value to the local variable that (up until that point) held the pointer your function was given as an argument. Then, immediately afterwards, you return from the function, and the variable you just assigned to is discarded. The caller never sees the new value, so the assignment has no effect.

What you probably want to do is copy the data from the memory pointed to by ss.str().c_str() into the memory pointed to by the pointer stored in buffer. Something like

memcpy(buffer, ss.str().c_str(), <an appropriate length here>)
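A minimal sketch of that copy, assuming the caller owns the buffer and also passes its capacity. The capacity parameter and the name CopyStreamToBuffer are illustrative additions, not part of the original function.

```cpp
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical helper: copies the stream's contents into a caller-owned buffer.
// Returns the number of bytes copied, or 0 if the buffer is null or too small
// (an empty stream also yields 0, so a real API may want a separate error code).
std::size_t CopyStreamToBuffer(const std::ostringstream& ss,
                               unsigned char* buffer,
                               std::size_t capacity)
{
    // Keep the string alive in a named object while we read from it.
    const std::string bytes = ss.str();

    if (buffer == nullptr || bytes.size() > capacity)
        return 0;

    // Use size(), not strlen(): binary data may contain embedded zero bytes.
    std::memcpy(buffer, bytes.data(), bytes.size());
    return bytes.size();
}
```

Because the bytes are copied into memory the caller already owns, no cast away from const is needed and the result stays valid after the stream is destroyed.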