Using memcpy() in a class with a union

Posted 2025-01-14 11:18:57

I have a class foo that manages data using small buffer optimization (SBO).
When size < 16, the data is held locally (in buffer); otherwise it is stored on the heap, with reserved holding the allocated space.

#include <cstring>   // memcpy

class foo {
    static const int sbo_size = 16;

    long size = 0;
    char *ptr;

    union {
        char buffer[sbo_size];
        long reserved;
    };
public:

    foo()
    {
        for (int i = 0; i < sbo_size; ++i)
            buffer[i] = 0;
    }

    void clone(const foo &f)
    {
        // release 'ptr' if necessary

        if (f.size < sbo_size)
        {
            memcpy(this, &f, sizeof(foo));
            ptr = buffer;
        } else
        {
            // handle non-sbo case
        }
    }
};

Question about clone():
In the SBO case, it may not be clear to the compiler that union::buffer is the member in use.
Is it correct to use memcpy and then set ptr accordingly?


Comments (1)

吝吻 2025-01-21 11:18:57

If you can use C++17, I would side-step any potential type-punning problems by using std::variant in place of a union.

Although this uses a small amount of storage internally to keep track of the current type it contains, it's probably a win overall as your ptr variable can disappear (although that should be inside your union anyway).

It's also type-safe, which a union is not: std::get throws std::bad_variant_access if the variant doesn't hold the requested type, and the variant keeps track of which type it holds simply by being assigned to.
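To make that concrete, here is a minimal sketch of how a variant tracks its active type. The names storage, get_throws and variant_demo are invented for illustration, and the local array only stands in for a real heap block.

```cpp
#include <array>
#include <cassert>
#include <variant>

using small_buf = std::array<char, 16>;
using storage   = std::variant<small_buf, char *>;

// Returns true if asking for alternative T throws std::bad_variant_access.
template <typename T>
bool get_throws(const storage &s)
{
    try {
        (void) std::get<T>(s);
        return false;
    } catch (const std::bad_variant_access &) {
        return true;
    }
}

// Walks through the behaviour described above.
inline void variant_demo()
{
    storage s;                                   // default: holds the first alternative
    assert(std::holds_alternative<small_buf>(s));
    assert(get_throws<char *>(s));               // wrong type requested: throws

    char stand_in[4] = {'a', 'b', 'c', '\0'};    // stand-in for a heap block
    char *p = stand_in;
    s = p;                                       // assignment switches the active type
    assert(std::holds_alternative<char *>(s));
    assert(get_throws<small_buf>(s));
    assert(std::get_if<small_buf>(&s) == nullptr);   // non-throwing query
}
```

std::get_if is the non-throwing way to ask the same question, which is why the clone function below uses it to decide whether there is a heap block to release.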

The resulting class fragment might look something like this (no doubt this code can be improved):

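#include <array>
#include <cstddef>
#include <cstring>
#include <variant>

class foo
{
private:
    static const std::size_t sbo_size = 16;
    using small_buf = std::array <char, sbo_size>;
    std::size_t size = 0;
    std::variant <small_buf, char *> buf = { };

public:
    void clone (const foo &f)
    {
        if (this == &f)
            return;     // self-clone: the delete below would free the data we are about to copy

        // Release our own heap block, if we currently hold one
        char **bufptr = std::get_if <char *> (&buf);
        if (bufptr)
            delete [] *bufptr;

        size = f.size;
        if (size < sbo_size)
            buf = std::get <small_buf> (f.buf);     // small case: copy the array by value
        else
        {
            buf = new char [size];                  // assignment makes char * the active type
            std::memcpy (std::get <char *> (buf), std::get <char *> (f.buf), size);
        }
    }
};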

Notes:

  • You will see that I've used std::array instead of a C-style array, because std::array has lots of nice features that C-style arrays do not.

  • Why clone and not a copy constructor?

  • If you want foo to have an empty state (after being default constructed, say), then you can look into the strangely named std::monostate.

  • For raw storage, std::byte is probably to be preferred over char.
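The last three notes can be combined into one sketch. This is not the worked example referred to in this answer: foo_v, heap_buf and foo_v_demo are hypothetical names, and using std::unique_ptr for the heap block is an extra assumption made here so that no manual delete is needed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <variant>

class foo_v
{
    static constexpr std::size_t sbo_size = 16;
    using small_buf = std::array<std::byte, sbo_size>;
    using heap_buf  = std::unique_ptr<std::byte[]>;

    std::size_t size_ = 0;
    // std::monostate comes first, so a default-constructed foo_v is empty.
    std::variant<std::monostate, small_buf, heap_buf> buf_;

public:
    foo_v() = default;

    foo_v(const std::byte *src, std::size_t n) : size_(n)
    {
        if (n == 0)
            return;                                   // stay in the empty state
        if (n < sbo_size)
            std::copy_n(src, n, buf_.emplace<small_buf>().begin());
        else
            std::copy_n(src, n,
                        buf_.emplace<heap_buf>(std::make_unique<std::byte[]>(n)).get());
    }

    // A copy constructor instead of clone(): always a deep copy.
    foo_v(const foo_v &other) : foo_v(other.data(), other.size_) {}

    foo_v &operator=(const foo_v &other)
    {
        foo_v tmp(other);                 // copy first, so self-assignment is harmless
        size_ = tmp.size_;
        buf_  = std::move(tmp.buf_);
        return *this;
    }

    std::size_t size() const { return size_; }
    bool empty() const { return std::holds_alternative<std::monostate>(buf_); }

    const std::byte *data() const
    {
        if (auto *small = std::get_if<small_buf>(&buf_))
            return small->data();
        if (auto *heap = std::get_if<heap_buf>(&buf_))
            return heap->get();
        return nullptr;                   // empty state
    }
};

// Usage sketch: empty, small and heap-backed objects all copy the same way.
inline void foo_v_demo()
{
    foo_v empty;
    assert(empty.empty() && empty.data() == nullptr);

    std::array<std::byte, 32> src{};
    src[0] = std::byte{42};

    foo_v small(src.data(), 8), big(src.data(), src.size());
    foo_v copy = big;                     // copy constructor
    assert(copy.size() == 32 && copy.data() != big.data());
    assert(copy.data()[0] == std::byte{42});

    copy = small;                         // old heap block is freed by unique_ptr
    assert(copy.size() == 8 && !copy.empty());
}
```

Because the variant now holds a move-only type, the compiler no longer generates copy operations for the class, which is why both are written out by hand.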

Fully worked example here.


Edit: To answer the question as posed, I am no language lawyer but it seems to me that, inside clone, the compiler has no clue what the active member of f might be as it has, in effect, been parachuted in from outer space.

In such circumstances, I would expect compiler writers to play it safe and set the active member of the union to "don't know" until some concrete information comes along. But (and it's a big but), I wouldn't like to bet my shirt on that. It's a complex job and compiler writers do make mistakes.

So, in a spirit of sharing, here's a slightly modified version of your original code which fixes that. I've also moved ptr inside your union since it clearly belongs there:

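#include <array>

class foo {
    static const int sbo_size = 16;

    long size = 0;

    struct heap_t {                           // heap case needs both fields at once,
        char *ptr;                            // so they must not overlap each other
        long reserved;
    };

    union {
        std::array <char, sbo_size> buffer;   // changing this
        heap_t heap;
    };
public:

    foo() : buffer{}                          // zero-fills buffer and makes it the active member
    {
    }

    void clone(const foo &f)
    {
        // release 'heap.ptr' if necessary

        if (f.size < sbo_size)
        {
            buffer = f.buffer;                // lets me do this
            size = f.size;
            // No 'ptr = buffer.data ()' here: ptr now shares storage with
            // buffer, so writing it would overwrite the bytes just copied.
            // Read small data through buffer.data () instead.
        } else
        {
            // handle non-sbo case
        }
    }
};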

So you can see, by using std::array for buffer (rather than one of those hacky C-style arrays), you can assign to it directly instead of resorting to memcpy. Because std::array's copy assignment is trivial, that assignment makes buffer the active member of your union, and you should be safe.
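For completeness, here is one way the whole idea could be fleshed out, including the heap case that the snippet above leaves as a comment. It is a sketch under assumptions rather than the definitive version: sbo_foo, assign, release and sbo_foo_demo are made-up names, reserved is left out, and copying is disabled so that clone is the only way to duplicate an object.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

class sbo_foo
{
    static constexpr std::size_t sbo_size = 16;
    using small_buf = std::array<char, sbo_size>;

    std::size_t size_ = 0;
    union {
        small_buf buffer;   // active member while size_ <  sbo_size
        char *ptr;          // active member while size_ >= sbo_size
    };

    bool is_small() const { return size_ < sbo_size; }
    void release() { if (!is_small()) delete[] ptr; }

public:
    sbo_foo() : buffer{} {}                      // buffer starts out active and zeroed
    sbo_foo(const char *src, std::size_t n) : buffer{} { assign(src, n); }
    sbo_foo(const sbo_foo &) = delete;           // copying goes through clone()
    sbo_foo &operator=(const sbo_foo &) = delete;
    ~sbo_foo() { release(); }

    void assign(const char *src, std::size_t n)
    {
        if (n < sbo_size) {
            small_buf tmp{};
            std::copy_n(src, n, tmp.begin());    // copy before releasing anything
            release();
            buffer = tmp;                        // trivial assignment: buffer becomes active
        } else {
            char *fresh = new char[n];
            std::memcpy(fresh, src, n);
            release();
            ptr = fresh;                         // built-in assignment: ptr becomes active
        }
        size_ = n;
    }

    void clone(const sbo_foo &f)
    {
        if (this != &f)
            assign(f.data(), f.size_);
    }

    std::size_t size() const { return size_; }
    const char *data() const { return is_small() ? buffer.data() : ptr; }
};

// Usage sketch: clone a small object, then a heap-backed one, then a small one again.
inline void sbo_foo_demo()
{
    const char big[] = "0123456789abcdefghij";   // 21 bytes with the terminator
    sbo_foo small("hello", 6), large(big, sizeof big), target;

    target.clone(small);
    assert(std::strcmp(target.data(), "hello") == 0);
    target.clone(large);
    assert(std::strcmp(target.data(), big) == 0 && target.data() != large.data());
    target.clone(small);                         // heap block is released here
    assert(target.size() == 6);
}
```

The design choice worth noting is that size_ alone decides which union member is active, so every path that changes one also changes the other.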

In conclusion, the question is actually rather meaningless since one shouldn't (ever) need to write code like that. But no doubt someone will immediately come up with something that proves me wrong.
