Understanding UB and pointer arithmetic

Posted on 2025-01-16 21:28:02

I have a generic reference-counted heap-allocated wrapper class. So my class is basically just a pointer:

template <typename T>
class Refcounted {
    struct model { 
        std::atomic<std::size_t> count{1}; 
        T value;
    };
    model* m_self;
public:
    ...
    Refcounted(const Refcounted& other) : m_self(other.m_self) {
        assert(m_self);
        ++m_self->count;
    }
    ...
    T& operator*() { return m_self->value; }
};

Now, I'd like to be able to get a T& even in a context where T is only forward-declared. As I understand it, I can't: while T is incomplete, the compiler can't know the layout of model (in particular, it can't know the offset of value, because it doesn't know T's alignment).

I believe that if I swapped the order of model's members, the reinterpret_cast below would be well-defined behavior. Is that correct?

template <typename T>
class Refcounted {
    struct model { 
        T value;
        std::atomic<std::size_t> count{1}; 
    };
    model* m_self;
public:
    // Brace-initialize the aggregate: value takes x, count keeps its default of 1.
    Refcounted(T&& x) : m_self(new model{std::move(x)}) {
        static_assert(offsetof(model, value) == 0, "Using this assumption below to reinterpret_cast");
    }
    ...
    Refcounted(const Refcounted& other) : m_self(other.m_self) {
        assert(m_self);
        ++m_self->count;
    }
    ...
    T& operator*() { return *reinterpret_cast<T*>(m_self); }
};

Assuming that's correct, great... but now the copy constructor requires T to be defined, because it needs to find m_self->count. I had a thought for dealing with that, but I suspect it's UB. Suppose I set up the model struct so that std::atomic<std::size_t> count comes first and there's no padding between it and the T, and Refcounted keeps a void* pointer to the value field, as in m_valptr{&(new model(std::move(x)))->value}. Then I could reinterpret_cast<T*>(m_valptr) to get at the value (which I think (?) is still well-defined behavior). Is there any defined way to go from that pointer to a pointer to count? In principle, it's just decrementing the pointer by sizeof(std::atomic<std::size_t>), but I suspect it breaks rules that I don't fully understand about what can and cannot be done with pointers.
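
The "no padding" precondition in the paragraph above depends on T's alignment, so it can be checked per type at compile time. The following is only a sketch of that check. It assumes model is hoisted out of Refcounted so it can be inspected directly, and Wide, padding_before_value and value_follows_count are hypothetical names introduced for illustration. It says nothing about whether the pointer subtraction itself is defined.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for the question's nested `model`, hoisted to
// namespace scope so that its layout can be inspected.
template <typename T>
struct model {
    std::atomic<std::size_t> count{1};
    T value;
};

// Hypothetical over-aligned payload, used only to force padding.
struct alignas(64) Wide {
    char byte;
};

// Number of padding bytes between the end of `count` and the start of `value`.
// Note: offsetof is only conditionally supported when model<T> is not a
// standard-layout type, so this check is an assumption for such T.
template <typename T>
constexpr std::size_t padding_before_value() {
    return offsetof(model<T>, value) - sizeof(std::atomic<std::size_t>);
}

// True when `value` starts exactly where `count` ends.
template <typename T>
constexpr bool value_follows_count() {
    return padding_before_value<T>() == 0;
}
```

On mainstream implementations value_follows_count<char>() holds while value_follows_count<Wide>() does not, so a scheme that steps back by a fixed sizeof(std::atomic<std::size_t>) would need a static_assert like this wherever T is complete.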

I could add a second pointer to Refcounted or I could make model use a virtual interface, but that adds overhead. I feel like this should be possible but that there are spooky language rules getting in the way.
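
For comparison, the "second pointer" fallback mentioned above might look roughly like the sketch below. It is not the single-pointer scheme the question asks about. The non-template header base, use_count, Widget, deref and owners are hypothetical names introduced here, and copy assignment is left out for brevity. The cost is one extra pointer per handle.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>

// Non-template header: its layout never depends on T.
struct header {
    std::atomic<std::size_t> count{1};
};

template <typename T>
class Refcounted {
    // Only the constructor and destructor look inside `model`, so only
    // they need T to be a complete type.
    struct model : header {
        T value;
        explicit model(T&& x) : value(std::move(x)) {}
    };
    header* m_head;  // reaches the count without knowing T's layout
    T* m_value;      // the extra pointer: reaches the value directly
public:
    explicit Refcounted(T&& x) : m_head(nullptr), m_value(nullptr) {
        model* m = new model(std::move(x));
        m_head = m;  // implicit derived-to-base conversion
        m_value = &m->value;
    }
    Refcounted(const Refcounted& other)
        : m_head(other.m_head), m_value(other.m_value) {
        assert(m_head);
        ++m_head->count;
    }
    Refcounted& operator=(const Refcounted&) = delete;  // omitted in this sketch
    ~Refcounted() {
        // m_head points at the base subobject of a `model`, so the
        // downcast names the object that was actually allocated.
        if (--m_head->count == 0) delete static_cast<model*>(m_head);
    }
    T& operator*() { return *m_value; }
    std::size_t use_count() const { return m_head->count.load(); }
};

// T is only forward-declared at this point ...
struct Widget;
Widget& deref(Refcounted<Widget>& r) { return *r; }
std::size_t owners(const Refcounted<Widget>& r) { return r.use_count(); }

// ... and completed later.
struct Widget {
    int id;
};
```

In this sketch operator* and the count accessors compile while Widget is incomplete, but anything that destroys a Refcounted<Widget>, including a local copy going out of scope, still needs the complete type.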
