Union members may not have constructors, but `std::pair` is OK?

Posted on 2024-08-14 20:03:22

Union members may not have destructors or constructors. So I can't instantiate the following class template Foo with my own MyClass if MyClass has a constructor:

template<class T>
struct Foo {
  T val;
  Foo(T val_) : val(val_) {}
  size_t hash() const {
    union {T f; size_t s;} u = { val };
    return u.s;
  }
};
struct MyClass {
  bool a;
  double b;
  MyClass(bool a_, double b_) : a(a_), b(b_) {}
};

If I do it anyway I get this error:

member 'MyClass Foo<T>::hash() const 
[with T = MyClass]::<anonymous union>::f' with constructor 
not allowed in union

To get around it I created MyClass with an awkward construction function, which copies the thing around first:

struct MyClass {
  bool a;
  double b;
};
MyClass createMyClass(bool a, double b) {
  MyClass m;
  m.a = a;
  m.b = b;
  return m;
}

But I'm wondering if there is a better way than using this createMyClass function. A constructor would be more efficient, and as a critical component MyClass and Foo<MyClass> are constructed millions of times in my code.
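One option worth considering, as a minimal sketch: the constructor-free MyClass is an aggregate, so it can be brace-initialized directly without a helper function, and it remains usable as a union member. The names `makeExample` and `MyClassOrSize` are hypothetical, and the `static_assert` assumes a C++11 compiler (the brace syntax itself is also valid C++03).

```cpp
#include <cstddef>
#include <type_traits>

// Constructor-free version of MyClass from the question: an aggregate.
struct MyClass {
    bool a;
    double b;
};

// Aggregate initialization fills the members in declaration order,
// so no createMyClass-style helper is required.
inline MyClass makeExample() {  // hypothetical helper, for illustration only
    MyClass m = { true, 3.12 };
    return m;
}

// With no user-declared constructor, the implicit default constructor is trivial.
static_assert(std::is_trivially_default_constructible<MyClass>::value,
              "MyClass should stay trivially default-constructible");

// Such a type is accepted as a union member.
union MyClassOrSize {  // hypothetical name
    MyClass f;
    std::size_t s;
};
```

Whether this is actually faster than a constructor depends on the compiler; with optimization enabled both forms usually compile to the same stores.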

std::pair

I'm also a bit surprised that it is possible to use std::pair in the union:

Foo<std::pair<bool, double> > f2(std::make_pair(true, 3.12));

To my knowledge, std::pair (see code) has a constructor?

Comments (4)

缱倦旧时光 2024-08-21 20:03:22

EDIT: My original stance on std::pair was wrong: it shouldn't be allowed in a union. For a class to be a valid member of a union it must have a trivial constructor, according to §9.5/1 of the standard. A trivial constructor is defined as follows in §12.1/5:

If there is no user-declared constructor for class X, a default constructor is implicitly declared. An implicitly-declared default constructor is an inline public member of its class. A constructor is trivial if it is an implicitly-declared default constructor and if:

  • its class has no virtual functions and no virtual base classes, and
  • all the direct base classes of its class have trivial constructors, and
  • for all the nonstatic data members of its class that are of class type (or array thereof), each such class has a trivial constructor

Paragraph 20.2.2.2 states that the following constructor must be available in a pair:

pair(const T1& x, const T2& y);

As soon as this constructor is supplied, no default constructor is implicitly declared.
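The rule quoted above can be checked at compile time on a C++11 compiler. The following is only a sketch: the type names `WithCtor` and `Plain` and the three constants are made up for illustration, and the type traits postdate the C++03 wording discussed here.

```cpp
#include <type_traits>
#include <utility>

// Same shape as MyClass in the question: a user-declared constructor
// suppresses the implicit default constructor.
struct WithCtor {
    bool a;
    double b;
    WithCtor(bool a_, double b_) : a(a_), b(b_) {}
};

// No user-declared constructor: the implicit default constructor is trivial.
struct Plain {
    bool a;
    double b;
};

// Hypothetical constants exposing the trait results.
constexpr bool kPlainTrivial = std::is_trivially_default_constructible<Plain>::value;
constexpr bool kWithCtorTrivial = std::is_trivially_default_constructible<WithCtor>::value;
constexpr bool kPairTrivial =
    std::is_trivially_default_constructible<std::pair<bool, double>>::value;

static_assert(kPlainTrivial, "Plain has a trivial default constructor");
static_assert(!kWithCtorTrivial, "WithCtor has no trivial default constructor");
static_assert(!kPairTrivial, "std::pair declares its own constructors");
```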

The funny thing here is that my compiler (Visual Studio 2008) seems to give std::pair special treatment. If I copy the code from the std::pair implementation and place it in my own namespace foo the unions don't work :)

#include <utility>

namespace foo {
    // Cut-down copy of the std::pair implementation
    template<class _Ty1, class _Ty2> struct pair {
        typedef _Ty1 first_type;
        typedef _Ty2 second_type;
        pair() : first(_Ty1()), second(_Ty2()) {
        }
        _Ty1 first;
        _Ty2 second;
    };
}

//This doesn't work in VC2008
union Baz {
    foo::pair<bool, double> a;
    int b;
};
//This works in VC2008
union Buz {
    std::pair<bool, double> a;
    int b;
};

Your solution is a common way of getting around this problem. I usually prefix the class name with a C (short for construct) to partially mimic the ordinary constructor syntax; in your case this would become CMyClass(a, b).

As Steve and Matthieu have pointed out, you're not using a very good hash function though. Firstly, there's no real guarantee (I think, please correct me if I'm wrong) that f and s in the union will even partially occupy the same memory space. Secondly, even if in practice they will probably share the first min(sizeof(s), sizeof(f)) bytes, this means that for MyClass you're only hashing on part of the value. Here you will hash on the value of the bool a, which leaves two options:

  1. Your compiler uses int as the internal representation for the bool in which case your hash function will only return two values, one for true and one for false.
  2. Your compiler uses char as the internal representation for the bool. In this case the value will probably be padded to at least sizeof(int), either with zeroes in which case you have the same situation as 1. or with whatever random data is on the stack when MyClass is allocated which means you get random hash values for the same input.

If you need to hash by the entire value of T I would copy the data into a temporary buffer like Steve suggests and then use one of the variable-length hash functions discussed here.
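As a rough sketch of that idea: the linked discussion is not reproduced here, so FNV-1a is used below purely as a stand-in variable-length hash, and `fnv1a` and `hashMyClass` are hypothetical names. The members are copied into a buffer one by one so that the padding between `a` and `b` never reaches the hash.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// 64-bit FNV-1a over an arbitrary byte range (stand-in hash, an assumption).
inline std::uint64_t fnv1a(const void* data, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 1099511628211ULL;                  // 64-bit FNV prime
    }
    return h;
}

struct MyClass {
    bool a;
    double b;
    MyClass(bool a_, double b_) : a(a_), b(b_) {}
};

// Copy each member into a temporary buffer, then hash the whole buffer.
// Padding bytes of MyClass are skipped, so equal members give equal hashes.
inline std::uint64_t hashMyClass(const MyClass& m) {
    unsigned char buf[sizeof(bool) + sizeof(double)];
    std::memcpy(buf, &m.a, sizeof m.a);
    std::memcpy(buf + sizeof m.a, &m.b, sizeof m.b);
    return fnv1a(buf, sizeof buf);
}
```

Because this works on the members rather than on a union, it does not care whether MyClass has a constructor. One caveat: values that compare equal but have different bit patterns, such as +0.0 and -0.0, hash differently.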

冷情 2024-08-21 20:03:22

I would replace this:

size_t hash() const {
    union {T f; size_t s;} u = { val };
    return u.s;
}

With this:

size_t hash() const {
    size_t s = 0;
    memcpy(&s, &val, std::min(sizeof(size_t), sizeof(T)));
    return s;
}

This copies the smaller of the two sizes rather than the larger, and if memcpy is an intrinsic on your compiler then you're looking good for optimisation. Most importantly, though, it doesn't matter what constructors T has.
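For reference, here is the replacement dropped into the complete Foo template from the question, with the headers it needs. This is a sketch rather than a drop-in: it only demonstrates that Foo<MyClass> now compiles even though MyClass has a constructor.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

template <class T>
struct Foo {
    T val;
    Foo(T val_) : val(val_) {}
    std::size_t hash() const {
        std::size_t s = 0;
        // Copy at most sizeof(std::size_t) bytes of val's object representation.
        std::memcpy(&s, &val, std::min(sizeof(std::size_t), sizeof(T)));
        return s;
    }
};

// MyClass keeps its constructor; no union is involved any more.
struct MyClass {
    bool a;
    double b;
    MyClass(bool a_, double b_) : a(a_), b(b_) {}
};
```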

It's not a good hash function, though, if T is a large type. In your example MyClass, you might find that bool and size_t are the same size in your implementation, hence the double doesn't participate in the hash at all so there are only two possible hashed values.

Still, it could be worse. If T has any virtual functions, you'll probably find that all instances hash to the same value: the address of the vtable...

雪花飘飘的天空 2024-08-21 20:03:22

Regarding the use of an std::pair as a union member, I think it should be disallowed. The standard says (§12.1):

A union member shall not be of a class type (or array thereof) that has a non-trivial constructor.

So any class with a user-defined constructor cannot be used in a union, since the default constructor will no longer be implicitly declared. The specification of std::pair (§20.2.2) explicitly states that pair implementations must provide a parameterized constructor to initialize both values. Consequently, either the pair implementation or the union implementation you use does not comply with the standard.
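The wording quoted above is the C++03 rule. C++11 relaxed it ("unrestricted unions"): a member with a non-trivial constructor is permitted, but the union's own default constructor is then deleted unless one is written by hand. A minimal sketch, assuming a C++11 compiler and using the made-up name `PairOrSize`:

```cpp
#include <cstddef>
#include <utility>

union PairOrSize {  // hypothetical name
    std::pair<bool, double> p;
    std::size_t s;

    // std::pair's default constructor is non-trivial, so the implicit default
    // constructor of the union is deleted; constructors must be user-provided.
    PairOrSize() : s(0) {}
    PairOrSize(bool a, double b) : p(a, b) {}
};
```

Reading a member other than the one most recently written is still not a portable way to reinterpret bytes, so this does not make the union-based hash from the question any sounder.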

N.B. : Testing the code you gave on Comeau gives the following error:

"ComeauTest.c", line 8: error: invalid union member -- class
          "std::pair<bool, double>" has a disallowed member function
      union {T f; size_t s;} u = { val };
               ^
          detected during instantiation of "unsigned int Foo<T>::hash() const
                    [with T=std::pair<bool, double>]" at line 22

离鸿 2024-08-21 20:03:22

I have only one question: why use a union?

From what I understand, the hash should correspond to the first few bytes of your objects. If you are going to do this, why not:

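size_t hash() const {
  // Reads the first sizeof(size_t) bytes of val, so it assumes
  // sizeof(T) >= sizeof(size_t); strictly speaking it also breaks aliasing rules.
  return *reinterpret_cast<const size_t*>(&val);
}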

which should accomplish the same trick (I think) with more efficiency since there is no allocation of an object of size sizeof(T) on the stack.
