Pointers, references, handles to generic data types, as generic and flexible as possible

Posted on 2024-09-03 20:12:29

In my application I have lots of different data types, e.g. Car, Bicycle, Person, ... (they're actually other data types, but this is just for the example).

Since I also have quite a lot of 'generic' code in my application, and the application was originally written in C, pointers to Car, Bicycle, Person, ... are often passed as void-pointers to these generic modules, together with an identification of the type, like this:

Car myCar;
ShowNiceDialog ((void *)&myCar, DATATYPE_CAR);

The 'ShowNiceDialog' function now uses meta-information (functions that map DATATYPE_CAR to interfaces that extract the actual data from a Car) to get information about the car, based on the given data type. That way, the generic logic only has to be written once, and not again for every new data type.
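To make the meta-information idea concrete, here is a minimal sketch of a registry that maps a type id to a table of accessor functions. Everything in it (TypeInterface, registry, DialogTitle, the name member) is a hypothetical stand-in, not the application's real code:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical type ids and data types, standing in for the real ones.
enum DataType { DATATYPE_CAR, DATATYPE_BICYCLE };

struct Car     { std::string name; };
struct Bicycle { std::string name; };

// One table of accessor functions per data type (the "meta-information").
struct TypeInterface {
    std::string (*getName)(const void* object);
};

static std::string carName(const void* object) {
    return static_cast<const Car*>(object)->name;
}
static std::string bicycleName(const void* object) {
    return static_cast<const Bicycle*>(object)->name;
}

// Registry from type id to interface. Because it is a run-time map,
// entries for new data types could also be added while the program runs.
inline std::map<int, TypeInterface>& registry() {
    static std::map<int, TypeInterface> table = {
        {DATATYPE_CAR,     {&carName}},
        {DATATYPE_BICYCLE, {&bicycleName}},
    };
    return table;
}

// Generic code: written once, works for every registered type.
// Nothing checks that 'type' really matches what 'object' points to.
std::string DialogTitle(const void* object, int type) {
    auto it = registry().find(type);
    assert(it != registry().end() && "unregistered data type");
    return "Properties of " + it->second.getName(object);
}
```

Calling DialogTitle(&myCar, DATATYPE_CAR) looks up the Car accessors; passing the wrong id compiles just as happily, which is exactly the type-safety hole the question is about.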

Of course, in C++ you could make this much easier by using a common root class, like this

class RootClass
   {
   public:
      virtual ~RootClass() {}
      virtual std::string getName() const = 0;   // pure virtual: every data type supplies its own
   };

class Car : public RootClass
   {
   ...
   };

void ShowNiceDialog (RootClass *root);

The problem is that in some cases, we don't want to store the data type in a class, but in a totally different format to save memory.
In some cases we have hundreds of millions of instances that we need to manage in the application, and we don't want to make a full class for every instance.
Suppose we have a data type with 2 characteristics:

A quantity (double, 8 bytes)
A boolean (1 byte)

Although we only need 9 bytes to store this information, putting it in a class means that we need at least 16 bytes (because of the padding), and with the v-pointer we possibly even need 24 bytes.
For hundreds of millions of instances, every byte counts (I have a 64-bit variant of the application and in some cases it needs 6 GB of memory).
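The size arithmetic can be checked directly with sizeof. This is only a sketch: PlainRecord, PolymorphicRecord and wastedBytes are names made up for the illustration, and the exact sizes depend on the compiler and ABI (16 and 24 bytes are what a typical 64-bit build gives).

```cpp
#include <cassert>
#include <cstddef>

// The two characteristics as an ordinary struct: 9 bytes of payload,
// usually rounded up to 16 because of the alignment of double.
struct PlainRecord {
    double quantity;
    bool   flag;
};

// The same data with a virtual function: the compiler adds a hidden
// v-pointer, which typically brings the object to 24 bytes on 64-bit.
struct PolymorphicRecord {
    virtual ~PolymorphicRecord() {}
    double quantity;
    bool   flag;
};

const std::size_t payloadBytes = sizeof(double) + sizeof(bool);

// Bytes spent on padding and hidden members for 'count' objects.
inline std::size_t wastedBytes(std::size_t objectSize, std::size_t count) {
    assert(objectSize >= payloadBytes);
    return (objectSize - payloadBytes) * count;
}
```

With the typical sizes, wastedBytes(sizeof(PolymorphicRecord), 100000000) is about 1.5 GB of overhead for 0.9 GB of payload.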

The void-pointer approach has the advantage that we can encode almost anything in a void-pointer and decide how to interpret it when we want information from it (use it as a real pointer, as an index, ...), but at the cost of type-safety.
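As an illustration of what "encode almost anything" means, here is a hedged sketch in which the same void* slot holds a real pointer for one type and an array index for another. The names (DATATYPE_MEASUREMENT, g_quantities, handleFromIndex, numericValue) are invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum DataType { DATATYPE_CAR, DATATYPE_MEASUREMENT };   // hypothetical ids

struct Car { int wheels; };

// Measurements are not objects at all: they live in a plain array.
static std::vector<double> g_quantities;

// Store an array index in the pointer-sized slot instead of an address.
inline void* handleFromIndex(std::size_t index) {
    return reinterpret_cast<void*>(static_cast<std::uintptr_t>(index));
}
inline std::size_t indexFromHandle(const void* handle) {
    return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(handle));
}

// The type id alone decides how the handle is interpreted.
double numericValue(void* handle, int type) {
    switch (type) {
    case DATATYPE_CAR:
        return static_cast<Car*>(handle)->wheels;        // a real pointer
    case DATATYPE_MEASUREMENT: {
        std::size_t index = indexFromHandle(handle);     // an index
        assert(index < g_quantities.size());
        return g_quantities[index];
    }
    }
    assert(false && "unknown data type");
    return 0.0;
}
```

The flexibility is real, and so is the danger: passing a measurement handle with DATATYPE_CAR dereferences a small integer as if it were an address.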

Templated solutions don't help since the generic logic forms quite a big part of the application, and we don't want to templatize all this. Additionally, the data model can be extended at run time, which also means that templates won't help.

Are there better (and type-safer) ways to handle this than a void-pointer?
Any references to frameworks, whitepapers, research material regarding this?

时光无声 2024-09-10 20:12:29

If you don't want a full class, you should read up on the Flyweight pattern. It's designed to save memory.

EDIT: sorry, lunch-time pause ;)

The typical Flyweight approach is to separate the properties that are common to a great number of objects from the properties that are specific to a given instance.

Generally, it means:

struct Light
{
  kind_type mKind;
  specific1 m1;
  specific2 m2;
};

The kind_type is often a pointer, but it does not have to be. In your case a pointer would be a real waste: on a 64-bit build it takes 8 bytes, nearly as much as the 9 bytes of "useful" information.

Here I think we could exploit the padding to store the id. After all, as you said, the object is going to be expanded to 16 bytes even though we only use 9 of them, so let's not waste the other 7!

struct Object
{
  double quantity;
  bool flag;
  unsigned char const id;
};

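Note that the order of elements is important. Assuming a typical ABI where double is 8-byte aligned (offsets in bytes):

0               8       9       10              16
[   quantity   ][ flag ][  id  ][   padding    ]
quantity, flag, id: 16 bytes

0       1       2               8               16
[  id  ][ flag ][   padding    ][   quantity   ]
id, flag, quantity: 16 bytes

0       1               8               16      17              24
[  id  ][   padding    ][   quantity   ][ flag ][   padding    ]
id, quantity, flag: 24 bytes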
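The effect of member order can be verified with sizeof and offsetof. A small sketch, with struct names invented for the comparison; the sizes in the comments assume a common 64-bit ABI and are not guaranteed by the language:

```cpp
#include <cassert>
#include <cstddef>

// quantity, flag, id: the two small members share the tail (typically 16 bytes).
struct PayloadFirst {
    double        quantity;
    bool          flag;
    unsigned char id;
};

// id, flag, quantity: the two small members share the head (typically 16 bytes).
struct SmallFirst {
    unsigned char id;
    bool          flag;
    double        quantity;
};

// id, quantity, flag: each small member gets its own padded slot (typically 24 bytes).
struct Interleaved {
    unsigned char id;
    double        quantity;
    bool          flag;
};

// Bytes of a struct not occupied by its three members.
template <class T>
std::size_t paddingBytes() {
    const std::size_t used = sizeof(double) + sizeof(bool) + sizeof(unsigned char);
    assert(sizeof(T) >= used);
    return sizeof(T) - used;
}
```

On such a build paddingBytes<PayloadFirst>() is 6 and paddingBytes<Interleaved>() is 14, so the id comes for free only when it sits next to the flag.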

I don't understand the "extended at runtime" bit. Seems scary. Is this some sort of self-modifying code?

Templates allow you to create a very interesting form of Flyweight: Boost.Variant.

typedef boost::variant<Car,Dog,Cycle, ...> types_t;

The variant can hold any of the types cited here. It can be manipulated by "normal" functions:

void doSomething(types_t const& t);

It can be stored in containers:

typedef std::vector<types_t> vector_t;

And finally, the way to operate over it:

struct DoSomething: boost::static_visitor<>
{
  void operator()(Dog const& dog) const;

  void operator()(Car const& car) const;
  void operator()(Cycle const& cycle) const;
  void operator()(GenericVehicle const& vehicle) const;

  template <class T>
  void operator()(T const&) const {}   // catch-all; const like the other overloads
};

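It's very interesting to note the behavior here. Normal function overload resolution occurs, therefore:

If you have a Car or a Cycle, the dedicated overloads are used; without the template, every other child of GenericVehicle uses the 4th version.
It's possible to add a template version as a catch-all. Be aware that it is an exact match for any type without its own overload, so it is preferred over the GenericVehicle overload (which needs a derived-to-base conversion) unless you constrain it.

Note that the non-template overloads can perfectly well be defined in a .cpp file.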

In order to apply this visitor, you use the boost::apply_visitor function:

types_t t;
boost::apply_visitor(DoSomething(), t);

// or

boost::apply_visitor(DoSomething())(t);

The second way seems odd, but it means you can use it in a most interesting fashion, as a function object passed to an algorithm:

vector_t vec = /**/;
std::for_each(vec.begin(), vec.end(), boost::apply_visitor(DoSomething()));

Read up on variant, it's most interesting:

Compile-time check: if you miss one operator(), the compiler reports an error.
No need for RTTI: no virtual pointer, no dynamic type, so it is about as fast as using a union, but with increased safety.

You can of course segment your code by defining multiple variants. If some sections of the code only deal with 4 or 5 types, then use a specific variant for them :)
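For readers without Boost at hand, the same idea can be sketched with std::variant and std::visit from the C++17 standard library. This is a swapped-in technique, not Boost.Variant itself, and the empty Car, Dog and Cycle types, the Describe visitor and describeAll are placeholders:

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

struct Car {};
struct Dog {};
struct Cycle {};

typedef std::variant<Car, Dog, Cycle> types_t;

// Visitor: one overload per type that needs special handling,
// plus a template catch-all for everything else.
struct Describe {
    std::string operator()(Car const&) const { return "car"; }
    std::string operator()(Dog const&) const { return "dog"; }

    template <class T>
    std::string operator()(T const&) const { return "something else"; }
};

// Generic code over a container of variants, written once.
std::string describeAll(std::vector<types_t> const& items) {
    std::string out;
    for (types_t const& item : items) {
        if (!out.empty()) out += ", ";
        out += std::visit(Describe(), item);
    }
    return out;
}
```

Removing the catch-all turns a forgotten type (Cycle here) into a compile error, which is the compile-time check mentioned above. Note that a variant is as large as its largest alternative plus a discriminator, so it does not by itself shrink a 9-byte payload.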

浅听莫相离 2024-09-10 20:12:29

In this case, it sounds like you should simply use overloading. For example:

#ifdef __cplusplus // Only enable this awesome thing for C++:
#   define PROVIDE_OVERLOAD(CLASS,TYPE) \
    inline void ShowNiceDialog(const CLASS& obj){ \
         ShowNiceDialog(static_cast<void*>(const_cast<CLASS*>(&obj)),TYPE); \
    }

    PROVIDE_OVERLOAD(Car,DATATYPE_CAR)
    PROVIDE_OVERLOAD(Bicycle,DATATYPE_BICYCLE)
    // ...

#undef PROVIDE_OVERLOAD // undefine it so that we don't pollute with macros
#endif // end C++ only

If you create overloads for your various types, then you will be able to invoke ShowNiceDialog in a simple and type safe manner, but you will still be able to leverage your optimized C variant of it.

With the code above, you could, in C++, write something like the following:

 Car c;
 // ...
 ShowNiceDialog(c);

If you changed the type of c, then it would still use the appropriate overload (or give an error if there was no overload). It doesn't prevent one from using the existing type-unsafe C variant, but since the typesafe version is easier to invoke, I would expect that other developers would prefer it, anyway.

Edit
I should add that the above answers the question of how to make the API typesafe, not about how to make the implementation typesafe. This will help those using your system to avoid unsafe invocations. Also note that these wrappers provide a typesafe means for using types known already at compile-time... for dynamic types, it really would be necessary to use the unsafe versions. However, another possibility is that you could provide a wrapper class like the following:

class DynamicObject
{
    public:
         DynamicObject(void* data, int id) : _datatype_id(id), _datatype_data(data) {}
         // ...
         void showNiceDialog()const{ ShowNiceDialog(_datatype_data,_datatype_id); }
         // ...
    private:
         int _datatype_id;
         void* _datatype_data;
};

For those dynamic types, you would still not have much safety when it comes to constructing the object, but once the object is constructed, you would have a much safer mechanism. It would be reasonable to combine this with a typesafe factory so that users of your API never construct a DynamicObject themselves, and so never need to invoke the unsafe constructor.
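A minimal sketch of that combination, assuming the factory is a set of static member functions and the unsafe constructor is private. The from functions, the typeId accessor and the stub ShowNiceDialog (which only records the id it receives) are illustrative assumptions:

```cpp
#include <cassert>

enum DataType { DATATYPE_CAR, DATATYPE_BICYCLE };   // hypothetical ids
struct Car {};
struct Bicycle {};

// Stand-in for the existing C entry point: it only records the type id.
static int g_lastDialogType = -1;
void ShowNiceDialog(void* /*data*/, int type) { g_lastDialogType = type; }

class DynamicObject
{
    public:
         // Typesafe factories: the compiler picks the id that matches the type.
         static DynamicObject from(Car& car)      { return DynamicObject(&car, DATATYPE_CAR); }
         static DynamicObject from(Bicycle& bike) { return DynamicObject(&bike, DATATYPE_BICYCLE); }

         void showNiceDialog() const { ShowNiceDialog(_datatype_data, _datatype_id); }
         int typeId() const { return _datatype_id; }
    private:
         // The unsafe constructor is no longer reachable from user code.
         DynamicObject(void* data, int id) : _datatype_id(id), _datatype_data(data) {}
         int _datatype_id;
         void* _datatype_data;
};
```

Types that only become known at run time would still need a separate, checked registration path; this sketch covers only types known at compile time.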

大海や 2024-09-10 20:12:29

It's perfectly possible to change the packing of a class in, say, Visual Studio: #pragma pack(x) caps member alignment and so removes padding, __declspec(align(x)) raises alignment, and there's a struct member alignment option in the property pages.

I would suggest that the solution is to store your classes in, say, vectors of each data member individually, then each class will hold just a reference to the master class and an index into these vectors. If the master class were to be a singleton, then this could be improved further.

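// GetSingleton() is assumed to return the master object, which owns
// one vector per data member (for example carownerfirstnames).
class VehicleBase {
public:
    virtual std::string GetCarOwnerFirstName() = 0;
    virtual ~VehicleBase() {}
};
class Car : public VehicleBase {
    int index;   // position of this car's data in the master vectors
public:
    explicit Car(int i) : index(i) {}
    std::string GetCarOwnerFirstName() { return GetSingleton().carownerfirstnames[index]; }
};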

Of course, this leaves some implementation details open, such as the memory management of Car's data members. However, Car itself is trivial and can be created/destroyed at any time, and the vectors in GetSingleton will pack data members quite efficiently.
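Applied to the quantity/flag example from the question, the one-vector-per-member idea might look like the sketch below. MeasurementStore and Measurement are hypothetical names, and no singleton is used so that the example stays self-contained:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Owns the data: one vector per member, so no per-object padding.
class MeasurementStore {
public:
    typedef std::size_t Index;

    Index add(double quantity, bool flag) {
        quantities_.push_back(quantity);
        flags_.push_back(flag);
        return quantities_.size() - 1;
    }
    double quantity(Index i) const { assert(i < quantities_.size()); return quantities_[i]; }
    bool   flag(Index i) const     { assert(i < flags_.size());      return flags_[i]; }
    std::size_t size() const       { return quantities_.size(); }

private:
    std::vector<double> quantities_;   // 8 bytes per instance
    std::vector<bool>   flags_;        // space-optimized specialization, usually 1 bit per instance
};

// Short-lived typed view: created on demand, never stored by the millions.
class Measurement {
public:
    Measurement(const MeasurementStore& store, MeasurementStore::Index index)
        : store_(&store), index_(index) {}
    double quantity() const { return store_->quantity(index_); }
    bool   flag() const     { return store_->flag(index_); }
private:
    const MeasurementStore* store_;
    MeasurementStore::Index index_;
};
```

The bulk data costs a little over 8 bytes per instance, while generic code still sees a typed object with ordinary accessors.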

醉酒的小男人 2024-09-10 20:12:29

I would use traits:

template <class T>
struct DataTypeTraits
{
};

template <>
struct DataTypeTraits<Car>
{
   // put things that describe Car here
   // Example: Give the type a name
   static std::string getTypeName()
   {
      return "Car";
   }
};
template <>
struct DataTypeTraits<Bicycle>
{
   // the same for bicycles
   static std::string getTypeName()
   {
      return "Bicycle";
   }
};

template <class T>
void ShowNiceDialog(const T& t)
{
   // Extract details of given object
   std::string typeName(DataTypeTraits<T>::getTypeName());
   // more stuff
}

This way you don't need to change ShowNiceDialog() whenever you add a new type you want to apply it to. All you need is a specialization of DataTypeTraits for the new type.
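The traits can also carry the existing type id, so that a single template forwards to the void-pointer function with the right id. A sketch under assumptions: the id member, the two-argument ShowNiceDialog stand-in and its string return value are invented so the example can be checked:

```cpp
#include <cassert>
#include <string>

enum DataType { DATATYPE_CAR, DATATYPE_BICYCLE };   // hypothetical ids
struct Car {};
struct Bicycle {};

// Declared but not defined: using an unregistered type fails to compile.
template <class T>
struct DataTypeTraits;

template <>
struct DataTypeTraits<Car>
{
   static const int id = DATATYPE_CAR;
   static std::string getTypeName() { return "Car"; }
};
template <>
struct DataTypeTraits<Bicycle>
{
   static const int id = DATATYPE_BICYCLE;
   static std::string getTypeName() { return "Bicycle"; }
};

// Stand-in for the existing generic function; it returns a caption
// instead of opening a dialog.
std::string ShowNiceDialog(const void* data, int type)
{
   assert(data != 0);
   return type == DATATYPE_CAR ? "Car dialog" : "Bicycle dialog";
}

// Typesafe front end: the id is looked up from the static type.
template <class T>
std::string ShowNiceDialog(const T& object)
{
   return ShowNiceDialog(static_cast<const void*>(&object), DataTypeTraits<T>::id);
}
```

Only this thin front end is a template; the large body of generic logic behind the two-argument function stays untemplated.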
