Variant conversion system

Posted on 2024-11-05 04:21:12

I have written a variant class, which will be used as the main type in a dynamic language, that will ultimately allow 256 different types of value (header is an unsigned byte, only 20 are actually used). I now want to implement casting/converting between types.

My initial thought was a lookup table, but the sheer amount of memory it would need makes it impractical to implement.

What are the alternatives? Right now I am considering a further three methods from research and suggestions from other people:

  1. Group the types into larger subsets, such as numeric or collection or other.
  2. Make a conversion interface that has CanCast(from, to) and Cast(Variant) methods, and allow classes that implement that interface to be added to a list that can then be checked to see whether any of the conversion classes can do the cast.
  3. Similar to (1), but make several master types, so that casting is a two-step process: from the original type to the master type, and then to the final type.
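
To make option 2 concrete, here is a minimal sketch of a converter list. Everything in it is an assumption for illustration: the `TypeId` enum, the `Variant` stand-in built on `std::variant`, the extra `to` parameter on `Cast`, and the `ConverterRegistry` class are not part of the actual variant class described above.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for the variant: a one-byte tag plus a payload.
enum class TypeId : std::uint8_t { Int, Double, String };

struct Variant {
    TypeId type;
    std::variant<long, double, std::string> value;
};

// Option 2: each converter reports which casts it handles and performs them.
struct Converter {
    virtual ~Converter() = default;
    virtual bool CanCast(TypeId from, TypeId to) const = 0;
    virtual Variant Cast(const Variant& v, TypeId to) const = 0;
};

struct IntToDouble : Converter {
    bool CanCast(TypeId from, TypeId to) const override {
        return from == TypeId::Int && to == TypeId::Double;
    }
    Variant Cast(const Variant& v, TypeId) const override {
        return {TypeId::Double, static_cast<double>(std::get<long>(v.value))};
    }
};

class ConverterRegistry {
public:
    void Add(std::unique_ptr<Converter> c) { converters_.push_back(std::move(c)); }

    // Linear scan over the registered converters; returns nullopt when
    // no converter accepts the (from, to) pair.
    std::optional<Variant> TryCast(const Variant& v, TypeId to) const {
        if (v.type == to) return v;
        for (const auto& c : converters_) {
            if (c->CanCast(v.type, to)) return c->Cast(v, to);
        }
        return std::nullopt;
    }

private:
    std::vector<std::unique_ptr<Converter>> converters_;
};
```

The scan costs one virtual call per registered converter in the worst case, so this shape trades lookup speed for the ability to add conversions at run time.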

What would be the best system?

Edit: I have added the bounty as I am still unsure of the best system. The current answer is very good and definitely got my +1, but there must be people out there who have done this and can say what the best method is.

Comments (3)

相思碎 2024-11-12 04:21:12

My system is very "heavy" (lots of code), but very fast and very feature-rich (cross-platform C++). I'm not sure how far you would want to go with your design, but here are the biggest parts of what I did:

DatumState - Class holding an "enum" for "type", plus native value, which is a "union" among all the primitive types, including void*. This class is uncoupled from all types, and can be used for any native/primitive types, and "reference to" void* type. Since the "enum" also has "VALUE_OF" and "REF_TO" context, this class can present as "wholly containing" a float (or some primitive type), or "referencing-but-not-owning" a float (or some primitive type). (I actually have "VALUE_OF", "REF_TO", and "PTR_TO" contexts so I can logically store a value, a reference-that-cannot-be-null, or a pointer-that-may-be-null-or-not, and which I know I need to delete-or-not.)
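
As a rough illustration of that layout, the sketch below pairs a type tag and a context tag with a union. The names `Kind`, `Context`, `Payload`, and `DatumStateSketch` are hypothetical; the answer does not show its real declarations, and only the float case is worked out here.

```cpp
#include <cstdint>

// Hypothetical names throughout; this is not the answer's actual code.
enum class Kind : std::uint8_t { Null, Int, Float, Pointer };
enum class Context : std::uint8_t { ValueOf, RefTo, PtrTo };

union Payload {
    long i;
    float f;
    void* p;  // used when the state refers to storage it does not own
};

struct DatumStateSketch {
    Kind kind = Kind::Null;
    Context context = Context::ValueOf;
    Payload u{};

    // Holds the float itself ("wholly containing").
    static DatumStateSketch valueOf(float x) {
        DatumStateSketch s;
        s.kind = Kind::Float;
        s.context = Context::ValueOf;
        s.u.f = x;
        return s;
    }

    // Refers to a float owned elsewhere ("referencing but not owning").
    static DatumStateSketch refTo(float& x) {
        DatumStateSketch s;
        s.kind = Kind::Float;
        s.context = Context::RefTo;
        s.u.p = &x;
        return s;
    }

    // Reads the float whether it is held by value or referenced.
    float asFloat() const {
        return context == Context::ValueOf ? u.f
                                           : *static_cast<const float*>(u.p);
    }

    // Writes through to the referenced float when in RefTo context.
    void setFloat(float x) {
        if (context == Context::ValueOf) u.f = x;
        else *static_cast<float*>(u.p) = x;
    }
};
```

Because the context lives next to the type tag, the same accessor can serve both the owned and the referenced case without a separate wrapper class.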

Datum - Class wholly containing a DatumState, but which expands its interface to accommodate various "well-known" types (like MyDate, MyColor, MyFileName, etc.). These well-known types are actually stored in the void* inside the DatumState member. However, because the "enum" portion of the DatumState has the "VALUE_OF" and "REF_TO" context, it can represent a "pointer-to-MyDate" or "value-of-MyDate".

DatumStateHandle - A helper template class parameterized with a (well-known) type (like MyDate, MyColor, MyFileName, etc.). This is the accessor used by Datum to extract state from the well-known type. The default implementation works for most classes; any class with specific access semantics merely specializes one or more member functions of this template class for its own type.
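
One way such a helper could look is sketched below: a primary template supplies the default behavior, and a single member function is explicitly specialized for one type. `HandleSketch`, its `get` and `describe` members, and the field layouts of `MyDate` and `MyColor` are assumptions made up for this example.

```cpp
#include <string>

// Hypothetical well-known types standing in for MyDate and MyColor.
struct MyDate  { int year, month, day; };
struct MyColor { unsigned char r, g, b; };

// Primary template: default behavior shared by most types.
template <typename T>
struct HandleSketch {
    // Recovers the typed object from the void* held by the variant.
    static T& get(void* p) { return *static_cast<T*>(p); }
    static std::string describe(void*) { return "<object>"; }
};

// Explicit specialization of one member function for MyDate only;
// every other member still comes from the primary template.
template <>
inline std::string HandleSketch<MyDate>::describe(void* p) {
    const MyDate& d = get(p);
    return std::to_string(d.year) + "-" + std::to_string(d.month) + "-" +
           std::to_string(d.day);
}
```

Adding a new type then costs nothing unless it needs custom behavior, in which case only the affected member function is written.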

Macros, helper functions, and some other supporting stuff - To simplify "adding" of well-known types to my Datum/Variant, I found it convenient to centralize logic into a few macros, provide some support functions like operator overloading, and establish some other conventions in my code.

As a "side-effect" of this implementation, I got tons of benefits, including reference and value semantics, options for "null" on all types, and support for heterogeneous containers for all types.

For example, you can create a set of integers and index them:

int my_ints[10];
Datum d(my_ints, 10/*count*/);
for(long i = 0; i < d.count(); ++i)
{
  d[i] = i;
}

Similarly, some data types are indexed by strings, or by enums:

MyDate my_date = MyDate::GetDateToday();
Datum d(my_date);
cout << d["DAY_OF_WEEK"] << endl;
cout << d[MyDate::DAY_OF_WEEK] << endl; // alternative

I can store sets-of-items (natively), or sets-of-Datums (wrapping each item). For either case, I can "unwrap" recursively:

MyDate my_dates[10];
Datum d(my_dates, 10/*count*/);
for(long i = 0; i < d.count(); ++i)
{
  cout << d[i][MyDate::DAY_OF_WEEK] << endl;
}

One might argue my "REF_TO" and "VALUE_OF" semantics are overkill, but they were essential for the "set-unwrapping".

I've done this "Variant" thing with nine different designs. My current one is the "heaviest" (most code), but it is the one I like best (almost the fastest, with a pretty small object footprint), and I've deprecated the other eight designs for my own use.

The "downsides" to my design are:

  1. Objects are accessed through static_cast<>() from a void* (type-safe and fairly fast, but indirection is required; a side effect is that the design supports storage of "null").
  2. Compiles are longer because of the well-known types that are exposed through the Datum interface (but you can use DatumState if you do not want well-known type APIs).

No matter your design, I'd recommend the following:

  1. Use an "enum" or something to tell you the "type", separate from the "value". (I know you can compress them into one "int" or something with bit packing, but that is slow-for-access and very tricky to maintain as new types are introduced.)

  2. Lean on templates or something to centralize operations, with a mechanism for type-specific (override) processing (assuming you want to handle non-trivial types).

The name of the game is "simplified maintenance when adding new types" (or at least, it was for me). As with a good term paper, it pays to rewrite, rewrite, rewrite: hold or increase your functionality while steadily removing the code required to maintain the system (e.g., minimize the effort required to adapt new types to your existing Variant infrastructure).

Good luck!

青巷忧颜 2024-11-12 04:21:12

Done something similar.

You could add another byte to the "header", indicating the type it's really storing.

Example in a C-style programming language:

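#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char byte;

enum VariantInternalType {
  vtUnassigned = 0,
  vtByte = 1,
  vtCharPtr = 2, // <-- "plain c" string
  vtBool = 3
  // other supported data types
};

// --> real data
struct VariantHeader {
  void* Reserved; // <-- your data (byte or void*)
  // "enum" keyword is required because the member shares the type's name
  enum VariantInternalType VariantInternalType;
};

// --> hides real data (wrapped in a struct so it can be returned by value)
struct Variant {
  byte Data[sizeof(VariantHeader)];
};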

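// assigns a byte value to a new variant (stored inline, nothing is allocated)
Variant ByteToVar(byte value)
{
  VariantHeader MyVariantHeader;
  Variant MyVariant;

  MyVariantHeader.VariantInternalType = vtByte;
  // the byte is kept in the pointer-sized slot itself
  MyVariantHeader.Reserved = reinterpret_cast<void*>(static_cast<uintptr_t>(value));

  // copy exposed struct type data to hidden array data
  memcpy(&MyVariant, &MyVariantHeader, sizeof(Variant));

  return MyVariant;
}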

// allocates a copy of a C string and assigns it to a new variant
Variant CharPtrToVar(const char* value)
{
  VariantHeader MyVariantHeader;
  Variant MyVariant;

  MyVariantHeader.VariantInternalType = vtCharPtr;
  // duplicate the string so the variant owns its own copy
  char* copy = static_cast<char*>(malloc(strlen(value) + 1));
  if (copy != NULL) strcpy(copy, value);
  MyVariantHeader.Reserved = copy;

  // copy exposed struct type data to hidden array data
  memcpy(&MyVariant, &MyVariantHeader, sizeof(Variant));

  return MyVariant;
}

// deallocs memory for any internal data type
void freeVar(Variant &myVariant)
{
  VariantHeader MyVariantHeader;

  // copy hidden array data to exposed struct type data
  memcpy(&MyVariantHeader, &myVariant, sizeof(VariantHeader));

  switch (MyVariantHeader.VariantInternalType) {
    case vtCharPtr:
      free(MyVariantHeader.Reserved);
    break;

    // other types

    default:
    break;
  }

  // mark the variant as empty so it cannot be freed twice
  MyVariantHeader.Reserved = NULL;
  MyVariantHeader.VariantInternalType = vtUnassigned;

  // copy exposed struct type data to hidden array data
  memcpy(&myVariant, &MyVariantHeader, sizeof(Variant));
}

bool isVariantType(const Variant &thisVariant, VariantInternalType thisType)
{
  VariantHeader MyVariantHeader;

  // copy hidden array data to exposed struct type data
  memcpy(&MyVariantHeader, &thisVariant, sizeof(VariantHeader));

  return (MyVariantHeader.VariantInternalType == thisType);
}

// -------

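// returns the stored C string; the variant keeps ownership of it
char* VarToCharPtr(const Variant &thisVariant)
{
  VariantHeader MyVariantHeader;
  memcpy(&MyVariantHeader, &thisVariant, sizeof(VariantHeader));
  return static_cast<char*>(MyVariantHeader.Reserved);
}

int main()
{
  Variant myVariantStr = CharPtrToVar("Hello World");
  Variant myVariantByte = ByteToVar(42);

  char* myString = NULL;
  byte  myByte = 0;

  if (isVariantType(myVariantStr, vtCharPtr)) {
    myString = VarToCharPtr(myVariantStr);
    // print variant string into screen
  }

  // ...

  freeVar(myVariantStr);
  return 0;
}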

This is only a suggestion, and it's not tested.

一梦等七年七年为一梦 2024-11-12 04:21:12

Maybe you've already done the calculation, but the amount of memory you need for the lookup table isn't that much.

If you just need to check if types are compatible, then you need (256*256)/2 bits. This requires 4k of memory.
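
A compatibility table of this kind might be sketched as below. The `CastMatrix` class and its method names are hypothetical. Note one assumption that differs from the figure above: this version stores the full directional matrix (256 * 256 bits, i.e. 8 KiB of bit storage), since a cast from A to B need not imply a cast from B to A; the halved 4k figure applies when compatibility is symmetric and only one triangle is kept.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTypeCount = 256;  // one-byte type tag

// One bit per (from, to) pair; a set bit means the cast is allowed.
class CastMatrix {
public:
    void allow(std::uint8_t from, std::uint8_t to) {
        bits_.set(index(from, to));
    }

    // A type is always castable to itself.
    bool canCast(std::uint8_t from, std::uint8_t to) const {
        return from == to || bits_.test(index(from, to));
    }

private:
    static std::size_t index(std::uint8_t from, std::uint8_t to) {
        return static_cast<std::size_t>(from) * kTypeCount + to;
    }

    std::bitset<kTypeCount * kTypeCount> bits_;
};
```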

If you also need a pointer to a conversion function, then you need (256*256)/2 pointers. This requires 128k of memory on a 32-bit machine and 256k on a 64-bit machine. If you're willing to do some low-level address layout, you can probably get that down to 64k on both 32-bit and 64-bit machines.
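
The function-pointer variant of the table can be sketched the same way. The `TypeTag`, `Value`, and `cast` names below are assumptions for illustration, with three types standing in for the real set. Sizing the table by the types actually in use rather than by all 256 tags shrinks it considerably: with the 20 types mentioned in the question, a 20 * 20 table holds 400 pointers, roughly 3 KiB on a 64-bit machine.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical minimal variant: a tag plus a small payload.
enum TypeTag : std::uint8_t { kInt = 0, kDouble = 1, kBool = 2, kUsedTypes = 3 };

struct Value {
    TypeTag tag;
    union { long i; double d; bool b; };
};

using CastFn = bool (*)(const Value& in, Value& out);

bool intToDouble(const Value& in, Value& out) {
    out.tag = kDouble;
    out.d = static_cast<double>(in.i);
    return true;
}

bool intToBool(const Value& in, Value& out) {
    out.tag = kBool;
    out.b = (in.i != 0);
    return true;
}

// Rows are "from", columns are "to"; a null entry means "no conversion".
const CastFn kCastTable[kUsedTypes][kUsedTypes] = {
    /* from kInt    */ { nullptr, intToDouble, intToBool },
    /* from kDouble */ { nullptr, nullptr,     nullptr   },
    /* from kBool   */ { nullptr, nullptr,     nullptr   },
};

// Looks up the conversion in constant time; returns false when none exists.
bool cast(const Value& in, TypeTag to, Value& out) {
    if (in.tag >= kUsedTypes || to >= kUsedTypes) return false;
    if (in.tag == to) { out = in; return true; }
    CastFn fn = kCastTable[in.tag][to];
    return fn != nullptr && fn(in, out);
}
```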
