Variant conversion system
I have written a variant class, which will be used as the main type in a dynamic language, that will ultimately allow 256 different types of value (header is an unsigned byte, only 20 are actually used). I now want to implement casting/converting between types.
My initial thought was a lookup table, but the sheer amount of memory it would need makes it impractical to implement.
What are the alternatives? Right now I am considering a further three methods from research and suggestions from other people:
- Group the types into larger subsets, such as numeric or collection or other.
- Make a conversion interface that has CanCast(from, to) and Cast(Variant) methods and allow classes that implement that interface to be added to a list, that can then be checked to see if any of the conversion classes can do the cast.
- Similar to (1) but make several master types, and casting is a two step process from the original type to the master type and then again to the final type.
What would be the best system?
Edit: I have added the bounty as I am still unsure on the best system, the current answer is very good, and definitely got my +1 but there must be people out there who have done this and can say what the best method is.
Comments (3)
My system is very "heavy" (lots of code), but very fast, and very feature rich (cross-platform C++). I'm not sure how far you would want to go with your design, but here's the biggest parts of what I did:
- DatumState - Class holding an "enum" for "type", plus a native value, which is a "union" among all the primitive types, including void*. This class is uncoupled from all types, and can be used for any native/primitive types, and "reference to" void* type. Since the "enum" also has "VALUE_OF" and "REF_TO" context, this class can present as "wholly containing" a float (or some primitive type), or "referencing-but-not-owning" a float (or some primitive type). (I actually have "VALUE_OF", "REF_TO", and "PTR_TO" contexts so I can logically store a value, a reference-that-cannot-be-null, or a pointer-that-may-be-null-or-not, and which I know I need to delete-or-not.)
- Datum - Class wholly containing a DatumState, but which expands its interface to accommodate various "well-known" types (like MyDate, MyColor, MyFileName, etc.) These well-known types are actually stored in the void* inside the DatumState member. However, because the "enum" portion of the DatumState has the "VALUE_OF" and "REF_TO" context, it can represent a "pointer-to-MyDate" or "value-of-MyDate".
- DatumStateHandle - A helper template class parameterized with a (well-known) type (like MyDate, MyColor, MyFileName, etc.) This is the accessor used by Datum to extract state from the well-known type. The default implementation works for most classes, but any class with specific semantics for access merely overrides its specific template parameterization/implementation for one or more member functions in this template class.
- Macros, helper functions, and some other supporting stuff - To simplify "adding" of well-known types to my Datum/Variant, I found it convenient to centralize logic into a few macros, provide some support functions like operator overloading, and establish some other conventions in my code.
As a "side-effect" of this implementation, I got tons of benefits, including reference and value semantics, options for "null" on all types, and support for heterogeneous containers for all types.
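To make the DatumState idea a bit more concrete, here is a minimal sketch of a type "enum" plus a context "enum" sitting next to a "union" of primitives. Every name in it (DatumStateSketch, BaseType, Context, valueOf, refTo, asFloat) is made up for illustration; the real class is not shown in this answer and certainly covers far more types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not the answer's actual code.
enum class BaseType : std::uint8_t { Null, Int, Float, Opaque };
enum class Context  : std::uint8_t { ValueOf, RefTo, PtrTo };

class DatumStateSketch {
public:
    DatumStateSketch() = default;  // a default-constructed state is "null"

    // "Wholly containing" a float: the value lives inside the union.
    static DatumStateSketch valueOf(float f) {
        DatumStateSketch s;
        s.type_ = BaseType::Float;
        s.context_ = Context::ValueOf;
        s.value_.f = f;
        return s;
    }

    // "Referencing-but-not-owning" a float: only its address is stored,
    // so the caller must keep the referenced float alive.
    static DatumStateSketch refTo(float& f) {
        DatumStateSketch s;
        s.type_ = BaseType::Float;
        s.context_ = Context::RefTo;
        s.value_.p = &f;
        return s;
    }

    BaseType type() const { return type_; }
    Context context() const { return context_; }
    bool isNull() const { return type_ == BaseType::Null; }

    // Read the float whether it is stored directly or reached through void*.
    float asFloat() const {
        assert(type_ == BaseType::Float);
        if (context_ == Context::ValueOf) return value_.f;
        return *static_cast<const float*>(value_.p);
    }

private:
    union Value {
        std::int64_t i;
        float f;
        void* p;
    };
    BaseType type_ = BaseType::Null;
    Context context_ = Context::ValueOf;
    Value value_ = {};
};
```

With this sketch, a state built with refTo(x) sees later changes to x, while one built with valueOf keeps its own copy. That is the difference between the "REF_TO" and "VALUE_OF" contexts described above.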
For example, you can create a set of integers and index them:
Similarly, some data types are indexed by strings, or by enums:
I can store sets-of-items (natively), or sets-of-Datums (wrapping each item). For either case, I can "unwrap" recursively:
One might argue my "REF_TO" and "VALUE_OF" semantics are overkill, but they were essential for the "set-unwrapping".
I've done this "Variant" thing with nine different designs, and my current is the "heaviest" (most code), but the one I like the best (almost the fastest with a pretty small object footprint), and I've deprecated the other eight designs for my use.
The "downsides" to my design are:
- static_cast<>() from a void* (type-safe and fairly fast, but indirection is required; but, side-effect is that design supports storage of "null".)
- well-known types that are exposed through the Datum interface (but you can use DatumState if you do not want well-known type APIs).
No matter your design, I'd recommend the following:
- Use an "enum" or something to tell you the "type", separate from the "value". (I know you can compress them into one "int" or something with bit packing, but that is slow-for-access and very tricky to maintain as new types are introduced.)
- Lean on templates or something to centralize operations, with a mechanism for type-specific (override) processing (assuming you want to handle non-trivial types).
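The second recommendation can be sketched with a primary template that supplies the default behaviour and an explicit specialization that overrides it for one type. The names TextHandler and toText are hypothetical and only illustrate the pattern; they are not part of my actual code.

```cpp
#include <sstream>
#include <string>

// Default handling: works for any type that supports operator<<.
template <typename T>
struct TextHandler {
    static std::string toText(const T& value) {
        std::ostringstream out;
        out << value;
        return out.str();
    }
};

// Type-specific override: bool is rendered as a word instead of 1/0.
template <>
struct TextHandler<bool> {
    static std::string toText(const bool& value) {
        return value ? "true" : "false";
    }
};

// Single central entry point; callers never name the handler directly.
template <typename T>
std::string toText(const T& value) {
    return TextHandler<T>::toText(value);
}
```

Adding a new type then costs nothing if the default fits, and one small specialization if it does not. That is the kind of maintenance cost I was aiming for.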
The name-of-the-game is "simplified maintenance when adding new types" (or at least, it was for me). Like a good Term Paper, it is a very good idea if you rewrite, rewrite, rewrite, to hold-or-increase your functionality as you constantly remove the code required to maintain the system (e.g., minimize the effort required to adapt new types to your existing Variant infrastructure).
Good luck!
Done something similar.
You could add another byte to the "header", indicating the type it's really storing.
Example in a C-style programming language:
This is only a suggestion, and it's not tested.
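One possible reading of the two-byte header, as a rough sketch only: the first byte is the type the language currently sees, the second is the type the payload really holds, so a cast can simply relabel the header and defer the conversion until the value is read. All names here (TwoByteVariant, exposedType, storedType, makeInt, castTo, readDouble) are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical type ids; a real system would have many more.
enum TypeId : std::uint8_t { TYPE_INT = 0, TYPE_DOUBLE = 1 };

struct TwoByteVariant {
    std::uint8_t exposedType;  // what the language sees after any cast
    std::uint8_t storedType;   // what the payload really holds
    union {
        std::int64_t i;
        double d;
    } payload;
};

inline TwoByteVariant makeInt(std::int64_t value) {
    TwoByteVariant v;
    v.exposedType = TYPE_INT;
    v.storedType = TYPE_INT;
    v.payload.i = value;
    return v;
}

// Casting only relabels the header; the payload is left untouched.
inline void castTo(TwoByteVariant& v, TypeId target) {
    v.exposedType = target;
}

// The conversion happens on read, driven by the stored type.
inline double readDouble(const TwoByteVariant& v) {
    return v.storedType == TYPE_INT ? static_cast<double>(v.payload.i)
                                    : v.payload.d;
}
```

Whether deferring the conversion like this pays off depends on how often values are cast but never read.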
Maybe you've already done the calculation, but the amount of memory you need for the lookup table isn't that much.
If you just need to check if types are compatible, then you need (256*256)/2 bits. This requires 4k of memory.
If you also need a pointer to a conversion function, then you need (256*256)/2 pointers. This requires 128k of memory on a 32-bit machine and 256k on a 64-bit machine. If you're willing to do some low-level address layout, you can probably get that down to 64k on both 32-bit and 64-bit machines.
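The arithmetic behind those figures can be checked at compile time. The constant names below are made up for this sketch, the 64k figure assumes the "low-level address layout" means 2-byte table entries, and the last constant is an extra data point: a full table covering only the 20 types the question says are in use.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTypeCount = 256;
constexpr std::size_t kPairs = kTypeCount * kTypeCount / 2;  // 32768 entries

// One bit per pair: 32768 bits.
constexpr std::size_t kBitTableBytes = kPairs / 8;
// One function pointer per pair, 4-byte and 8-byte pointers.
constexpr std::size_t kPtrTableBytes32 = kPairs * 4;
constexpr std::size_t kPtrTableBytes64 = kPairs * 8;
// Assumption: 2-byte entries (offsets or indices) instead of full pointers.
constexpr std::size_t kCompactTableBytes = kPairs * sizeof(std::uint16_t);
// Full 20 x 20 table of 8-byte pointers for only the types in use.
constexpr std::size_t kUsedTypesTableBytes64 = 20 * 20 * 8;

static_assert(kBitTableBytes == 4 * 1024, "4k for the compatibility bits");
static_assert(kPtrTableBytes32 == 128 * 1024, "128k with 32-bit pointers");
static_assert(kPtrTableBytes64 == 256 * 1024, "256k with 64-bit pointers");
static_assert(kCompactTableBytes == 64 * 1024, "64k with 2-byte entries");
static_assert(kUsedTypesTableBytes64 == 3200, "3200 bytes for 20 types");
```

So if the table is sized for the 20 types actually in use rather than all 256 header values, it fits in a few kilobytes even with full 64-bit pointers.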