Which data structure should I choose for genericity and safety?
Say I have a long data structure definition:
data A = A {
x1 :: String
, x2 :: String
...
, x50 :: String
}
Now I have three tasks:
- create a draft instance of A like A { x1 = "this is x1", ... }
- create an instance of A from some other data structure
- create another data instance from an instance of A
All three tasks involve tedious copying of the labels x1, ..., x50.
A better solution might be a generic list
[
Foo "x1" aValue1
, Foo "x2" aValue2
...
]
because it would make traversal and creating a draft much easier (the list definition is the draft already). The downside is that mapping other data structures to and from this would be more dangerous, since you lose static type checking.
Does this make sense?
Is there a generic but safe solution?
Edit: To give you a better idea, it's about mapping business data to textual representation like forms and letters. E.g.:
data TaxData = TaxData {
taxId :: String
, income :: Money
, taxPayed :: Money
, isMarried :: Bool
...
}
data TaxFormA = TaxFormA {
taxId :: Text
, isMarried :: Text
...
}
data TaxFormB = TaxFormB {
taxId :: Text
, taxPayedRounded :: Text
...
}
Those get transformed into a stream of text representing the actual forms. If I created a form from the tax data in one pass and a form field moved the next year, there would be, for example, a stray "0.0" and I would not know where it belongs. That's what the intermediate data structure is for: it makes it easy to create draft data.
So I need to map the actual TaxData to the intermediate form data, map the form data to the textual representation of the form, and create draft intermediate form data. On the one hand I hate repeating those data labels; on the other hand it gives me safety, because I can't confuse any labels while mapping. Is there a silver bullet?
Comments (1)
Deeply structured data like this is most idiomatically expressed in Haskell as nested, algebraic data types, as you have done. Why? It gives the most type structure and safety to the data, preventing functions from putting the data into the wrong format. Further safety can be gained by newtyping some of the types, to increase the differences between data in each field.
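To illustrate the newtype suggestion, here is a minimal sketch. The wrapper names (TaxId, Income, TaxPaid) and the helper renderTaxPaid are hypothetical, and Double stands in for the question's Money type, whose definition is not shown.

```haskell
-- Hypothetical wrappers: each field gets its own type, so two amounts
-- can no longer be swapped silently.
newtype TaxId   = TaxId String   deriving (Show, Eq)
newtype Income  = Income Double  deriving (Show, Eq)  -- Double is an assumption for Money
newtype TaxPaid = TaxPaid Double deriving (Show, Eq)

data TaxData = TaxData
  { taxId     :: TaxId
  , income    :: Income
  , taxPayed  :: TaxPaid
  , isMarried :: Bool
  } deriving (Show, Eq)

-- Hypothetical renderer for a "rounded tax paid" form field.
-- It accepts only a TaxPaid value, never an Income.
renderTaxPaid :: TaxPaid -> String
renderTaxPaid (TaxPaid amount) = show (round amount :: Integer)

-- renderTaxPaid (income someData)   -- rejected by the type checker
-- renderTaxPaid (taxPayed someData) -- accepted
```

The cost is some wrapping and unwrapping at the edges; the gain is that mixing up two fields of the same underlying type becomes a compile-time error.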
However, very large ADTs like this can be unwieldy to name and manipulate. Specifying such a large ADT is a common situation in compiler design, for example, and to help write the code for a compiler we tend to use a lot of generic programming tricks: SYB, meta-programming, even Template Haskell, to generate all the boilerplate we need.
So, in summary, I'd keep the ADT approach you are taking, but look at using generic programming or metaprogramming (e.g. SYB or Template Haskell) to generate some of your definitions and helper functions.
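As a rough sketch of the SYB-style route, the Data.Data module in base can read the field labels from the record definition, so they do not have to be repeated by hand. The helpers fieldNames, fieldValues and labelled are hypothetical names, and the form is cut down to two String fields (the question uses Text) so that the example needs nothing beyond base.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, constrFields, gmapQ, toConstr)
import Data.Typeable (cast)

-- Cut-down version of the question's TaxFormA; String instead of Text
-- is an assumption made to avoid a dependency on the text package.
data TaxFormA = TaxFormA
  { taxId     :: String
  , isMarried :: String
  } deriving (Show, Data)

-- Field labels, taken from the record definition itself.
fieldNames :: Data a => a -> [String]
fieldNames = constrFields . toConstr

-- Field values in declaration order; a field that is not a String
-- yields Nothing instead of failing.
fieldValues :: Data a => a -> [Maybe String]
fieldValues x = gmapQ cast x

-- The "generic list" view from the question, derived from the typed record.
labelled :: Data a => a -> [(String, Maybe String)]
labelled x = zip (fieldNames x) (fieldValues x)
```

With these definitions, `labelled (TaxFormA "12-345" "yes")` should evaluate to `[("taxId",Just "12-345"),("isMarried",Just "yes")]`. The record stays the statically checked source of truth, and the label/value list is derived from it only where traversal is needed.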