Handling incremental data modeling changes in functional programming

Posted on 2024-09-01 05:59:35

Most of the problems I have to solve in my job as a developer have to do with data modeling.
For example, in an OOP web application world I often have to change the data properties of an object to meet new requirements.

If I'm lucky I don't even need to programmatically add new "behavior" code (functions, methods). Instead I can declaratively add validation and even UI options by annotating the property (Java).

In Functional Programming it seems that adding new data properties requires lots of code changes because of pattern matching and data constructors (Haskell, ML).

How do I minimize this problem?

This seems to be a recognized problem, as Xavier Leroy states nicely on page 24 of "Objects and Classes vs. Modules". To summarize for those who don't have a PostScript viewer: it basically says that FP languages are better than OOP languages for adding new behavior over data objects, but OOP languages are better for adding new data objects/properties.

Are there any design patterns used in FP languages to help mitigate this problem?

I have read Philip Wadler's recommendation of using monads to help with this modularity problem, but I'm not sure I understand how.

Comments (5)

情深已缘浅 2024-09-08 05:59:36

As Darius Bacon noted, this is essentially the expression problem, a long-standing issue with no universally accepted solution. The lack of a best-of-both-worlds approach doesn't stop us from sometimes wanting to go one way or the other, though. Now, you asked for a "design pattern for functional languages", so let's take a shot at it. The example that follows is written in Haskell, but isn't necessarily idiomatic for Haskell (or any other language).

First, a quick review of the "expression problem". Consider the following algebraic data type:

data Expr a = Lit a | Sum (Expr a) (Expr a)

exprEval (Lit x) = x
exprEval (Sum x y) = exprEval x + exprEval y

exprShow (Lit x) = show x
exprShow (Sum x y) = unwords ["(", exprShow x, "+", exprShow y, ")"]

This represents simple mathematical expressions, containing only literal values and addition. With the functions we have here, we can take an expression and evaluate it, or show it as a String. Now, say we want to add a new function--say, map a function over all the literal values:

exprMap f (Lit x) = Lit (f x)
exprMap f (Sum x y) = Sum (exprMap f x) (exprMap f y)

Easy! We can keep writing functions all day without breaking a sweat! Algebraic data types are awesome!

In fact, they're so awesome, we want to make our expression type more, errh, expressive. Let's extend it to support multiplication, we'll just... uhh... oh dear, that's going to be awkward, isn't it? We have to modify every function we just wrote. Despair!
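To make the awkwardness concrete, here is a minimal sketch of the same type after adding a Prod constructor. The type signatures and the name Prod are assumptions added for illustration; the point is only that each of the three existing functions gains an equation.

```haskell
-- Sketch: the Expr type from above with one new constructor, Prod.
data Expr a
  = Lit a
  | Sum (Expr a) (Expr a)
  | Prod (Expr a) (Expr a)  -- the new case

-- Every existing function needs one more equation.
exprEval :: Num a => Expr a -> a
exprEval (Lit x)    = x
exprEval (Sum x y)  = exprEval x + exprEval y
exprEval (Prod x y) = exprEval x * exprEval y  -- added

exprShow :: Show a => Expr a -> String
exprShow (Lit x)    = show x
exprShow (Sum x y)  = unwords ["(", exprShow x, "+", exprShow y, ")"]
exprShow (Prod x y) = unwords ["(", exprShow x, "*", exprShow y, ")"]  -- added

exprMap :: (a -> b) -> Expr a -> Expr b
exprMap f (Lit x)    = Lit (f x)
exprMap f (Sum x y)  = Sum (exprMap f x) (exprMap f y)
exprMap f (Prod x y) = Prod (exprMap f x) (exprMap f y)  -- added
```

One new constructor, three touched functions; with more functions the number of edits grows accordingly.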

In fact, maybe extending the expressions themselves is more interesting than adding functions that use them. So, let's say we're willing to make the trade-off in the other direction. How might we do that?

Well, no sense doing things halfway. Let's up-end everything and invert the whole program. What does that mean? Well, this is functional programming, and what's more functional than higher-order functions? What we'll do is replace the data type representing expression values with one representing actions on the expression. Instead of choosing a constructor we'll need a record of all possible actions, something like this:

data Actions a = Actions {
    actEval :: a,
    actMap  :: (a -> a) -> Actions a }

So how do we create an expression without a data type? Well, our functions are data now, so I guess our data needs to be functions. We'll make "constructors" using regular functions, returning a record of actions:

mkLit x = Actions x (\f -> mkLit (f x))

mkSum x y = Actions 
    (actEval x + actEval y) 
    (\f -> mkSum (actMap x f) (actMap y f))

Can we add multiplication more easily now? Sure can!

mkProd x y = Actions 
    (actEval x * actEval y) 
    (\f -> mkProd (actMap x f) (actMap y f))

Oh, but wait--we forgot to add an actShow action earlier, let's add that in, we'll just... errh, well.
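The mirror-image cost can be sketched the same way. Assuming actShow is added as a plain String field (the field's type and the explicit signatures are assumptions made for this illustration), every "constructor" written so far has to be edited to supply it:

```haskell
-- Sketch: the Actions record from above with one extra field, actShow.
data Actions a = Actions
  { actEval :: a
  , actMap  :: (a -> a) -> Actions a
  , actShow :: String  -- the new action
  }

-- Every "constructor" now has to supply the new field.
mkLit :: Show a => a -> Actions a
mkLit x = Actions x (\f -> mkLit (f x)) (show x)

mkSum :: Num a => Actions a -> Actions a -> Actions a
mkSum x y = Actions
  (actEval x + actEval y)
  (\f -> mkSum (actMap x f) (actMap y f))
  (unwords ["(", actShow x, "+", actShow y, ")"])

mkProd :: Num a => Actions a -> Actions a -> Actions a
mkProd x y = Actions
  (actEval x * actEval y)
  (\f -> mkProd (actMap x f) (actMap y f))
  (unwords ["(", actShow x, "*", actShow y, ")"])
```

So the trade-off has simply flipped: new kinds of expression are cheap, new actions touch everything.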

At any rate, what does it look like to use the two different styles?

expr1plus1 = Sum (Lit 1) (Lit 1)
action1plus1 = mkSum (mkLit 1) (mkLit 1)
action1times1 = mkProd (mkLit 1) (mkLit 1)

Pretty much the same, when you're not extending them.

As an interesting side note, consider that in the "actions" style, the actual values in the expression are completely hidden--the actEval field only promises to give us something of the correct type; how it provides it is its own business. Thanks to lazy evaluation, the contents of the field may even be an elaborate computation, performed only on demand. An Actions a value is completely opaque to external inspection, presenting only the defined actions to the outside world.
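A small sketch of that opacity, under the assumption of a hypothetical extra "constructor" mkSumOf that keeps a plain list internally (it is not part of the code above). Callers see the same two actions either way, and laziness means a literal is only forced if something actually needs it:

```haskell
data Actions a = Actions
  { actEval :: a
  , actMap  :: (a -> a) -> Actions a
  }

mkLit :: a -> Actions a
mkLit x = Actions x (\f -> mkLit (f x))

mkSum :: Num a => Actions a -> Actions a -> Actions a
mkSum x y = Actions
  (actEval x + actEval y)
  (\f -> mkSum (actMap x f) (actMap y f))

-- Hypothetical "constructor" with a different hidden representation: a list.
mkSumOf :: Num a => [a] -> Actions a
mkSumOf xs = Actions (sum xs) (\f -> mkSumOf (map f xs))

-- Built differently, but indistinguishable through the actions.
viaTree, viaList :: Actions Int
viaTree = mkSum (mkLit 1) (mkSum (mkLit 2) (mkLit 3))
viaList = mkSumOf [1, 2, 3]

-- Laziness: the undefined literal is never forced, because the mapped
-- function ignores its argument.
neverForced :: Int
neverForced = actEval (actMap (mkLit undefined) (const 5))
```

Both viaTree and viaList evaluate to the same result, and mapping a function over either works without the caller knowing how the value was assembled.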

This programming style--replacing simple data with a bundle of "actions" while hiding the actual implementation details in a black box, using constructor-like functions to build new bits of data, being able to interchange very different "values" with the same set of "actions", and so on--is interesting. There's probably a name for it, but I can't quite seem to recall...

旧时光的容颜 2024-09-08 05:59:36

I've heard this complaint more than a few times, and it always confuses me. The questioner wrote:

In Functional Programming it seems
that adding new data properties
requires lots of code changes because
of pattern matching and data
constructors (Haskell, ML).

But this is by and large a feature, and not a bug! When you change the possibilities in a variant, for example, the code that accesses that variant via pattern-matching is forced to consider the fact that new possibilities have arisen. This is useful, because indeed you do need to consider whether that code needs to change to react to the semantic changes in the types it manipulates.

I would argue with the claim that "lots of code changes" are required. With well-written code, the type system usually does an impressively good job of bringing to the fore the code that needs to be considered, and not much more.
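As a rough illustration of the compiler doing that work, here is a sketch with GHC's -Wincomplete-patterns warning switched on. The Payment type and its functions are hypothetical names invented for this example.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- Hypothetical variant type; Transfer is the newly added case.
data Payment
  = Cash
  | Card String
  | Transfer String

describe :: Payment -> String
describe Cash           = "cash"
describe (Card number)  = "card ending in " ++ lastFour number
describe (Transfer ref) = "bank transfer " ++ ref
-- Deleting the Transfer equation makes GHC warn that the patterns of
-- describe are non-exhaustive, pointing at exactly this function.

-- A catch-all equation opts out of that check: this function keeps
-- compiling silently when new constructors appear.
isElectronic :: Payment -> Bool
isElectronic Cash = False
isElectronic _    = True

-- Last four characters of a string (the whole string if it is shorter).
lastFour :: String -> String
lastFour s = drop (length s - 4) s
```

The warning only helps where the cases are spelled out, so wildcard patterns on types that are expected to grow are worth a second look.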

Perhaps the problem here is that it's hard to answer the question without a more concrete example. Consider providing a piece of code in Haskell or ML that you're not sure how to evolve cleanly. I imagine you'll get more precise and useful answers that way.

神妖 2024-09-08 05:59:36

This tradeoff is known in the programming-language-theory literature as the expression problem:

The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).

Solutions have been put forward, but I haven't studied them. (Much discussion at Lambda The Ultimate.)
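For readers who want a taste of one such proposal, below is a minimal sketch of the encoding often called "tagless final", where the cases become type class methods instead of constructors. It is offered as an illustration only, not as something this answer endorses; all names (ExprSym, MulSym, Eval, Pretty, Depth) are made up for the example.

```haskell
-- Cases are class methods rather than data constructors.
class ExprSym repr where
  lit :: Int -> repr
  add :: repr -> repr -> repr

-- Each function over expressions is an interpretation (an instance).
newtype Eval = Eval { runEval :: Int }

instance ExprSym Eval where
  lit = Eval
  add (Eval x) (Eval y) = Eval (x + y)

newtype Pretty = Pretty { runPretty :: String }

instance ExprSym Pretty where
  lit n = Pretty (show n)
  add (Pretty x) (Pretty y) = Pretty ("(" ++ x ++ " + " ++ y ++ ")")

-- Adding a case: a new class plus instances; nothing above is edited.
class MulSym repr where
  mul :: repr -> repr -> repr

instance MulSym Eval where
  mul (Eval x) (Eval y) = Eval (x * y)

instance MulSym Pretty where
  mul (Pretty x) (Pretty y) = Pretty ("(" ++ x ++ " * " ++ y ++ ")")

-- Adding a function: a new type plus instances; again nothing is edited.
newtype Depth = Depth { runDepth :: Int }

instance ExprSym Depth where
  lit _ = Depth 1
  add (Depth x) (Depth y) = Depth (1 + max x y)

instance MulSym Depth where
  mul (Depth x) (Depth y) = Depth (1 + max x y)

-- One expression, usable at every interpretation.
example :: (ExprSym repr, MulSym repr) => repr
example = mul (add (lit 1) (lit 2)) (lit 4)
```

Here runEval example, runPretty example and runDepth example all interpret the same term. The price is that expressions are no longer ordinary values you can pattern match on.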

酒解孤独 2024-09-08 05:59:36

In Haskell, at least, I would make an abstract data type. That is, create a type that doesn't export its constructors. Users of the type lose the ability to pattern match on it, and you have to provide functions for working with it. In return, you get a type that is easier to modify without changing code written by its users.
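A minimal sketch of that idea follows. The Person type and its functions are hypothetical; to keep the snippet loadable as a single file, the export list is shown in a comment rather than as a real module header.

```haskell
-- In a real project this would be its own file, starting with an export
-- list that names the type but not its constructor or fields:
--
--   module Person (Person, mkPerson, personName, personAge) where

data Person = Person
  { pName :: String
  , pAge  :: Int
  }

-- Smart constructor: with the export list above, this is the only way
-- for client code to build a Person.
mkPerson :: String -> Int -> Maybe Person
mkPerson name age
  | null name || age < 0 = Nothing
  | otherwise            = Just (Person name age)

-- Accessor functions stand in for pattern matching.
personName :: Person -> String
personName = pName

personAge :: Person -> Int
personAge = pAge
```

If a field is later added to Person, or the representation changes altogether, only this module needs editing as long as the exported functions keep their types.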

千鲤 2024-09-08 05:59:36

If the new data implies no new behaviour, as in an application where we are asked to add a "birthdate" field to a "person" resource and all we have to do is add it to the list of fields that make up that resource, then the problem is easy to solve in both the functional and the OOP world. Just don't treat "birthdate" as part of your code; it's just part of your data.

Let me explain: if the birthdate implies different application behaviour, e.g. we do something differently if the person is underage, then in OOP we would add a birthdate field to the person class, and in FP we would similarly add a birthdate field to a person data structure.
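On the FP side, a sketch of what that can look like with Haskell record syntax. The Person record is hypothetical, and birthYear is used instead of a full date purely to keep the example short. Functions that go through field names are unaffected by the new field; only construction sites and the new behaviour need code.

```haskell
-- Hypothetical record; birthYear is the newly added field.
data Person = Person
  { name      :: String
  , email     :: String
  , birthYear :: Int  -- new
  }

-- Written before the change: field-based patterns and accessors still
-- compile, because they never mention the other fields.
greeting :: Person -> String
greeting (Person { name = n }) = "Hello, " ++ n

contact :: Person -> String
contact p = name p ++ " <" ++ email p ++ ">"

-- New behaviour that actually depends on the new field.
isUnderage :: Int -> Person -> Bool
isUnderage currentYear p = currentYear - birthYear p < 18

-- Construction sites are the places that must change.
ada :: Person
ada = Person { name = "Ada", email = "ada@example.com", birthYear = 1990 }
```

Positional patterns such as Person n e would have broken here, which is one reason to prefer field names for records that are expected to grow.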

If there is no behaviour attached to "birthdate", then there should be no field named "birthdate" in the code. A data structure such as a dictionary (a map) would hold the various fields. Adding a new one would require no program changes, whether the code is OOP or FP. Validations would be added similarly, by attaching a validation regexp or using a similar little validation language to express in data what the validation behaviour should be.
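A rough sketch of that data-driven style, with assumptions stated up front: association lists stand in for a proper map to avoid extra dependencies, and a tiny hypothetical Rule type stands in for regexps. In a real system the schema would more likely be loaded from configuration than written inline.

```haskell
import Data.Char (isDigit)

-- A tiny validation "language" expressed as data (hypothetical rules).
data Rule
  = Required
  | MaxLength Int
  | DigitsAndDashes

-- Field names with their rules, and a record as plain key/value pairs.
type Schema = [(String, [Rule])]
type Record = [(String, String)]

personSchema :: Schema
personSchema =
  [ ("name",      [Required, MaxLength 50])
  , ("birthdate", [DigitsAndDashes])  -- the new field: one line of data
  ]

-- Interpret a single rule against a possibly missing value.
check :: Rule -> Maybe String -> Bool
check Required        value = maybe False (not . null) value
check (MaxLength n)   value = maybe True ((<= n) . length) value
check DigitsAndDashes value = maybe True (all (\c -> isDigit c || c == '-')) value

-- Names of the fields that fail at least one of their rules.
invalidFields :: Schema -> Record -> [String]
invalidFields schema record =
  [ field
  | (field, rules) <- schema
  , not (all (\rule -> check rule (lookup field record)) rules)
  ]
```

Nothing in check or invalidFields mentions "birthdate"; the field exists only in personSchema and in the records themselves.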
