F# 柯里化效率?

发布于 2024-08-30 14:46:06 字数 718 浏览 8 评论 0原文

我有一个如下所示的函数:

let isInSet setElems normalize p = 
        normalize p |> (Set.ofList setElems).Contains

该函数可用于快速检查某个元素在语义上是否是某个集合的一部分;例如,检查文件路径是否属于 html 文件:

let getLowerExtension p = (Path.GetExtension p).ToLowerInvariant()
let isHtmlPath = isInSet [".htm"; ".html"; ".xhtml"] getLowerExtension

但是,当我使用上述函数时,性能很差,因为对“isInSet”中编写的函数体的评估似乎被延迟,直到所有参数都被设置为止。已知 - 特别是,每次执行 isHtmlPath 时都会重新评估诸如 (Set.ofList setElems).Contains 之类的不变位。

如何才能最好地保持 F# 的简洁、可读性,同时仍然获得更有效的行为(其中预先评估集合构造)。

以上只是一个例子;我正在寻找一种通用方法,避免让我陷入实现细节的困境 - 尽可能 我希望避免被实现的执行顺序等细节分散注意力,因为这通常对我来说并不重要,并且这有点破坏了函数式编程的一个主要卖点。

I have a function that looks as follows:

let isInSet setElems normalize p = 
        normalize p |> (Set.ofList setElems).Contains

This function can be used to quickly check whether an element is semantically part of some set; for example, to check if a file path belongs to an html file:

let getLowerExtension p = (Path.GetExtension p).ToLowerInvariant()
let isHtmlPath = isInSet [".htm"; ".html"; ".xhtml"] getLowerExtension

However, when I use a function such as the above, performance is poor since evaluation of the function body as written in "isInSet" seems to be delayed until all parameters are known - in particular, invariant bits such as (Set.ofList setElems).Contains are reevaluated each execution of isHtmlPath.

How can best I maintain F#'s succint, readable nature while still getting the more efficient behavior in which the set construction is preevaluated.

The above is just an example; I'm looking for a general approach that avoids bogging me down in implementation details - where possible I'd like to avoid being distracted by details such as the implementation's execution order since that's usually not important to me and kind of undermines a major selling point of functional programming.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

記憶穿過時間隧道 2024-09-06 14:46:06

只要 F# 不区分纯代码和不纯代码,我怀疑我们会看到这种优化。但是,您可以使柯里化变得明确。

let isInSet setElems =
    let set = Set.ofList setElems
    fun normalize p -> normalize p |> set.Contains

isHtmlSet 现在将仅调用 isInSet 一次来获取闭包,同时执行 ofList

As long as F# doesn't differentiate between pure and impure code, I doubt we'll see optimisations of that kind. You can, however, make the currying explicit.

let isInSet setElems =
    let set = Set.ofList setElems
    fun normalize p -> normalize p |> set.Contains

isHtmlSet will now call isInSet only once to obtain the closure, at the same time executing ofList.

墨离汐 2024-09-06 14:46:06

Kha 的答案展示了如何直接使用闭包来手动优化代码。如果这是您需要经常使用的频繁模式,则还可以定义一个高阶函数,该函数从两个函数构造高效代码 - 第一个函数对某些函数进行预处理参数和第二个参数,它在获取剩余参数后进行实际处理。

代码看起来像这样:

let preProcess finit frun preInput =  
  let preRes = finit preInput
  fun input -> frun preRes input

let f : string list -> ((string -> string) * string) -> bool =
  preProcess 
    Set.ofList                           // Pre-processing of the first argument
    (fun elemsSet (normalize, p) ->      // Implements the actual work to be 
      normalize p |> elemsSet.Contains) // .. done once we get the last argument

这是否更优雅是一个问题...

另一个(疯狂的)想法是您可以为此使用计算表达式。允许您执行此操作的计算构建器的定义非常不标准(这不是人们通常用它们做的事情,并且它与 monad 或任何其他理论没有任何关系)。但是,应该可以这样写:

type CurryBuilder() = 
  member x.Bind((), f:'a -> 'b) = f
  member x.Return(a) = a
let curry = new CurryBuilder()

curry 计算中,您可以使用 let! 来表示您想要获取函数的下一个参数(在求值之后)前面的代码):

let f : string list -> (string -> string) -> string -> bool = curry {
  let! elems = ()
  let elemsSet = Set.ofList elems
  printf "elements converted"
  let! normalize = ()
  let! p = ()
  printf "calling"
  return normalize p |> elemsSet.Contains }

let ff = f [ "a"; "b"; "c" ] (fun s -> s.ToLower()) 
// Prints 'elements converted' here
ff "C"
ff "D"
// Prints 'calling' two times

以下是一些资源,其中包含有关计算表达式的更多信息:

The answer from Kha shows how to optimize the code manually by using closures directly. If this is a frequent pattern that you need to use often, it is also possible to define a higher-order function that constructs the efficient code from two functions - the first one that does pre-processing of some arguments and a second one which does the actual processing once it gets the remaining arguments.

The code would look like this:

let preProcess finit frun preInput =  
  let preRes = finit preInput
  fun input -> frun preRes input

let f : string list -> ((string -> string) * string) -> bool =
  preProcess 
    Set.ofList                           // Pre-processing of the first argument
    (fun elemsSet (normalize, p) ->      // Implements the actual work to be 
      normalize p |> elemsSet.Contains) // .. done once we get the last argument

It is a question whether this is more elegant though...

Another (crazy) idea is that you could use computation expressions for this. The definition of computation builder that allows you to do this is very non-standard (it is not something that people usually do with them and it isn't in any way related to monads or any other theory). However, it should be possible to write this:

type CurryBuilder() = 
  member x.Bind((), f:'a -> 'b) = f
  member x.Return(a) = a
let curry = new CurryBuilder()

In the curry computation, you can use let! to denote that you want to take the next argument of the function (after evaluating the preceeding code):

let f : string list -> (string -> string) -> string -> bool = curry {
  let! elems = ()
  let elemsSet = Set.ofList elems
  printf "elements converted"
  let! normalize = ()
  let! p = ()
  printf "calling"
  return normalize p |> elemsSet.Contains }

let ff = f [ "a"; "b"; "c" ] (fun s -> s.ToLower()) 
// Prints 'elements converted' here
ff "C"
ff "D"
// Prints 'calling' two times

Here are some resources with more information about computation expressions:

谁把谁当真 2024-09-06 14:46:06

@Kha 的回答很正确。 F# 无法重写

// effects of g only after both x and y are passed
let f x y =
    let xStuff = g x
    h xStuff y

// effects of g once after x passed, returning new closure waiting on y
let f x =
    let xStuff = g x
    fun y -> h xStuff y

除非它知道 g 没有效果,并且在当今的 .NET Framework 中,通常无法推断 99% 的表达式的效果。这意味着程序员仍然负责如上所述明确地编码评估顺序。

@Kha's answer is spot on. F# cannot rewrite

// effects of g only after both x and y are passed
let f x y =
    let xStuff = g x
    h xStuff y

into

// effects of g once after x passed, returning new closure waiting on y
let f x =
    let xStuff = g x
    fun y -> h xStuff y

unless it knows that g has no effects, and in the .NET Framework today, it's usually impossible to reason about the effects of 99% of all expressions. Which means the programmer is still responsible for explicitly coding evaluation order as above.

回梦 2024-09-06 14:46:06
  1. 柯里化不会造成伤害。柯里化有时也会引入闭包。他们通常也很有效率。
    参考我之前问过的这个问题。如有必要,您可以使用内联来提高性能。

  2. 但是,示例中的性能问题主要是由于您的代码造成的:

    标准化 p |> (Set.ofList setElems).Contains

这里你需要执行 Set.ofList setElems 即使你柯里化它。它花费 O(n log n) 时间。
您现在需要将 setElems 的类型更改为 F# Set,而不是 List。顺便说一句,对于小型集合,即使对于查询,使用列表也比集合更快。

  1. Currying does not hurt. Currying sometimes introduces closures as well. They are usually efficient too.
    refer to this question I asked before. You can use inline to boost performance if necessary.

  2. However, your performance problem in the example is mainly due to your code:

    normalize p |> (Set.ofList setElems).Contains

here you need to perform Set.ofList setElems even you curry it. It costs O(n log n) time.
You need to change the type of setElems to F# Set, not List now. Btw, for small set, using lists is faster than sets even for querying.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文