fromJust 定义中的模式匹配
Data.Maybe
中的函数 fromJust
是这样定义的:
fromJust :: Maybe a -> a
fromJust Nothing = error "Maybe.fromJust: Nothing"
fromJust (Just x) = x
根据我对模式匹配(匹配从上到下进行)的理解,我会改变两者的顺序定义。因为通常 Nothing 部分在 it-is-sure-a-Just 情况下不匹配,但在达到第二个定义之前总是会检查它。
您能澄清我的推理错误吗?谢谢。
编辑:
示例: 假设我有一个文件,每行包含一百万个 Int
类型的数字,并且您在我的程序中需要这些数字(作为 Int
,而不是 String
)对于其他东西。
import qualified Data.ByteString.Lazy.Char8 as L
readInt = fst . fromJust . L.readInt
-- more stuff
根据上面的 fromJust
定义,我需要更多时间来读取数字,不是吗?
The function fromJust
in Data.Maybe
is defined in this way:
fromJust :: Maybe a -> a
fromJust Nothing = error "Maybe.fromJust: Nothing"
fromJust (Just x) = x
According to my understanding of pattern matching (matching proceeds from top to bottom) I would change the order of the two definitions. Since usually the Nothing-part doesn't match in a it-is-sure-a-Just situation, but it is always checked before the second definition is reached.
Can you please clarify my error in reasoning? Thanks.
Edit:
Example:
Suppose I have a file with a million numbers of type Int
per line and y need this numbers (as Int
, not String
) in my program for other stuff.
import qualified Data.ByteString.Lazy.Char8 as L
readInt = fst . fromJust . L.readInt
-- more stuff
With the above definition of fromJust
I would need more time to read the numbers, wouldn't I?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我认为这个问题是关于性能的。虽然语义模式匹配是从上到下测试的,但大多数 Haskell 编译器会将 ADT 构造函数上的匹配优化为相当于 C
switch
语句。您可以认为 ADT 的数据表示有一个“标签”,它说明它是使用哪个构造函数创建的,以及每个参数的指针。例如,
Nothing
可能表示为0 (null)
,Just 42
表示为1(指向 42 的指针)
代码>.然后在这样的函数中:
编译器将建立一个决策树:
其中每个标记都是由跳转表或其他东西计算的,因此将
0
作为1 检查并不昂贵
。请注意,无论我们定义中模式的原始顺序是什么,都会生成相同的决策树。然而,当使用防护而不是在构造函数上匹配时,很可能会从上到下检查模式(编译器必须非常聪明才能优化它)。因此,如果我们以这种神秘的方式编写
fromJust
:那么这可能会依次检查每个构造函数,并且我们可以通过切换案例的顺序来优化。幸运的是,以一种重要的方式编写是很麻烦的:-)。
I think this question is about performance. While semantically pattern matching is tested top-to-bottom, most Haskell compilers will optimize matching on an ADT's constructors to the equivalent of a C
switch
statement.You can think of the data representation for an ADT has a "tag" which says which constructor it was made with, together with a pointer for each argument. So for example,
Nothing
might be represented as0 (null)
andJust 42
represented as1 (pointer to 42)
.Then in a function like this:
The compiler would set up a decision tree:
Where each tag is computed by a jump table or something, so it is no more expensive to check against
0
as1
. Note that this same decision tree would be made no matter what the original order of the patterns in our definition.However, when using guards instead of matching on a constructor, the patterns are most likely checked from top to bottom (the compiler would have to be very smart to optimize that). So if we wrote
fromJust
in this arcane way:Then this would probably check for each constructor in turn, and we could optimize by switching the order of the cases. Fortunately, writing in a way that this matters is cumbersome :-).
这里需要意识到两件重要的事情:首先,Haskell 编译器完全意识到任何值只能是
Nothing
或Just x
。因此它只会测试其中之一。其次,GHC 使用指针标记,使其能够非常快速地区分指向各种构造函数的指针(详细信息)。结果,GHC 为模式匹配生成了非常好的代码。下面是
fromJust
返回代码的样子,直接从程序集转储中获取:它获取返回值,然后使用位掩码操作提取标记(二进制中的 7 = 111)。如果此标记为 2,则代码知道它指向
Just x
构造函数。因此它会跳转到适当的代码。否则,它只会继续处理Nothing
的代码。我认为甚至可以利用程序中只有一个
Nothing
闭包的知识来进一步优化。因此,您可以用单个地址比较来替换它,从而在实际代码中保存位屏蔽操作甚至寄存器。不知道为什么 GHC 不这样做。There are two important things to realize here: First, the Haskell compiler is fully aware that any value can only either be
Nothing
orJust x
. Therefore it will only ever test for one of them. Second, GHC uses pointer tagging that allows it to distinguish between pointers pointing at various constructors very fast (details).As a result, GHC generates quite nice code for the pattern match. Here's what the
fromJust
return code looks like, taken directly from the assembly dump:This takes the return value, then extracts the tag using a bit mask operation (7 = 111 in binary). If this tag is 2, then the code knows that it points to a
Just x
constructor. Therefore it jumps to the appropriate code. Otherwise, it just continues to the code dealing withNothing
.I think this could even be optimized further using the knowledge that there is only ever one
Nothing
closure in the program. Therefore you could replace that with a single address comparison, saving the bit-masking operations and even a register in the actual code. Not sure why GHC doesn't do this.与
Nothing
匹配的内容不可能与Just
匹配。所以这两个命令都有效。如果您通过
fromJust
传递Just 5
,则Just 5
与Nothing
不匹配。但Just 5
确实匹配Just x
。无论哪种方式,它的工作原理都完全相同。只有当函数与参数之一不模式匹配时,事情才会开始变得棘手。
Something that matches
Nothing
cannot possible matchJust
. So both orders work.If you pass
Just 5
throughfromJust
,Just 5
does not matchNothing
. ButJust 5
does matchJust x
.It would work exactly the same either way. Things only start to get sticky when a function doesn't pattern match on one of the arguments.