Parsec equivalent of attoparsec's `inClass`
I am translating some code from attoparsec to Parsec, because the parser needs to produce better error messages. The attoparsec code uses `inClass` (and `notInClass`) extensively. Is there a similar function for Parsec that lets me translate the `inClass` occurrences mechanically? Hayoo and Hoogle didn't offer any insight into the matter.

`inClass :: String -> Char -> Bool`

`inClass "a-c'-)0-3-"` is equivalent to `\x -> elem x "abc'()0123-"`, but the latter is inefficient and tedious to write for large ranges.

I will reimplement the function myself if nothing else is available.
Comments (2)
There isn't any such combinator; if there were, it would be in `Text.Parsec.Char` (which is where all the standard parser combinator functions that involve `Char` are defined). You should be able to define it fairly easily.

I don't think you'll be able to get the same performance advantages attoparsec gets from its implementation, though; it relies on the internal `FastSet` type, which only works with 8-bit characters. Of course, if you don't need Unicode support, that might not be a problem, but the code for `FastSet` implies you'll get unpredictable results when passing `Char`s greater than `'\255'`, so if you want to reuse the `FastSet`-based solution, you'll at least have to read the strings you're parsing in binary mode. (You'll also have to copy the implementation of `FastSet` into your program, as it's not exported...)

If your range strings are short, then a simple solution like this is likely to be pretty fast:
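A minimal sketch of such a simple solution, not necessarily the answerer's exact code. It assumes attoparsec's range syntax (`a-c` denotes a range, and a `-` at the very start or end of the string is literal); the helper name `expandClass` is made up for illustration.

```haskell
-- Expand attoparsec-style range syntax into an explicit list of characters.
-- "a-c" becomes "abc"; a '-' at the very start or end is taken literally.
-- (expandClass is a hypothetical helper name, not part of any library.)
expandClass :: String -> String
expandClass (a:'-':b:rest) = [a .. b] ++ expandClass rest
expandClass (c:rest)       = c : expandClass rest
expandClass []             = []

-- Membership test by plain list lookup. The expanded list is bound outside
-- the returned function, so a partial application such as `inClass "a-z"`
-- can reuse it across calls.
inClass :: String -> Char -> Bool
inClass s = (`elem` set)
  where
    set = expandClass s

notInClass :: String -> Char -> Bool
notInClass s = not . inClass s
```

With these definitions, `expandClass "a-c'-)0-3-"` yields `"abc'()0123-"`, matching the equivalence given in the question.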
You could even try something like this, which should be at least as efficient as the above version (including when many calls to a single `inClass s` are made), and additionally avoids the list traversal overhead, taking care to move the recursion out of the lambda (I don't know if GHC can/will do this itself):
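A sketch of how that might look, again under the assumption of attoparsec's range syntax and not necessarily identical to the answerer's code. Each clause binds the recursive call in a `where` clause outside the lambda, so the chain of range checks is built once per class string when `inClass s` is partially applied.

```haskell
-- Build the predicate directly from the class string, without first
-- expanding ranges into a list of characters.
inClass :: String -> Char -> Bool
inClass (a:'-':b:rest) = \c -> (a <= c && c <= b) || more c
  where
    more = inClass rest          -- recursion happens outside the lambda
inClass (x:rest)       = \c -> c == x || more c
  where
    more = inClass rest
inClass []             = const False

notInClass :: String -> Char -> Bool
notInClass s = not . inClass s
```

A range such as `a-z` costs two comparisons here regardless of its width, whereas the list-based version has to walk up to 26 elements.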
No, there's no equivalent in Parsec. You have to write it yourself. I see two main options: parse the `inClass` syntax to create a `String` from it, to use with `oneOf`; or use `satisfy`.

The former is of course a special case of the latter, and if you have longer ranges in your class, it will be less efficient. But it's probably a bit easier to implement.
Writing the function by hand from the class syntax is a simple-minded possibility.
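To make the two options concrete, here is a rough sketch of both on top of Parsec's `oneOf` and `satisfy`. The names `classRanges`, `inClassOneOf` and `inClassSatisfy` are hypothetical, and the range syntax is assumed to follow attoparsec's (`a-c` is a range, and a leading or trailing `-` is literal).

```haskell
import Text.Parsec (many1, oneOf, parse, satisfy)
import Text.Parsec.String (Parser)

-- Read the class string as a list of inclusive ranges; a single character
-- becomes a one-element range. (Hypothetical helper.)
classRanges :: String -> [(Char, Char)]
classRanges (a:'-':b:rest) = (a, b) : classRanges rest
classRanges (c:rest)       = (c, c) : classRanges rest
classRanges []             = []

-- Option 1: expand the ranges into a String and hand it to oneOf.
inClassOneOf :: String -> Parser Char
inClassOneOf s = oneOf (concatMap (\(a, b) -> [a .. b]) (classRanges s))

-- Option 2: test the ranges directly inside satisfy.
inClassSatisfy :: String -> Parser Char
inClassSatisfy s = satisfy (\c -> any (\(a, b) -> a <= c && c <= b) ranges)
  where
    ranges = classRanges s
```

Either parser can then stand in wherever the attoparsec code used `satisfy (inClass "...")`, for example `parse (many1 (inClassSatisfy "a-c0-3")) "" input`.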