Parsing numbers in FParsec
I've started learning FParsec. It has a very flexible way to parse numbers; I can provide a set of number formats I want to use:
    type Number =
        | Numeral of int
        | Decimal of float
        | Hexadecimal of int
        | Binary of int

    let numberFormat = NumberLiteralOptions.AllowFraction
                       ||| NumberLiteralOptions.AllowHexadecimal
                       ||| NumberLiteralOptions.AllowBinary

    let pnumber =
        numberLiteral numberFormat "number"
        |>> fun num -> if num.IsHexadecimal then Hexadecimal (int num.String)
                       elif num.IsBinary then Binary (int num.String)
                       elif num.IsInteger then Numeral (int num.String)
                       else Decimal (float num.String)
However, the language I'm trying to parse is a bit strange. A number can be a numeral (non-negative int), a decimal (non-negative float), a hexadecimal (with prefix #x) or a binary (with prefix #b):

    numeral: 0, 2
    decimal: 0.2, 2.0
    hexadecimal: #xA04, #x611ff
    binary: #b100, #b001
Right now I have to parse twice, substituting 0 for # where necessary, in order to make use of pnumber:
    let number: Parser<_, unit> =
        let isDotOrDigit c = isDigit c || c = '.'
        let numOrDec = many1Satisfy2 isDigit isDotOrDigit
        let hexOrBin = skipChar '#' >>. manyChars (letter <|> digit) |>> sprintf "0%s"
        let str = spaces >>. numOrDec <|> hexOrBin
        str |>> fun s -> match run pnumber s with
                         | Success(result, _, _) -> result
                         | Failure(errorMsg, _, _) -> failwith errorMsg
What is a better way of parsing in this case? Or how can I alter FParsec's CharStream to make conditional parsing easier?
Comments (1)
Parsing numbers can be pretty messy if you want to generate good error messages and properly check for overflows.
The following is a simple FParsec implementation of your number parser:
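The code that accompanied this sentence is not part of the text here. As a rough sketch of what such a parser could look like (the names numeralOrDecimal, hexNumber and binaryNumber are illustrative, and the conversions are assumed to throw on overflow rather than report a parser error), one parser per format can be combined with choiceL:

```fsharp
#r "nuget: FParsec"
open FParsec

type Number =
    | Numeral of int
    | Decimal of float
    | Hexadecimal of int
    | Binary of int

// Converts a parsed literal; `int` throws an OverflowException
// when the digits do not fit into an Int32.
let toNumeralOrDecimal (num: NumberLiteral) =
    if num.IsInteger then Numeral (int num.String)
    else Decimal (float num.String)

// Only AllowFraction is set, so signs, exponents and 0x/0b prefixes are rejected.
let numeralOrDecimal : Parser<Number, unit> =
    numberLiteral NumberLiteralOptions.AllowFraction "number" |>> toNumeralOrDecimal

// "#x" followed by at least one hex digit.
// Convert.ToInt32 throws if the digits do not fit into 32 bits.
let hexNumber : Parser<Number, unit> =
    pstring "#x" >>. many1SatisfyL isHex "hex digit"
    |>> fun digits -> Hexadecimal (System.Convert.ToInt32(digits, 16))

// "#b" followed by at least one binary digit.
let binaryNumber : Parser<Number, unit> =
    pstring "#b" >>. many1SatisfyL (fun c -> c = '0' || c = '1') "binary digit"
    |>> fun digits -> Binary (System.Convert.ToInt32(digits, 2))

// Tries each format in turn; the label is used in the error message
// when none of the alternatives matches.
let number : Parser<Number, unit> =
    choiceL [ numeralOrDecimal; hexNumber; binaryNumber ] "number literal"
```

With this arrangement the input is parsed only once, and the #x and #b prefixes never have to be rewritten into 0x and 0b.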
Generating good error messages on overflows would complicate this implementation a bit, as you would ideally also need to backtrack after the error, so that the error position ends up at the start of the number literal (see the numberLiteral docs for an example).
A simple way to gracefully handle possible overflow exceptions is to use a small exception-handling combinator like the following:
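The combinator itself is not included in the text here. A minimal sketch of the idea, with the name mayThrow chosen purely for illustration, could save the stream state, run the wrapped parser, and turn any exception into a fatal parser error positioned at the start of the input it tried to parse:

```fsharp
#r "nuget: FParsec"
open FParsec

// Hypothetical helper: runs `p` and converts an exception into a parser error.
let mayThrow (p: Parser<'t, 'u>) : Parser<'t, 'u> =
    fun stream ->
        // Remember where we started so the error can point at that position.
        let state = stream.State
        try
            p stream
        with e ->
            // Catching every exception type is a blunt instrument;
            // a narrower filter (e.g. OverflowException) may be preferable.
            stream.BacktrackTo(state)
            Reply(ReplyStatus.FatalError, messageError e.Message)
```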
You could then write
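The snippet for this step is also missing from the text. One plausible way to wire things together, assuming a combinator named mayThrow (a hypothetical name, defined inline so the example is complete) and conversions that throw on overflow, is to wrap the whole choice:

```fsharp
#r "nuget: FParsec"
open FParsec

type Number =
    | Numeral of int
    | Decimal of float
    | Hexadecimal of int
    | Binary of int

// Hypothetical helper: converts an exception thrown by `p` into a fatal parser error.
let mayThrow (p: Parser<'t, 'u>) : Parser<'t, 'u> =
    fun stream ->
        let state = stream.State
        try
            p stream
        with e ->
            stream.BacktrackTo(state)
            Reply(ReplyStatus.FatalError, messageError e.Message)

let toNumeralOrDecimal (num: NumberLiteral) =
    if num.IsInteger then Numeral (int num.String)   // throws on Int32 overflow
    else Decimal (float num.String)

let numeralOrDecimal : Parser<Number, unit> =
    numberLiteral NumberLiteralOptions.AllowFraction "number" |>> toNumeralOrDecimal

let hexNumber : Parser<Number, unit> =
    pstring "#x" >>. many1SatisfyL isHex "hex digit"
    |>> fun digits -> Hexadecimal (System.Convert.ToInt32(digits, 16))

let binaryNumber : Parser<Number, unit> =
    pstring "#b" >>. many1SatisfyL (fun c -> c = '0' || c = '1') "binary digit"
    |>> fun digits -> Binary (System.Convert.ToInt32(digits, 2))

// Overflow exceptions from any alternative now surface as ordinary parse failures.
let number : Parser<Number, unit> =
    mayThrow (choiceL [ numeralOrDecimal; hexNumber; binaryNumber ] "number literal")
```

Wrapping the combined parser rather than each alternative keeps the error position at the start of the literal, whichever branch threw.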
I'm not sure what you meant to say with "alter FParsec's CharStream to be able to make conditional parsing easier", but the following sample demonstrates how you could write a low-level implementation that only uses the CharStream methods directly. While this implementation parses hex and binary numbers without the help of system methods, it eventually delegates the parsing of decimal numbers to the Int32.TryParse and Double.TryParse methods.
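The low-level sample is not included in the text here. The sketch below is an independent illustration of the same approach, not the original code: all helper names are made up, and it assumes that a trailing dot such as "2." counts as a decimal. Hex and binary digits are accumulated by hand from the CharStream, and decimal text goes to Int32.TryParse and Double.TryParse:

```fsharp
#r "nuget: FParsec"
open System.Globalization
open FParsec

type Number =
    | Numeral of int
    | Decimal of float
    | Hexadecimal of int
    | Binary of int

// Value of `c` as a digit in the given radix (2 or 16), or -1 if it is not one.
let digitValue (radix: int) (c: char) : int =
    let v =
        if c >= '0' && c <= '9' then int c - int '0'
        elif c >= 'a' && c <= 'f' then int c - int 'a' + 10
        elif c >= 'A' && c <= 'F' then int c - int 'A' + 10
        else -1
    if v < radix then v else -1

// Fatal error positioned at `start`, i.e. at the beginning of the literal.
let failAt (start: CharStreamState<unit>) (stream: CharStream<unit>) (msg: string) : Reply<Number> =
    stream.BacktrackTo(start)
    Reply(ReplyStatus.FatalError, messageError msg)

// Reads the digits that follow a "#x" or "#b" prefix, one character at a time.
let radixDigits (radix: int) (label: string) (wrap: int -> Number)
                (start: CharStreamState<unit>) (stream: CharStream<unit>) : Reply<Number> =
    let mutable value = 0L
    let mutable count = 0
    let mutable d = digitValue radix (stream.Peek())
    while d >= 0 do
        // Stop accumulating once the value no longer fits, but keep consuming digits.
        if value <= int64 System.Int32.MaxValue then
            value <- value * int64 radix + int64 d
        count <- count + 1
        d <- digitValue radix (stream.SkipAndPeek())
    if count = 0 then Reply(ReplyStatus.Error, expected label)
    elif value > int64 System.Int32.MaxValue then failAt start stream "number does not fit into an Int32"
    else Reply(wrap (int value))

// Reads "digits" or "digits.digits" and delegates the conversion to the TryParse methods.
let numeralOrDecimal (start: CharStreamState<unit>) (stream: CharStream<unit>) : Reply<Number> =
    let intPart = stream.ReadCharsOrNewlinesWhile(isDigit, false)
    if intPart.Length = 0 then Reply(ReplyStatus.Error, expected "number")
    elif stream.Skip('.') then
        let text = intPart + "." + stream.ReadCharsOrNewlinesWhile(isDigit, false)
        match System.Double.TryParse(text, NumberStyles.AllowDecimalPoint, CultureInfo.InvariantCulture) with
        | true, v -> Reply(Decimal v)
        | _ -> failAt start stream "invalid decimal literal"
    else
        match System.Int32.TryParse(intPart, NumberStyles.None, CultureInfo.InvariantCulture) with
        | true, v -> Reply(Numeral v)
        | _ -> failAt start stream "number does not fit into an Int32"

// Dispatches on the prefix using CharStream.Skip, which only advances on a match.
let number : Parser<Number, unit> =
    fun stream ->
        let start = stream.State
        if stream.Skip("#x") then radixDigits 16 "hex digit" Hexadecimal start stream
        elif stream.Skip("#b") then radixDigits 2 "binary digit" Binary start stream
        else numeralOrDecimal start stream
```

Because overflow is detected while accumulating, this version needs no exception handling, at the cost of noticeably more code than the combinator-based variant.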
As I said: it's messy.