FParsec: how to parse a date in FParsec (newbie)

Posted 2024-10-31 10:37:19

I am using Bill Casarin's post on parsing delimited files with FParsec, and I am dumbing the logic down to understand how the code works. I am parsing a multi-row delimited document into a Cell list list structure (for now), where a Cell is a string or a float. I am a complete newbie at this.

I am having trouble parsing the floats. In the typical case (a cell delimited by tabs, containing a number) it works. However, when a cell happens to be a string that starts with a number, it falls apart.

How do I modify pFloatCell so that it either parses the whole cell (all the way through to the tab) as a float or parses nothing?

Thank you

type Cell = 
    | String of string 
    | Float of float
.
.
.
let pStringCell delim = 
    manyChars (nonQuotedCellChar delim)
    |>> String

// this is my issue. pfloat parses the string one 
// char at a time, and once it starts off with a number 
// it is down that path, and errors out
let pFloatCell delim = 
    FParsec.CharParsers.pfloat
    |>> Float

let pCell delim = 
    (pFloatCell delim) <|> (pStringCell delim)
.
.
.
let ParseTab s  =
  let delim = "\t"
  let res = run (csv delim) s in
    match res with
     | Success (rows, _, _) -> { IsSuccess = true; ErrorMsg = "Ok"; Result = stripEmpty rows }
     | Failure (s, _, _) -> { IsSuccess = false; ErrorMsg = s; Result = [[]] }
.
.
.
let test() =

    let parsed = ParseTab data

Oops, it was late for me last night; I meant to post the data. This first one works:

let data = 
    "s10 Mar 2011 18:28:11 GMT\n"

while this returns an error:

let data = 
    "10 Mar 2011 18:28:11 GMT\n"

It returns the following, both with and without ChaosP's recommendation:

ErrorMsg = "Error in Ln: 1 Col: 3\r\n10 Mar 2011 18:28:11 GMT\r\n ^\r\nExpecting: end of file, newline or '\t'\r\n"

It looks as though attempt is working fine. In the second case it only grabs up to the 10, and the code for pfloat only looks as far as the first whitespace. I need to either convince pfloat that it must look all the way up to the next tab or newline, regardless of whether there is a space before it, or write my own version of pfloat using Double.Parse, but I would rather rely on the library.
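For reference, the Double.Parse route mentioned above might look roughly like the sketch below. It is not from the original post: it assumes an F# script with FParsec loaded from NuGet, uses Double.TryParse rather than Double.Parse so that non-numeric cells do not throw, and the names toCell and pCellViaTryParse are hypothetical. The character parser is a simplified stand-in for the elided nonQuotedCellChar.

```fsharp
#r "nuget: FParsec"
open System.Globalization
open FParsec

type Cell =
    | String of string
    | Float of float

// Hypothetical helper: decide what a cell is only after its full text is known.
let toCell (text: string) : Cell =
    match System.Double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture) with
    | true, value -> Float value
    | _ -> String text

// Read everything up to the delimiter or a line break, then classify it.
// noneOf (delim + "\r\n") is an assumed stand-in for nonQuotedCellChar.
let pCellViaTryParse (delim: string) : Parser<Cell, unit> =
    manyChars (noneOf (delim + "\r\n")) |>> toCell

match run (pCellViaTryParse "\t") "10 Mar 2011 18:28:11 GMT\n" with
| Success (cell, _, _) -> printfn "%A" cell
| Failure (msg, _, _) -> printfn "%s" msg
```

One thing to be aware of with this approach: NumberStyles.Float tolerates leading and trailing whitespace, so a cell such as " 10 " would be classified as a float.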

心房的律动 2024-11-07 10:37:19

Since the text you'll be parsing seems to be a bit ambiguous, you'll need to modify your pCell parser.

let sep delim =
     skipString delim <|> skipAnyOf "\r\n" <|> eof

let pCell delim = 
    attempt (pFloatCell delim .>> sep delim) <|> (pStringCell delim .>> sep delim)

This also means you'll need to modify whichever parser uses pCell.

let pCells delim =
    many (pCell delim)
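To see the idea end to end, here is a minimal, self-contained sketch that can be run against the two data strings from the question. It is a variation on the code above rather than the exact same parsers: it assumes an F# script with FParsec loaded from NuGet, replaces the elided nonQuotedCellChar with a simplified noneOf, checks the cell boundary with followedBy (so the separator is not consumed inside the cell), and uses sepBy instead of many for the row. The names cellEnd and pRow are hypothetical.

```fsharp
#r "nuget: FParsec"
open FParsec

type Cell =
    | String of string
    | Float of float

// Succeeds only if a delimiter, a newline or end of input comes next.
// followedBy checks this without consuming any input.
let cellEnd (delim: string) : Parser<unit, unit> =
    followedBy (skipString delim <|> skipNewline <|> eof)

// A float cell must span the whole cell; otherwise attempt backtracks
// to where the cell started, so the string parser can take over.
let pFloatCell (delim: string) : Parser<Cell, unit> =
    attempt (pfloat .>> cellEnd delim) |>> Float

// Assumed stand-in for nonQuotedCellChar: anything but delimiter and line breaks.
let pStringCell (delim: string) : Parser<Cell, unit> =
    manyChars (noneOf (delim + "\r\n")) |>> String

let pCell (delim: string) : Parser<Cell, unit> =
    pFloatCell delim <|> pStringCell delim

// One row: cells separated by the delimiter.
let pRow (delim: string) : Parser<Cell list, unit> =
    sepBy (pCell delim) (skipString delim)

for data in [ "s10 Mar 2011 18:28:11 GMT\n"; "10 Mar 2011 18:28:11 GMT\n"; "1.5\tabc\n" ] do
    match run (pRow "\t") data with
    | Success (cells, _, _) -> printfn "%A" cells
    | Failure (msg, _, _) -> printfn "%s" msg
```

The reason for sepBy here is that pStringCell can succeed on an empty cell without consuming anything, and as far as I know FParsec's many raises an exception when its argument succeeds without consuming input. With sepBy, the separator always consumes the delimiter, so each repetition makes progress.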

Note

The .>> operator is actually quite simple. Think of it as a leap-frog operator: it runs the left-hand parser, then the right-hand parser, and returns the left-hand result while discarding the right-hand one.

Parser<'a, 'b> -> Parser<'c, 'b> -> Parser<'a, 'b>
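A small illustration of that signature, alongside the mirror-image >>. operator, which keeps the right-hand result instead. This is only a sketch, assuming an F# script with FParsec loaded from NuGet; the names numberThenTab and tabThenNumber are made up for the example.

```fsharp
#r "nuget: FParsec"
open FParsec

// Parse a float, then a tab; keep the float and discard the tab.
let numberThenTab : Parser<float, unit> = pfloat .>> pchar '\t'

// Parse a tab, then a float; discard the tab and keep the float.
let tabThenNumber : Parser<float, unit> = pchar '\t' >>. pfloat

match run numberThenTab "1.5\t" with
| Success (value, _, _) -> printfn "%A" value   // value is 1.5
| Failure (msg, _, _) -> printfn "%s" msg

match run tabThenNumber "\t2.5" with
| Success (value, _, _) -> printfn "%A" value   // value is 2.5
| Failure (msg, _, _) -> printfn "%s" msg
```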

About the author: 燕归巢
