Best way to parse this in Rebol

Posted on 2024-11-04 08:22:43

How do I extract the transaction receipt datetime with the least bit of noise in my parse rule from the following HTML? (The output I'm looking to get is this: "Transaction Receipt: 04/28/2011 17:03:09")

 <FONT COLOR=DARKBLUE>Transaction Receipt </FONT></TH></TR><TR></TR><TR></TR><TR><TD COLSPAN=4 ALIGN=CENTER><FONT SIZE=-1 COLOR=DARKBLUE>04/28/2011 17:03:09</FONT>

The following works but I don't get a good feeling! There is guaranteed to be a datetime following the words Transaction Receipt somewhere (although I wouldn't do a greedy match if I'm doing a grep)

parse d [
    thru {<FONT COLOR=DARKBLUE>Transaction Receipt </FONT></TH></TR><TR></TR><TR></TR><TR><TD COLSPAN=4 ALIGN=CENTER><FONT SIZE=-1 COLOR=DARKBLUE>}
    copy t to "</FONT>"
    ]

Comments (2)

你的呼吸 2024-11-11 08:22:43

This is shorter...

parse d [thru <FONT SIZE=-1 COLOR=DARKBLUE> copy t to </FONT>]

but isn't specifically looking for the datetime pair. And unfortunately REBOL considers the date used an invalid one...

>> 04/28/2011
** Syntax Error: Invalid date -- 04/28/2011
** Near: (line 1) 04/28/2011

so you can't search for it specifically. If the date was 28/04/2011 (and there was a space after the time, though why it's needed for load I'm not sure), the following would work...

parse load d [to date! copy t to </FONT>]

Hmmm. Try this...

t: copy ""  ; copy, so the literal string is not reused if this code runs again
parse d [
    some [
        to "<" thru ">" mark: copy text to "<" (if text [append t text]) :mark
    ]
]

That returns: "Transaction Receipt 04/28/2011 17:03:09"

It works by skipping all the tags, appending any text that's left to t.

Hope that helps!
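
A possible middle ground between the long rule in the question and the strip-every-tag loop: anchor on the label, skip whatever tags follow it, and copy the first run of text. This is only a sketch, assuming nothing but spaces and tags sits between "Transaction Receipt" and the timestamp; `stamp` is a name made up for the example.

```rebol
; Sample input copied from the question
d: {<FONT COLOR=DARKBLUE>Transaction Receipt </FONT></TH></TR><TR></TR><TR></TR><TR><TD COLSPAN=4 ALIGN=CENTER><FONT SIZE=-1 COLOR=DARKBLUE>04/28/2011 17:03:09</FONT>}

stamp: none
parse d [
    thru "Transaction Receipt"      ; anchor on the label only
    any [" " | "<" thru ">"]        ; skip spaces and tags, whatever they are
    copy stamp to "<"               ; first text after the tags
    to end
]

print rejoin ["Transaction Receipt: " stamp]
```

The rule no longer depends on the exact tags or attributes between the label and the value, so small markup changes should not break it. It still does not check that the copied text is actually a date and time.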

北城挽邺 2024-11-11 08:22:43

Timely as per usual: if the format is consistent, you can always try to explicitly match dates:

rule: use [dg tag date value][
    tag: use [chars][
        chars: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" " =-"]
        ["<" opt "/" some chars ">"]
    ]

    date: use [dg mo dy yr tm][
        dg: charset "0123456789"
        [
            copy mo [2 dg "/"] copy dy [2 dg "/"] copy yr 4 dg
            " " copy tm [2 dg ":" 2 dg ":" 2 dg]
            (value: load rejoin [dy mo yr "/" tm])
        ]
    ]

    [
        some [
              "Transaction Receipt" (probe "Transaction Receipt")
            | date (probe value)

            ; everything else
            | some " " | tag ; | skip ; will parse the whole doc...
        ]
    ]
]
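
The conversion step inside the `date` rule can be tried on its own. A minimal sketch, assuming the four pieces have already been captured as strings; unlike in the rule above, `mo` and `dy` here are hypothetical values without the trailing slash, so the slashes are added explicitly.

```rebol
; Hypothetical captured pieces, as strings
mo: "04"  dy: "28"  yr: "2011"  tm: "17:03:09"

; Reorder month/day into day/month so LOAD can read the text as a date! value
value: load rejoin [dy "/" mo "/" yr "/" tm]

probe value
probe type? value
```

Once `value` is a date! it can be compared or sorted like any other date, which the raw "04/28/2011 17:03:09" string does not allow.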