Regular expressions versus lexical analyzers in Haskell

Posted on 2024-09-06 08:24:24

I'm getting started with Haskell and I'm trying to use the Alex tool to create regular expressions, but I'm a little lost. My first difficulty is the compile step: how do I compile a file with Alex? After that, I think I have to import the modules that Alex generates into my code, but I'm not sure. If someone can help me, I would be very grateful!

Comments (3)

尘曦 2024-09-13 08:24:24

You can specify regular expressions in Alex and attach a Haskell function (an action) to each one.

Here, for example, is an Alex specification that matches floating point numbers:

$space       = [\ \t\xa0]
$digit       = 0-9
$octit       = 0-7
$hexit       = [$digit A-F a-f]

@sign        = [\-\+]
@decimal     = $digit+
@octal       = $octit+
@hexadecimal = $hexit+
@exponent    = [eE] [\-\+]? @decimal

@number      = @decimal
             | @decimal \. @decimal @exponent?
             | @decimal @exponent
             | 0[oO] @octal
             | 0[xX] @hexadecimal

lex :-

   @sign? @number { strtod }

When the lexer matches a floating point number, it dispatches to a parsing function (strtod here) that operates on the captured string. We can then wrap the scanner and expose it to the user as an ordinary parsing function:

readDouble :: ByteString -> Maybe (Double, ByteString)
readDouble str = case alexScan (AlexInput '\n' str) 0 of
    AlexEOF            -> Nothing
    AlexError _        -> Nothing
    AlexToken (AlexInput _ rest) n _ ->
       case strtod (B.unsafeTake n str) of d -> d `seq` Just $! (d , rest)

A nice consequence of using Alex for this regex matching is that the performance is good, as the regex engine is compiled statically. It can also be exposed as a regular Haskell library built with cabal. For the full implementation, see bytestring-lexing.
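For the compile step the question asks about, a smaller starting point than bytestring-lexing may help. The following is a minimal sketch of a complete Alex specification, assuming Alex's "basic" wrapper (which generates `alexScanTokens :: String -> [token]` and expects each action to have type `String -> token`). The file name `Lexer.x`, the `Token` type, and the `dropPlus` helper are hypothetical names chosen for illustration; they are not taken from bytestring-lexing.

```alex
{
module Main (main) where
}

%wrapper "basic"

$digit = 0-9

@decimal  = $digit+
@exponent = [eE] [\-\+]? @decimal
@float    = @decimal \. @decimal @exponent?
          | @decimal @exponent

tokens :-

  $white+            ;
  [\-\+]? @float     { \s -> TFloat (read (dropPlus s)) }
  [\-\+]? @decimal   { \s -> TInt (read (dropPlus s)) }

{
-- Hypothetical token type for this sketch.
data Token
  = TInt Integer
  | TFloat Double
  deriving (Eq, Show)

-- 'read' accepts a leading '-' but not a leading '+', so strip the '+'.
dropPlus :: String -> String
dropPlus ('+' : rest) = rest
dropPlus s            = s

-- Lex everything on standard input and print the resulting token list.
main :: IO ()
main = getContents >>= print . alexScanTokens
}
```

Assuming Alex and GHC are installed, running `alex Lexer.x` should generate `Lexer.hs`, which can then be compiled like any other Haskell source file, for example with `ghc Lexer.hs`. In a multi-module project you would instead give the generated module its own name in the header block and import it from the rest of your code.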

The general advice on when to use a lexer instead of a regex matcher would be that, if you have a grammar for the lexemes you're trying to match, as I did for floating point, use Alex. If you don't, and the structure is more ad hoc, use a regex engine.

临走之时 2024-09-13 08:24:24

Why do you want to use Alex to create regular expressions?
If all you want is to do some regex matching, you should look at the regex-base package.

ペ泪落弦音 2024-09-13 08:24:24

If it is plain regex you want, the API is specified in Text.Regex.Base. Then there are the implementations Text.Regex.Posix, Text.Regex.PCRE and several others. The Haddock documentation is a bit slim; however, the basics are described in Real World Haskell, chapter 8. Some more in-depth material is described in this SO question.
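As a rough illustration of the plain-regex route, here is a minimal sketch that assumes the regex-posix package is installed and uses its `=~` operator from Text.Regex.Posix. The pattern and the helper names `floatPattern`, `hasFloat`, and `firstFloat` are made up for this example; the result type requested from `=~` selects what kind of answer comes back.

```haskell
import Text.Regex.Posix ((=~))

-- POSIX extended regex: optional sign, digits, a dot, more digits.
-- (Hypothetical pattern for this sketch; it ignores exponents.)
floatPattern :: String
floatPattern = "[-+]?[0-9]+\\.[0-9]+"

-- Asking for a Bool reports whether the pattern matches anywhere.
hasFloat :: String -> Bool
hasFloat s = s =~ floatPattern

-- Asking for a String returns the first match, or "" if there is none.
firstFloat :: String -> String
firstFloat s = s =~ floatPattern

main :: IO ()
main = do
  print (hasFloat "pi is roughly 3.14")
  putStrLn (firstFloat "pi is roughly 3.14")
```

This suits ad hoc matching; once the lexemes have a real grammar, an Alex specification is usually the better fit.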
