OCamllex: matching the beginning of a line?

Posted on 2024-10-22 05:37:56

I am messing around writing a toy programming language in OCaml with ocamllex, and was trying to make the language sensitive to indentation changes, python-style, but am having a problem matching the beginning of a line with ocamllex's regex rules. I am used to using ^ to match the beginning of a line, but in OCaml that is the string concat operator. Google searches haven't been turning up much for me unfortunately :( Anyone know how this would work?

Comments (1)

妖妓 2024-10-29 05:37:56

I'm not sure if there is explicit support for zero-length matching symbols (like ^ in Perl-style regular expressions, which matches a position rather than a substring). However, you should be able to let your lexer turn newlines into an explicit token, something like this:

parser.mly

%token EOL
%token <int> EOLWS
/* other stuff here */
%%
main:
    EOL stmt                { MyStmtDataType(0, $2) }
  | EOLWS stmt              { MyStmtDataType($1 - 1, $2) }
 ;

lexer.mll

{
 open Parser
 exception Eof
}
rule token = parse
    [' ' '\t']           { token lexbuf }     (* skip other blanks *)
  (* the matched length includes the '\n' itself, hence the "- 1" in parser.mly *)
  | ['\n'][' ']+ as lxm  { EOLWS (String.length lxm) }
  | ['\n']               { EOL }
  (* ... *)

This is untested, but the general idea is:

  • Treat newlines as statement 'starters'
  • Measure whitespace that immediately follows the newline and pass its length as an int
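To make the two bullet points concrete without running ocamllex, here is a minimal plain-OCaml sketch of the same scanning logic. The `token` type, the `OTHER` constructor and the `tokenize` function are hypothetical stand-ins invented for illustration; only `EOL` and `EOLWS` correspond to the tokens in the listings above. Like the lexer rule, it counts the newline as part of the measured length.

```ocaml
(* Hypothetical token type mirroring EOL / EOLWS from parser.mly. *)
type token =
  | EOL
  | EOLWS of int    (* length of the match: the newline plus the spaces *)
  | OTHER of char   (* stand-in for every other token *)

(* Scan [s] by hand, imitating the two newline rules in lexer.mll. *)
let tokenize (s : string) : token list =
  let n = String.length s in
  (* Index of the first non-space character at or after [i]. *)
  let rec spaces_from i =
    if i < n && s.[i] = ' ' then spaces_from (i + 1) else i
  in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      match s.[i] with
      | '\n' ->
        let j = spaces_from (i + 1) in
        (* No spaces after the newline: EOL. Otherwise: EOLWS with the
           full match length, newline included. *)
        let tok = if j = i + 1 then EOL else EOLWS (j - i) in
        go j (tok :: acc)
      | ' ' | '\t' -> go (i + 1) acc   (* skip other blanks *)
      | c -> go (i + 1) (OTHER c :: acc)
  in
  go 0 []
```

For the input "\na\n  b" this produces an `EOL` before `a` and an `EOLWS 3` before `b`: one newline plus two spaces, which the parser action turns into an indentation of 2 by subtracting 1.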

Caveat: since every statement is expected to be preceded by a newline token, you will need to preprocess your input so that it starts with a \n if it doesn't already begin with one.
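One possible way to handle that caveat is to patch the source string before handing it to the lexer. This is only a sketch: `ensure_leading_newline` and `lexbuf_of_source` are hypothetical helper names, and it assumes the whole input is available as a string. `Lexing.from_string` is the standard-library function for building a lexbuf from a string. Incidentally, this is where `^` shows up in its OCaml role as string concatenation.

```ocaml
(* Hypothetical helper: prepend a newline unless the source already
   begins with one, so the first statement is preceded by EOL/EOLWS. *)
let ensure_leading_newline (src : string) : string =
  if String.length src > 0 && src.[0] = '\n' then src
  else "\n" ^ src

(* Hypothetical helper: build the lexbuf from the patched source. *)
let lexbuf_of_source (src : string) : Lexing.lexbuf =
  Lexing.from_string (ensure_leading_newline src)
```

The resulting lexbuf would then be passed to the generated entry point, for example `Parser.main Lexer.token lexbuf`, assuming the grammar declares `main` as its start symbol.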
