Splitting a string using a Prolog DCG
I'm trying to use a DCG to split a string into two parts separated by spaces. E.g. 'abc def' should give me back "abc" & "def". The program & DCG are below.
main :-
    prompt(_, ''),
    repeat,
    read_line_to_codes(current_input, Codes),
    (
        Codes = end_of_file
    ->
        true
    ;
        processData(Codes),
        fail
    ).

processData(Codes) :-
    (
        phrase(data(Part1, Part2), Codes)
    ->
        format('~s, ~s\n', [ Part1, Part2 ])
    ;
        format('Didn''t recognize data.\n')
    ).

data([ P1 | Part1 ], [ P2 | Part2 ]) --> [ P1 | Part1 ], spaces(_), [ P2 | Part2 ].

spaces([ S | S1 ]) --> [ S ], { code_type(S, space) }, (spaces(S1); "").
This works correctly. But I found that having to type [ P1 | Part1 ] & [ P2 | Part2 ] was really verbose. So, I tried replacing all instances of [ P1 | Part1 ] w/ Part1 & likewise w/ [ P2 | Part2 ] in the definition of data, i.e. the following.
data(Part1, Part2) --> Part1, spaces(_), Part2.
That's much easier to type, but that gave me an "Arguments are not sufficiently instantiated" error. So it looks like an unbound variable isn't automatically interpreted as a list of codes in a DCG. Is there any other way to make this less verbose? My intent is to use DCGs where I would use regular expressions in other programming languages.
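One way to see where the error comes from is to look at what the DCG translation does with a bare variable in a rule body. The sketch below is a hand-written approximation of the clause SWI-Prolog generates for the shorter rule; the variable names S0, S1, S2 and S are made up for illustration, and other Prolog systems may translate the rule differently. spaces//1 is restated with [] in place of "" so the block does not depend on the double_quotes flag.

```prolog
% Hand-expanded form of:  data(Part1, Part2) --> Part1, spaces(_), Part2.
% Approximation of SWI-Prolog's translation; variable names are illustrative.
data(Part1, Part2, S0, S) :-
    phrase(Part1, S0, S1),      % a variable in a rule body becomes a phrase/3 call
    spaces(_, S1, S2),
    phrase(Part2, S2, S).

% spaces//1 from the question, with [] instead of "" for the empty case.
spaces([S|S1]) --> [S], { code_type(S, space) }, ( spaces(S1) ; [] ).
```

Calling phrase(data(P1, P2), Codes) with P1 unbound reaches phrase(Part1, S0, S1) while Part1 is still a variable, which is the point at which the instantiation error is raised.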
Comments (1)
Your intuition is correct; the term-expansion procedure for DCGs (at least in SWI-Prolog, but it should apply to other systems) translates your modified version of data so that the Part1 and Part2 parts of your DCG rule are interpreted as calls to phrase/3 again, and not as lists; you need to explicitly specify that they are lists for them to be treated as such.

I can suggest an alternative version which is more general. Consider the following bunch of DCG rules:
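The rule set referred to here is missing from this copy, so what follows is a sketch reconstructed from the description in the next paragraph, not the original listing. Only data and spaces are named in the text; the helper names chars//1, space//1 and char//1 are assumptions.

```prolog
% data(-Atoms): split a code list into atoms separated by white space.
data([A|As]) -->
    spaces(_),                    % skip leading white space
    chars([X|Xs]),                % one or more non-space codes
    { atom_codes(A, [X|Xs]) },
    spaces(_),                    % skip trailing white space
    data(As).
data([]) --> [].

% chars(-Codes): longest run of non-space codes (the cut commits to it).
chars([X|Xs]) --> char(X), !, chars(Xs).
chars([]) --> [].

% spaces(-Codes): longest run of white-space codes.
spaces([X|Xs]) --> space(X), !, spaces(Xs).
spaces([]) --> [].

space(X) --> [X], { code_type(X, space) }.
char(X)  --> [X], { \+ code_type(X, space) }.
```

With these rules, a query such as atom_codes('  abc   def ', Cs), phrase(data(L), Cs) should bind L to [abc, def].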
Take a look at the first clause at the top; the data rule now attempts to match 0-to-many spaces (as many as possible, because of the cut), then one-to-many non-space characters to construct an atom (A) from the codes, then 0-to-many spaces again, then recurses to find more atoms in the string (As). What you end up with is a list of the atoms which appeared in the input string, without any spaces.

You can incorporate this version into your code by calling it from processData. This version breaks a string apart with any number of spaces between words, even if they appear at the start and end of the string.
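A possible way to wire the list-producing data rule into processData is sketched below. The grammar is restated compactly so the block loads on its own; it is a reconstruction from the description above, and the chars//1 helper, the Atoms and Line variable names, and the choice of atomic_list_concat/3 for joining the words are assumptions.

```prolog
% Grammar assumed by this sketch, reconstructed from the description above.
data([A|As]) --> spaces(_), chars([X|Xs]), { atom_codes(A, [X|Xs]) },
                 spaces(_), data(As).
data([])     --> [].
chars([X|Xs])  --> [X], { \+ code_type(X, space) }, !, chars(Xs).
chars([])      --> [].
spaces([X|Xs]) --> [X], { code_type(X, space) }, !, spaces(Xs).
spaces([])     --> [].

% processData(+Codes): print the words of one input line, comma-separated.
processData(Codes) :-
    (   phrase(data(Atoms), Codes)
    ->  atomic_list_concat(Atoms, ', ', Line),   % join the atoms with ', '
        writeln(Line)
    ;   format('Didn''t recognize data.\n')
    ).
```

The main predicate from the question can stay as it is, since it only passes each line of codes to processData.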