如何让Prolog的DCG不贪婪？

发布于 2024-10-17 06:59:03 字数 1177 浏览 9 评论 0原文

我想编写一个 DCG 谓词，它将接受字母标签、空格、可能包含空格或字母的伪标签、另一个空格和另一个字母标签，最后是一个句点，如下所示：

label_madness --> label(Table1), " ", label_with_spaces(Rel), " ", label(Table2), ".".

这是标签的代码：

label(A) --> letters(S), {string_to_atom(S, A)}, !.
label_with_spaces(A) --> letters_or_spaces(S), {string_to_atom(S, A)}, !.

letters([C|D]) --> letter(C), letters(D), !.
letters([C]) --> letter(C), !.

letters_or_spaces([C|D]) --> letter(C), letters_or_spaces(D), !.
letters_or_spaces([C|D]) --> spacehyphen(C), letters_or_spaces(D), !.
letters_or_spaces([C]) --> letter(C), !.
letters_or_spaces([C]) --> spacehyphen(C), !.

letter(C) --> [C], {"a"=<C, C=<"z"}, !.
letter(C) --> [C], {"A"=<C, C=<"Z"}, !.
spacehyphen(E) --> " ", {from_list("-", E)}, !. % spaces are replaced with hyphens in the pseudolabel
from_list([E], E).

现在当我向 label_madness 提供一个类似 “Alice is Responsible for Bob.” 的字符串，它失败了。由于神秘的原因，trace 拒绝工作，但我认为它失败了，因为 DCG 匹配了 Rel 的整个 负责 Bob。我尝试在标签之间使用非空格分隔符，效果很好。我应该如何重写 label_with_spaces 谓词以仅消耗所需的输入量？

原文

I want to write a DCG predicate that will accept an alphabetic label, a space, a pseudolabel that may contain spaces or letters, another space, and another alphabetic label, and finally a period, like this:

label_madness --> label(Table1), " ", label_with_spaces(Rel), " ", label(Table2), ".".

Here's the code for labels:

label(A) --> letters(S), {string_to_atom(S, A)}, !.
label_with_spaces(A) --> letters_or_spaces(S), {string_to_atom(S, A)}, !.

letters([C|D]) --> letter(C), letters(D), !.
letters([C]) --> letter(C), !.

letters_or_spaces([C|D]) --> letter(C), letters_or_spaces(D), !.
letters_or_spaces([C|D]) --> spacehyphen(C), letters_or_spaces(D), !.
letters_or_spaces([C]) --> letter(C), !.
letters_or_spaces([C]) --> spacehyphen(C), !.

letter(C) --> [C], {"a"=<C, C=<"z"}, !.
letter(C) --> [C], {"A"=<C, C=<"Z"}, !.
spacehyphen(E) --> " ", {from_list("-", E)}, !. % spaces are replaced with hyphens in the pseudolabel
from_list([E], E).

Now when I feed label_madness a string like "Alice is responsible for Bob.", it fails. For mysterious reasons trace refuses to work, but I assume it fails because DCG matches the whole is responsible for Bob for Rel. I tried with a nonspace separators between the labels and it works fine. How should I rewrite the label_with_spaces predicate to only consume as much input as required?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

遗心遗梦遗幸福 2024-10-24 06:59:03

您的解决方案中的问题是您在时间之前提交解析（使用剪切，！）
当您解析 letter_or_spaces 时，您实际上不知道要处理多少输入，因为您必须解析直到倒数第二个标签（在空格内）。

因此，您应该让 prolog 引擎在该谓词中回溯，以允许在字母或空格中选择正确的短语。
类似的东西（只是显示对代码的更改，即从某些谓词子句中删除剪切）：

label(A) --> letters(S), {string_to_atom(S, A)}.
label_with_spaces(A) --> letters_or_spaces(S), {string_to_atom(S, A)}.
letters_or_spaces([C|D]) --> letter(C), letters_or_spaces(D).
letters_or_spaces([C|D]) --> spacehyphen(C), letters_or_spaces(D).
letters_or_spaces([C]) --> letter(C).
letters_or_spaces([C]) --> spacehyphen(C).

您不妨稍微更改一下解析器，而不是使用回溯，只需解析直到字母或空格中的句点，然后将最后一个标签从它。

The problem in your solution is that you are commiting the parse before time (using the cut, !)
When you parse letters_or_spaces you really don't know how much input to process, because you have to parse until the second to last label (within spaces).

So, you should let the prolog engine backtrack in that predicate to allow the selection of the right phrase in letters_or_spaces.
Something like (just showing the changes to your code, that is removing the cut from some predicate clauses):

label(A) --> letters(S), {string_to_atom(S, A)}.
label_with_spaces(A) --> letters_or_spaces(S), {string_to_atom(S, A)}.
letters_or_spaces([C|D]) --> letter(C), letters_or_spaces(D).
letters_or_spaces([C|D]) --> spacehyphen(C), letters_or_spaces(D).
letters_or_spaces([C]) --> letter(C).
letters_or_spaces([C]) --> spacehyphen(C).

You might as well change your parser a bit and instead of using backtracking just parse until the period in letters_or_spaces and then split the last label from it.

回复收藏 0 原文

~没有更多了~