Building a "semi-natural language" DSL in Ruby

Posted on 2024-08-20 15:38:32

I'm interested in building a DSL in Ruby for parsing microblog updates. Specifically, I thought I could translate the text of an update into a string of Ruby code, in the same spirit as the "4.days.ago" syntax that Rails allows. I already have regex code that will translate the text

@USER_A: give X points to @USER_B for accomplishing some task
@USER_B: take Y points from @USER_A for not giving me enough points

into something like

Scorekeeper.new.give(x).to("USER_B").for("accomplishing some task").giver("USER_A")
Scorekeeper.new.take(x).from("USER_A").for("not giving me enough points").giver("USER_B")

It's acceptable to me to formalize the syntax of the updates so that only standardized text is provided and parsed, allowing me to smartly process updates. Thus, it seems it's more a question of how to implement the DSL class. I have the following stub class (removed all error checking and replaced some with comments to minimize paste):

class Scorekeeper

  attr_accessor :score, :user, :reason, :sender

  def give(num)
    # Can 'give 4' or can 'give a -5'; ensure 'to' called
    self.score = num
    self
  end

  def take(num)
    # ensure negative and 'from' called
    self.score = num < 0 ? num : num * -1
    self
  end

  def plus
    self.score > 0
  end

  def to(str)
    self.user = str
    self
  end

  def from(str)
    self.user = str
    self
  end

  def for(str)
    self.reason = str
    self
  end

  def giver(str)
    self.sender = str
    self
  end

  def command
    str = plus ? "giving @#{user} #{score} points" : "taking #{score * -1} points from @#{user}"
    "@#{sender} is #{str} for #{reason}"
  end

end

Running the following commands:

t = eval('Scorekeeper.new.take(4).from("USER_A").for("not giving me enough points").giver("USER_B")')
p t.command
p t.inspect

Yields the expected results:

"@USER_B is taking 4 points from @USER_A for not giving me enough points"
"#<Scorekeeper:0x100152010 @reason=\"not giving me enough points\", @user=\"USER_A\", @score=-4, @sender=\"USER_B\">"

So my question is mainly, am I doing anything to shoot myself in the foot by building upon this implementation? Does anyone have any examples for improvement in the DSL class itself or any warnings for me?

BTW, to get the eval string I'm mostly using sub/gsub and regexes. I figured that's the easiest way, but I could be wrong.
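The checks elided from the stub ("ensure 'to' called", "ensure negative") could be sketched roughly as below. This is an illustration under assumptions, not the code from the post: the class name CheckedScorekeeper, the @verb marker, and the validate! method are all hypothetical.

```ruby
# Hypothetical variant of the stub class. CheckedScorekeeper, @verb and
# validate! are illustrative names that do not appear in the original post.
class CheckedScorekeeper
  attr_reader :score, :user, :reason, :sender

  def give(num)
    @score = Integer(num) # raises on non-numeric input; negatives are allowed
    @verb = :give
    self
  end

  def take(num)
    @score = -(Integer(num).abs) # always stored as a negative score
    @verb = :take
    self
  end

  def to(str)
    raise ArgumentError, "'to' must follow 'give'" unless @verb == :give
    @user = str
    self
  end

  def from(str)
    raise ArgumentError, "'from' must follow 'take'" unless @verb == :take
    @user = str
    self
  end

  def for(str)
    @reason = str
    self
  end

  def giver(str)
    @sender = str
    self
  end

  # Call at the end of a chain to confirm the required parts were supplied.
  def validate!
    missing = %i[score user sender].select { |name| public_send(name).nil? }
    raise ArgumentError, "missing: #{missing.join(', ')}" unless missing.empty?
    self
  end
end
```

Ending a chain with validate! is one way to catch a half-built command; whether the reason should also be required is a design choice left open here.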

Comments (3)

时光无声 2024-08-27 15:38:32

Am I understanding you correctly: you want to take a string from a user and cause it to trigger some behavior?

Based on the two examples you listed, you can probably get by with regular expressions.

For instance, to parse this example:

@USER_A: give X points to @USER_B for accomplishing some task

With Ruby:

input = "@abe: give 2 points to @bob for writing clean code"
PATTERN = /^@(.+?): give ([0-9]+) points to @(.+?) for (.+?)$/
input =~ PATTERN
user_a = $~[1] # => "abe"
x      = $~[2] # => "2"
user_b = $~[3] # => "bob"
why    = $~[4] # => "writing clean code"

But if there is more complexity, at some point you might find it easier and more maintainable to use a real parser. If you want a parser that works well with Ruby, I recommend Treetop: http://treetop.rubyforge.org/

The idea of taking a string and converting it to code to be evaled makes me nervous. Using eval is a big risk and should be avoided if possible. There are other ways to accomplish your goal. I'll be happy to give some ideas if you want.
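One eval-free route, as a minimal sketch: match the update with named capture groups and build a plain object from the captures, so user text is never executed as code. The names Score, COMMAND, and parse_update are assumptions made for this example, and the pattern only covers the two formalized sentence shapes from the question.

```ruby
# Minimal stand-in for the question's Scorekeeper so the sketch runs on its own.
Score = Struct.new(:sender, :score, :user, :reason)

# Named groups replace the numbered $~[n] lookups. COMMAND and parse_update
# are hypothetical names introduced for this sketch.
COMMAND = /\A@(?<sender>\w+): (?<verb>give|take) (?<points>\d+) points (?:to|from) @(?<user>\w+) for (?<reason>.+)\z/

def parse_update(text)
  m = COMMAND.match(text)
  return nil unless m # not a recognised command; nothing gets executed

  points = m[:points].to_i
  points = -points if m[:verb] == "take"
  Score.new(m[:sender], points, m[:user], m[:reason])
end

update = parse_update("@abe: give 2 points to @bob for writing clean code")
puts update.inspect
```

Because the reason is only ever stored as a string, an update that contains Ruby code in its text ends up as data in the reason field rather than being run.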

A question about the DSL you suggest: are you going to use it natively in another part of your application? Or do you just plan on using it as part of the process of converting the string into the behavior you want? I'm not sure what is best without knowing more, but you may not need the DSL if you are just parsing the strings.

伊面 2024-08-27 15:38:32

This echoes some of my thoughts on a tangential project (an old-style text MOO).

I'm not convinced that a compiler-style parser is going to be the best way for the program to deal with the vagaries of English text. My current thoughts have me splitting up the understanding of English into separate objects -- so a box understands "open box" but not "press button", etc. -- and then having the objects use some sort of DSL to call centralised code that actually makes things happen.

I'm not sure that you've got to the point where you understand how the DSL is actually going to help you. Maybe you need to look at how the English text gets turned into the DSL first. I'm not saying that you don't need a DSL; you might very well be right.

As for hints as to how to do that? Well, I think if I were you I would be looking for specific verbs. Each verb would "know" what sort of thing it should expect from the text around it. So in your example "to" and "from" would expect a user immediately following.
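The verb-driven idea could be sketched like this, under the assumption that a small lookup table is enough: each verb records which preposition it expects, and the user is taken to be the word immediately after that preposition. VERBS and interpret are hypothetical names, and punctuation handling is deliberately left out.

```ruby
# Hypothetical table: each verb names the preposition it expects before the
# target user and the sign it applies to the points.
VERBS = {
  "give" => { preposition: "to",   sign:  1 },
  "take" => { preposition: "from", sign: -1 }
}.freeze

def interpret(text)
  words = text.split
  verb_index = words.index { |w| VERBS.key?(w.downcase) }
  return nil unless verb_index

  verb = words[verb_index].downcase
  rule = VERBS[verb]
  rest = words.drop(verb_index + 1)

  points = rest.find { |w| w.match?(/\A\d+\z/) }   # first all-digit word
  prep_index = rest.index(rule[:preposition])
  user = prep_index && rest[prep_index + 1]        # word right after to/from
  return nil unless points && user && user.start_with?("@")

  { verb: verb, points: rule[:sign] * points.to_i, user: user.delete_prefix("@") }
end

puts interpret("@abe: give 2 points to @bob for writing clean code").inspect
```

Adding a new verb then means adding one table entry rather than another regex branch; a "take ... to" sentence is rejected because "take" expects "from".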

This isn't especially divergent from the code you've posted here, IMO.

You might get some mileage out of looking at the answers to my question. One commenter pointed me to the Interpreter Pattern, which I found especially enlightening: there's a nice Ruby example here.

无声无音无过去 2024-08-27 15:38:32

Building on @David_James' answer, I've come up with a regex-only solution to this since I'm not actually using the DSL anywhere else to build scores and am merely parsing out points to users. I've got two patterns that I'll use to search:

SEARCH_STRING = "@Scorekeeper give a healthy 4 to the great @USER_A for doing something 
really cool.Then give the friendly @USER_B a healthy five points for working on this. 
Then take seven points from the jerk @USER_C."

PATTERN_A = /\b(give|take)[\s\w]*([+-]?[0-9]|one|two|three|four|five|six|seven|eight|nine|ten)[\s\w]*\b(to|from)[\s\w]*@([a-zA-Z0-9_]*)\b/i

PATTERN_B = /\bgive[\s\w]*@([a-zA-Z0-9_]*)\b[\s\w]*([+-]?[0-9]|one|two|three|four|five|six|seven|eight|nine|ten)/i

SEARCH_STRING.scan(PATTERN_A) # => [["give", "4", "to", "USER_A"],
                              #     ["take", "seven", "from", "USER_C"]]
SEARCH_STRING.scan(PATTERN_B) # => [["USER_B", "five"]]

The regex might be cleaned up a bit, but this allows me to have syntax that allows a few fun adjectives while still pulling the core information using both "name->points" and "points->name" syntaxes. It does not allow me to grab the reason, but that's so complex that for now I'm going to just store the entire update, since the whole update will be related to the context of each score anyway in all but outlier cases. Getting the "giver" username can be done elsewhere as well.
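To turn the scan results into usable scores, something like the following could work. It is a sketch under assumptions: NUMBER_WORDS, to_points, and scores are hypothetical helpers, the two patterns are copied unchanged from above, and "take" is assumed to always mean a negative score.

```ruby
PATTERN_A = /\b(give|take)[\s\w]*([+-]?[0-9]|one|two|three|four|five|six|seven|eight|nine|ten)[\s\w]*\b(to|from)[\s\w]*@([a-zA-Z0-9_]*)\b/i
PATTERN_B = /\bgive[\s\w]*@([a-zA-Z0-9_]*)\b[\s\w]*([+-]?[0-9]|one|two|three|four|five|six|seven|eight|nine|ten)/i

# Hypothetical lookup from number words to integers ("one" => 1 ... "ten" => 10).
NUMBER_WORDS = %w[one two three four five six seven eight nine ten]
               .each_with_index.map { |word, i| [word, i + 1] }.to_h.freeze

# Converts a captured token ("4", "-3", "seven") to an Integer.
def to_points(token)
  NUMBER_WORDS.fetch(token.downcase) { Integer(token) }
end

# Returns an array of { user:, points: } hashes from both syntaxes.
def scores(text)
  points_then_name = text.scan(PATTERN_A).map do |verb, points, _prep, user|
    value = to_points(points)
    value = -(value.abs) if verb.downcase == "take"
    { user: user, points: value }
  end
  name_then_points = text.scan(PATTERN_B).map do |user, points|
    { user: user, points: to_points(points) }
  end
  points_then_name + name_then_points
end

text = "@Scorekeeper give a healthy 4 to the great @USER_A for doing something " \
       "really cool. Then give the friendly @USER_B a healthy five points for working on this. " \
       "Then take seven points from the jerk @USER_C."
scores(text).each { |s| puts "#{s[:user]}: #{s[:points]}" }
```

One caveat worth knowing: `[+-]?[0-9]` matches a single digit, so a multi-digit amount such as 10 is not captured whole by either pattern as written.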

I've written up a description of these expressions as well, in hopes that other people might find that useful (and so that I can go back to it and remember what that long string of gobbledygook means :)
