Parser in Ruby: #slice! inside #each_with_index = missing elements
Let's say I want to separate certain combinations of elements from an array. For example:

    data = %w{ start before rgb 255 255 255 between hex FFFFFF after end }
    rgb, hex = [], []
    data.each_with_index do |v,i|
      p [i,v]
      case v.downcase
      when 'rgb' then rgb = data.slice! i,4
      when 'hex' then hex = data.slice! i,2
      end
    end
    pp [rgb, hex, data]
    # >> [0, "start"]
    # >> [1, "before"]
    # >> [2, "rgb"]
    # >> [3, "hex"]
    # >> [4, "end"]
    # >> [["rgb", "255", "255", "255"],
    # >>  ["hex", "FFFFFF"],
    # >>  ["start", "before", "between", "after", "end"]]
The code does the extraction correctly, but it skips the element just after each extracted set. So if my data array is

    data = %w{ start before rgb 255 255 255 hex FFFFFF after end }

then

    pp [rgb, hex, data]
    # >> [["rgb", "255", "255", "255"],
    # >>  [],
    # >>  ["start", "before", "hex", "FFFFFF", "after", "end"]]

Why does this happen? How can I get at those skipped elements inside #each_with_index? Or is there perhaps a better solution for this problem, assuming there are many more sets to extract?
Comments (3)
The problem is that you are mutating the collection while you are iterating over it. This cannot possibly work. (And in my opinion, it shouldn't. Ruby should raise an exception in this case, instead of silently allowing incorrect behavior. That's what pretty much all other imperative languages do.)
This is the best I could come up with while still keeping your original style:
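A minimal sketch of one way to stay close to the question's style, assuming an explicit index is acceptable in place of #each_with_index. The helper name `extract_in_place` and its return shape are illustrative and are not taken from the answer. The index only advances when nothing was removed, because #slice! shifts the following element into the current position.

```ruby
# Hypothetical helper: removes the rgb/hex sets from +data+ in place.
def extract_in_place(data)
  rgb, hex = [], []
  i = 0
  while i < data.size
    case data[i].downcase
    when 'rgb' then rgb = data.slice!(i, 4) # the next element now sits at index i
    when 'hex' then hex = data.slice!(i, 2)
    else i += 1                             # advance only when nothing was removed
    end
  end
  [rgb, hex, data]
end

data = %w{ start before rgb 255 255 255 hex FFFFFF after end }
rgb, hex, rest = extract_in_place(data)
p rgb  # => ["rgb", "255", "255", "255"]
p hex  # => ["hex", "FFFFFF"]
p rest # => ["start", "before", "after", "end"]
```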
However, what you have is a parsing problem and that should really be solved by a parser. A simple hand-rolled parser/state machine will probably be a little bit more code than the above, but it will be so much more readable.
Here's a simple recursive-descent parser that solves your problem:
I really like recursive-descent parsers because their structure almost perfectly matches the grammar: just keep parsing elements until the input is empty. What is an element? Well, it's a color specification or a stop word. What is a color specification? Well, it's either an RGB color specification or a hex color specification. What is an RGB color specification? Well, it's something that matches the Regexp /rgb/i followed by RGB values. What are RGB values? Well, it's just three numbers … Use it like so:
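The structure described above could be sketched roughly as follows. The class name `ColorParser`, its method names, and the anchored regexps are assumptions made for illustration, not the answer's original listing. The sketch does not validate that the values after the keyword are numbers, and, like the question's code, it keeps only the last set found for each keyword.

```ruby
# Illustrative recursive-descent parser: one method per grammar rule.
class ColorParser
  attr_reader :rgb, :hex, :words

  def initialize(tokens)
    @tokens = tokens.dup # work on a copy so the caller's array is untouched
    @rgb, @hex, @words = [], [], []
  end

  # input: keep parsing elements until nothing is left
  def parse
    parse_element until @tokens.empty?
    self
  end

  private

  # element: a color specification or a plain word
  def parse_element
    parse_color || parse_word
  end

  # color: an RGB specification or a hex specification
  def parse_color
    parse_rgb || parse_hex
  end

  # rgb: the keyword followed by three values (values are not validated)
  def parse_rgb
    return unless @tokens.first.match?(/\Argb\z/i)
    @rgb = @tokens.shift(4)
  end

  # hex: the keyword followed by one value
  def parse_hex
    return unless @tokens.first.match?(/\Ahex\z/i)
    @hex = @tokens.shift(2)
  end

  # word: anything else is kept as-is
  def parse_word
    @words << @tokens.shift
  end
end

data = %w{ start before rgb 255 255 255 hex FFFFFF after end }
parser = ColorParser.new(data).parse
p parser.rgb   # => ["rgb", "255", "255", "255"]
p parser.hex   # => ["hex", "FFFFFF"]
p parser.words # => ["start", "before", "after", "end"]
```

Each method consumes at least one token when it succeeds, so the loop in `parse` always terminates.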
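For comparison, here's the grammar:

    input     → element*
    element   → color | word
    color     → rgb | hex
    rgb       → /rgb/i rgbvalues
    rgbvalues → number number number
    hex       → /hex/i hexvalue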
Because you are manipulating `data` in place. When you hit `rgb`, the next element in the loop would be `255`, but you are deleting those elements, so now `between` is in the place where `rgb` was, and the next element is `hex`.
Something like this may work better for you:
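One way to avoid the problem, sketched under the assumption that building new arrays is acceptable: read from `data` without modifying it and use a counter to skip the elements that belong to a set already extracted. The helper name `extract_without_mutation` and the `skip` counter are illustrative, not the answer's original code.

```ruby
# Hypothetical helper: leaves +data+ untouched and returns [rgb, hex, rest].
def extract_without_mutation(data)
  rgb, hex, rest = [], [], []
  skip = 0 # number of upcoming elements that belong to an extracted set
  data.each_with_index do |v, i|
    if skip > 0
      skip -= 1
      next
    end
    case v.downcase
    when 'rgb'
      rgb = data[i, 4] # non-destructive slice
      skip = 3
    when 'hex'
      hex = data[i, 2]
      skip = 1
    else
      rest << v
    end
  end
  [rgb, hex, rest]
end

data = %w{ start before rgb 255 255 255 hex FFFFFF after end }
rgb, hex, rest = extract_without_mutation(data)
p rgb  # => ["rgb", "255", "255", "255"]
p hex  # => ["hex", "FFFFFF"]
p rest # => ["start", "before", "after", "end"]
```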
Here is a slightly nicer solution:
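One possible shape for such a solution, assuming the goal is to support many more sets as the question asks: drive the extraction from a table that maps each keyword to the size of its set. The constant `SET_SIZES` and the method `extract_sets` are hypothetical names introduced for this sketch and are not the answer's original code.

```ruby
# Hypothetical table: keyword => number of elements in the set, keyword included.
SET_SIZES = { 'rgb' => 4, 'hex' => 2 }.freeze

# Returns [found, rest]: found maps each keyword to its extracted set,
# rest holds every element that is not part of a set.
def extract_sets(data, sizes = SET_SIZES)
  queue = data.dup # consume a copy, never the caller's array
  found = {}
  rest = []
  until queue.empty?
    key = queue.first.downcase
    size = sizes[key]
    if size
      found[key] = queue.shift(size)
    else
      rest << queue.shift
    end
  end
  [found, rest]
end

data = %w{ start before rgb 255 255 255 hex FFFFFF after end }
found, rest = extract_sets(data)
p found['rgb'] # => ["rgb", "255", "255", "255"]
p found['hex'] # => ["hex", "FFFFFF"]
p rest         # => ["start", "before", "after", "end"]
```

Adding another set then only requires a new entry in the table.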