CSS/HSS parser with Treetop and nested stylesheet rules
I'm new to Treetop and am attempting to write a CSS/HSS parser. HSS augments the basic functionality of CSS with nested styles, variables, and a kind of mixin functionality.
I'm pretty close: the parser can handle plain CSS, but I fall down when it comes to implementing a style within a style. For example:
#rule #one {
  #two {
    color: red;
  }
  color: blue;
}
I've taken two shots at it, one that handles whitespace and one that doesn't. I can't quite get either to work. The Treetop documentation is a little sparse and I really feel like I'm missing something fundamental. Hopefully someone can set me straight.
A:
grammar Stylesheet
  rule stylesheet
    space* style*
  end

  rule style
    selectors space* '{' space* properties? space* '}' space*
  end

  rule properties
    property space* (';' space* property)* ';'?
  end

  rule property
    property_name space* [:] space* property_value
  end

  rule property_name
    [^:;}]+
  end

  rule property_value
    [^:;}]+
  end

  rule space
    [\t ]
  end

  rule selectors
    selector space* ([,] space* selector)*
  end

  rule selector
    element (space+ ![{] element)*
  end

  rule element
    class / id
  end

  rule id
    [#] [a-zA-Z-]+
  end

  rule class
    [.] [a-zA-Z-]+
  end
end
B:
grammar Stylesheet
  rule stylesheet
    style*
  end

  rule style
    selectors closure
  end

  rule closure
    '{' ( style / property )* '}'
  end

  rule property
    property_name ':' property_value ';'
  end

  rule property_name
    [^:}]+
    <PropertyNode>
  end

  rule property_value
    [^;]+
    <PropertyNode>
  end

  rule selectors
    selector ( !closure ',' selector )*
    <SelectorNode>
  end

  rule selector
    element ( space+ !closure element )*
    <SelectorNode>
  end

  rule element
    class / id
  end

  rule id
    ('#' [a-zA-Z]+)
  end

  rule class
    ('.' [a-zA-Z]+)
  end

  rule space
    [\t ]
  end
end
Harness Code:
require 'rubygems'
require 'treetop'

class PropertyNode < Treetop::Runtime::SyntaxNode
  def value
    "property:(#{text_value})"
  end
end

class SelectorNode < Treetop::Runtime::SyntaxNode
  def value
    "--> #{text_value}"
  end
end

Treetop.load('css')

parser = StylesheetParser.new
parser.consume_all_input = false

string = <<EOS
#hello-there .my-friend {
font-family:Verdana;
font-size:12px;
}
.my-friend, #is-cool {
font: 12px Verdana;
#he .likes-jam, #very-much {asaads:there;}
hello: there;
}
EOS

root_node = parser.parse(string)

def print_node(node, output = [])
  output << node.value if node.respond_to?(:value)
  node.elements.each {|element| print_node(element, output)} if node.elements
  output
end

puts print_node(root_node).join("\n") if root_node

#puts parser.methods.sort.join(',')
puts parser.input
puts string[0...parser.failure_index] + '<--'
puts parser.failure_reason
puts parser.terminal_failures
Comments (2)
I assume you're running into left recursion problems? If so, keep in mind that Treetop produces recursive descent parsers, and as such, you can't really use left recursion in your grammar. (That's one of the main reasons I still prefer ocamlyacc/ocamllex over Treetop despite its very sexy appearance.) This means you need to convert from left-recursive forms to right recursion. Since you undoubtedly own the Dragon Book (right?), I'll direct you to sections 4.3.3, 4.3.4, and 4.4.1, which cover the issue. As is typical, it's hard to understand, but parsers didn't get their reputation for nothing. There's also a nice left recursion elimination tutorial that the ANTLR guys put up on the subject. It's somewhat ANTLR/ANTLRworks specific, but it's slightly easier to understand than what's found in the Dragon Book. This is one of those things that just doesn't ever make a whole lot of sense to anyone who hasn't done it at least a few times before.
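If left recursion does turn out to be the culprit, the rewrite looks roughly like this in miniature. This is a hand-written recursive descent sketch in plain Ruby, not Treetop, and the method names `parse_element` and `parse_selector` are made up for illustration. A rule written as `selector := selector SPACE element | element` calls itself before consuming any input, so it never terminates; the equivalent `selector := element (SPACE element)*` consumes an element first and then loops.

```ruby
require 'strscan'

# Left-recursive form (do not implement this literally in recursive descent):
#   selector := selector SPACE element | element
# A parse_selector that begins by calling parse_selector recurses forever,
# because no input is consumed before the recursive call.
#
# Equivalent form without left recursion:
#   selector := element (SPACE element)*

# Hypothetical helper: matches one "#id" or ".class" token, or returns nil.
def parse_element(scanner)
  scanner.scan(/[#.][a-zA-Z-]+/)
end

# Parses a descendant selector such as "#rule #one" into its elements.
# Returns nil if the input does not start with an element.
def parse_selector(scanner)
  first = parse_element(scanner)
  return nil unless first

  elements = [first]
  loop do
    start = scanner.pos
    # Try "SPACE element"; on failure, rewind so the space stays unconsumed.
    if scanner.scan(/[ \t]+/) && (element = parse_element(scanner))
      elements << element
    else
      scanner.pos = start
      break
    end
  end
  elements
end

scanner = StringScanner.new('#rule #one {')
p parse_selector(scanner) # the elements that were matched
p scanner.rest            # the input left for the caller
```

The rewind on failure plays the same role as backtracking in a PEG: a trailing space that is not followed by another element is left in place for whatever rule comes next.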
Also, a minor comment: if you're going to use Treetop, I recommend doing this instead:
You're not likely to ever need to match a single whitespace character, plus almost every grammar rule is going to need it, so it makes sense to name it something very short. Incidentally, there are advantages to a separate lexing step. This is one of them.
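To make the point about a separate lexing step concrete: once a tokenizer has thrown the whitespace away, the rules for nested styles never have to mention it. The following is a plain-Ruby sketch rather than Treetop, and `tokenize`, `parse_block`, and the hash layout are assumptions made for illustration. It treats everything up to `{` as selector text, requires a `;` after every property, and would not cope with selectors that contain `:`.

```ruby
require 'strscan'

# Lexing step: split the input into punctuation and text chunks,
# discarding all whitespace between tokens (newlines included).
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)

    if (punct = scanner.scan(/[{};:]/))
      tokens << punct
    else
      tokens << scanner.scan(/[^{};:]+/).strip
    end
  end
  tokens
end

# Parsing step, with no whitespace handling at all:
#   block    := (style | property)*
#   style    := TEXT '{' block '}'
#   property := TEXT ':' TEXT ';'
def parse_block(tokens)
  styles = []
  properties = {}
  until tokens.empty? || tokens.first == '}'
    name = tokens.shift
    case tokens.shift
    when '{'
      nested = parse_block(tokens)
      raise ArgumentError, "missing '}' after #{name}" unless tokens.shift == '}'
      styles << nested.merge(selector: name)
    when ':'
      properties[name] = tokens.shift
      raise ArgumentError, "missing ';' after #{name}" unless tokens.shift == ';'
    else
      raise ArgumentError, "unexpected token after #{name}"
    end
  end
  { properties: properties, styles: styles }
end

source = <<~'HSS'
  #rule #one {
    #two {
      color: red;
    }
    color: blue;
  }
HSS

tree = parse_block(tokenize(source))
outer = tree[:styles].first
puts outer[:selector]
puts outer[:properties]['color']
puts outer[:styles].first[:selector]
```

Inside a block, the parser only has to look at the token after a name: `{` means a nested style and `:` means a property, which is the same ordered choice that `( style / property )*` expresses in grammar B.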
Looks like someone beat me to it:
http://lesscss.org/
Although I notice that they use regular expressions and an eval() to parse the input file rather than a parser.
Edit: Now they use Treetop! It's like someone did all the hard work for me.