从 BNF 语法导出状态机
我正在尝试整理 XSS 安全字符串插值方案的概念证明。
给定一个带有替换的字符串,
"Hello <b>$planetoid</b>!"
我想将其分解为文字部分和替换("Hello"planetoid "!")
,然后从左到右运行一个状态机文字部分。当我达到插值(上面的planetoid)时,我需要能够从状态到达适当的转义函数。
有谁知道如何使用 lex/yacc/bison 派生状态机并能够将语法中的标签与输出状态相关联的任何示例?我想派生一个可以在 javascript 中使用的状态机,并尝试替换 PHP 的底层字符串实现。
我这样做的原因如下: 此处。
干杯, 麦克风
I am trying to put together a proof of concept of an XSS-safe string interpolation scheme.
Given a string with substitutions,
"Hello <b>$planetoid</b>!"
I want break it into literal portions and substitutions ("Hello<b>" planetoid "</b>!")
and then run a state machine left to right over the literal portions. When I reach an interpolated value (planetoid
in the above), I need to be able to get from the state to an appropriate escaping function.
Does anyone know of any examples of how to use lex/yacc/bison to derive a state machine and be able to associate labels in the grammar with output states? I want to derive a state machine that I can use both in javascript, and to try and replace PHP's underlying string implementation.
My reasons for doing this are described here.
cheers,
mike
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
一般来说,不可能为可以用 BNF 表示的语法创建状态机。状态机只能识别常规语言,而 BNF 可以指定上下文无关语言。 Yacc 可以创建解析器。这样就足够了吗?
In general, it is not possible to create a state machine for grammars that can be represented in BNF. State machines can only recognize regular languages and BNF can specify context-free languages. Yacc can create parsers. Would that be sufficient?
看起来我可以在语法中放置标记,所以如果我使用两种不同的生产类型,一种没有副作用,并且消耗字符,另一种不消耗字符,但更新状态变量,
编译的输出将与状态名称相关联switch 语句
可能有一种方法可以在不解析 C 的情况下提取表,但我对 yacc/bison 非常无知。
It looks like I can put markers in the grammar, so if I use two different production types, one with no side effect, and that consumes characters, and one that consumes no characters, but updates a state variable
the compiled output associates state names in a switch statement
There may be a way to extract the tables without parsing C, but I'm so ignorant of yacc/bison.
您可以使用 yacc/bison 来实现此目的。最初查看 bison 时,很难判断在哪里可以实现状态机。野牛中的规则是从左到右解决的。 IE;如果您有一个派生出rule1、rule2、rule3 的规则(称为rule0),则按以下顺序调用操作:rule1、rule2、rule3、rule0。您可以使用全局状态机将其与规则的动态返回值结合起来(我使用具有不同类型的联合,如字符串、int 甚至返回值的容器)。
You can use yacc/bison for this. Initially looking at bison its hard to tell where you can implement a state machine. Rules in bison are resolved left-right-up. Ie; If you have a rule (called rule0) which derives rule1 rule2 rule3, the actions are called in this order: rule1,rule2,rule3,rule0. You can combine that using a global state machine, with dynamic return values for rules (I use a union with different types like a string, an int or even a container for return values).