Lex: How do I prevent it from matching against substrings?
For example, I'm supposed to convert "int" to "INT". But if the input contains the word "integer", I don't think it should turn into "INTeger".
If I define the rule
"int" printf("INT");
substrings are matched as well. Is there a way to prevent this from happening?
Comments (3)
Well, here's how I did it:
Better suggestions welcome.
Lex will choose the rule with the longest possible match for the current input. To avoid substring matches, you need to include an additional rule that can match more text than "int" does. The easiest way to do this is to add a simple rule that picks up any run of one or more letters, i.e. [a-zA-Z]+. When the input is exactly "int", both rules match three characters and lex breaks the tie in favor of the rule listed first, so the "int" rule has to come before the catch-all. The entire lex program would look like this:
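The program that originally accompanied this answer is not preserved here. What follows is only a minimal sketch of the two-rule approach described above, not the answerer's own code. It assumes flex rather than classic lex (the `%option noyywrap` line is flex-specific), and the file name `scan.l` in the build commands is hypothetical.

```lex
%{
#include <stdio.h>
%}

%option noyywrap

%%
"int"        { printf("INT"); /* the whole word int; listed first so it wins the length tie */ }
[a-zA-Z]+    { ECHO; /* any other run of letters, e.g. integer or print, is copied as-is */ }
.|\n         { ECHO; /* everything else passes through unchanged */ }
%%

int main(void)
{
    yylex();   /* read stdin until EOF, writing the converted text to stdout */
    return 0;
}
```

Built with `flex scan.l && cc lex.yy.c -o scan`, the input line `int integer print int` should come out as `INT integer print INT`. On "integer" the letters rule matches seven characters against three for the "int" rule, so the longer match wins. Note that [a-zA-Z]+ stops at digits and underscores, so something like "int3" would still become "INT3". If that matters, widening the catch-all to an identifier pattern such as [a-zA-Z_][a-zA-Z0-9_]* is one option.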
I believe the following captures what you want.
To expand this beyond word boundaries ({ws}, in this case), you will need to either add modifiers to ws or add more specific checks.