Using flex to identify variable names that are not a single repeated character

Posted 2025-01-16 21:23:13

I'm not fully sure how to word my question, so sorry for the rough title.

I am trying to create a pattern that can identify variable names with the following constraints:

  • Must begin with a letter
  • First letter may be followed by any combination of letters, numbers, and hyphens
  • First letter may be followed by nothing
  • The variable name must not consist entirely of X's ([xX]+ is a separate identifier in this grammar)

So for example, these would all be valid:

  • Avariable123
  • Bee-keeper
  • Y
  • E-3

But the following would not be valid:

  • XXXX
  • X
  • 3variable
  • 5

I am able to meet the first three requirements with my current identifier, but I am really struggling to change it so that it doesn't pick up variables that are entirely the letter X.

Here is what I have so far: [a-z][a-z0-9\-]* {return (NAME);}

Can anyone suggest a way of editing this to avoid variables that are made up of just the letter X?

Comments (1)

浅浅 2025-01-23 21:23:13

The easiest way to handle that sort of requirement is to use two patterns: one that matches the exceptional string, and a more general one, placed after it in the file, that matches every identifier:

[xX]+                    { /* matches all-x tokens */ }
[[:alpha:]][[:alnum:]-]* { /* handle identifiers */ }

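This works because lex (and almost all lex derivatives) selects the earliest rule in the file when two patterns match the same longest token. A name such as Xy still reaches the second rule, because that rule matches more text than [xX]+ does.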
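
As a minimal sketch of that rule ordering, here is a complete scanner that prints each token it finds. The token codes ALL_X and NAME, the whitespace rule, and the catch-all rule are assumptions added for illustration; in a real project the codes would come from the parser.

```lex
%option noyywrap nounput noinput

%{
#include <stdio.h>
/* Hypothetical token codes, for illustration only. */
enum { ALL_X = 258, NAME = 259 };
%}

%%
[xX]+                      { /* listed first, so it wins ties */ return ALL_X; }
[[:alpha:]][[:alnum:]-]*   { /* every other identifier */ return NAME; }
[ \t\r\n]+                 { /* skip whitespace */ }
.                          { fprintf(stderr, "unexpected character: %s\n", yytext); }
%%

int main(void)
{
    int tok;
    /* yylex() returns 0 at end of input. */
    while ((tok = yylex()) != 0)
        printf("%s %s\n", tok == ALL_X ? "ALL_X" : "NAME", yytext);
    return 0;
}
```

Given XXXX, X, Xy and Y on standard input, this reports the first two as ALL_X and the last two as NAME. Note that 3variable is not rejected as a whole: the catch-all rule reports the 3 and the rest is then scanned as the NAME variable, so a grammar with numbers would want its own rule for them.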

Of course, you need to know what you want to do with the exceptional symbol. If you just want to accept it as some token type, there's no problem; you just do that. If, on the other hand, the intention was to break it into subtokens, perhaps individual letters, then you'll have to use yyless(), and you might want to switch to a new lexing state in order to avoid repeatedly matching the same long sequence of Xs. But maybe that doesn't matter in your case.
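
If the run does need to be split into individual letters, one possible shape is sketched below. The start condition XRUN and the token code X_LETTER are hypothetical names, and this is only one way to arrange it. The first rule recognises the whole run once, hands it back with yyless(0), and switches state so that each X is then consumed exactly once instead of rescanning the remainder of the run for every letter.

```lex
%option noyywrap nounput noinput
%x XRUN

%{
#include <stdio.h>
/* Hypothetical token codes, for illustration only. */
enum { X_LETTER = 258, NAME = 259 };
%}

%%
[xX]+                      { /* push the whole run back and rescan it in XRUN */
                             yyless(0); BEGIN(XRUN); }
<XRUN>[xX]                 { return X_LETTER; }
<XRUN>[^xX]                { /* run is over: push the character back, resume normal rules */
                             yyless(0); BEGIN(INITIAL); }
[[:alpha:]][[:alnum:]-]*   { return NAME; }
[ \t\r\n]+                 { /* skip whitespace */ }
.                          { fprintf(stderr, "unexpected character: %s\n", yytext); }
%%

int main(void)
{
    int tok;
    while ((tok = yylex()) != 0)
        printf("%s %s\n", tok == X_LETTER ? "X_LETTER" : "NAME", yytext);
    return 0;
}
```

The yyless(0) calls are safe here only because each one is paired with a BEGIN that changes which rules apply next; without the state change the scanner would match the same text forever.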

See the flex manual for more details and examples.
