Regular expression algorithm for integer lists
Usually, regular expressions work on ASCII text, e.g. "abbbd".match("ab*d").
I wonder whether there are algorithms or tools that let a user match regular expressions
against lists of integers.
For example:
int[] a = {1,2,2,2,2,5};
a.match("12*5");
Thank you so much.
Comments (4)
I've done something like that before, though I basically had to write my own engine for it. There's nothing magic about ASCII (or Unicode or any other character set), and when they teach regular expressions in school they usually use a tiny set of arbitrary symbols (like Σ = {a, b}) to keep things simple. The algorithms still work the same.
Most of the features of Perl-style regex engines are specific to characters. Some features, like ^ and $, still work fine. Some, like [:alnum:], make no sense at all. And others, like [3-5], can be adapted to work with non-character strings.
One tricky bit (already noted by polygenelubricants and others) is that Perl regexes work well because the thing you're using to describe the language and the thing you're matching are both character strings; the syntax doesn't work nearly as well for non-character-string alphabets. So /[3-5]/ in characters might need to be [3,4,5] (a list of integers), and so you need to build the language from expressions rather than strings (unless you want to write your own parser!).
Why aren't most regex libraries generic on alphabet? Beats me. It's a tremendously useful tool, and it seems a terrible waste to apply it only to character strings. LINQ is nice, but I'm not sure how it would help here.
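To make the "build the language from expressions, rather than strings" idea concrete, here is a minimal sketch in Java. It is not the engine the answer describes. Every name in it (IntPattern, IntRegex, lit, range, seq, alt, star) is hypothetical and invented for illustration. Each pattern maps a start index to the set of indices where a match could end, which is a simple, unoptimized way to get regex semantics over an int[].

```java
import java.util.ArrayDeque;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical mini-engine: all names here are made up for illustration.
interface IntPattern {
    // Given a start index, return every index at which a match could end.
    Set<Integer> ends(int[] input, int start);
}

final class IntRegex {
    private IntRegex() {}

    // One element in the inclusive range [lo, hi]; the analogue of [3-5] for characters.
    static IntPattern range(int lo, int hi) {
        return (in, i) -> {
            Set<Integer> out = new TreeSet<>();
            if (i < in.length && in[i] >= lo && in[i] <= hi) out.add(i + 1);
            return out;
        };
    }

    // Exactly one element equal to v.
    static IntPattern lit(int v) {
        return range(v, v);
    }

    // Each part must match in order, one after another.
    static IntPattern seq(IntPattern... parts) {
        return (in, i) -> {
            Set<Integer> cur = new TreeSet<>();
            cur.add(i);
            for (IntPattern p : parts) {
                Set<Integer> next = new TreeSet<>();
                for (int pos : cur) next.addAll(p.ends(in, pos));
                cur = next;
            }
            return cur;
        };
    }

    // Either a or b; the analogue of a|b.
    static IntPattern alt(IntPattern a, IntPattern b) {
        return (in, i) -> {
            Set<Integer> out = new TreeSet<>(a.ends(in, i));
            out.addAll(b.ends(in, i));
            return out;
        };
    }

    // Zero or more repetitions of p; the analogue of p*.
    static IntPattern star(IntPattern p) {
        return (in, i) -> {
            Set<Integer> seen = new TreeSet<>();
            ArrayDeque<Integer> todo = new ArrayDeque<>();
            seen.add(i);
            todo.add(i);
            while (!todo.isEmpty()) {
                int pos = todo.poll();
                for (int e : p.ends(in, pos)) {
                    if (seen.add(e)) todo.add(e); // visit each position once
                }
            }
            return seen;
        };
    }

    // Whole-input match, like ^...$ on strings.
    static boolean matches(IntPattern p, int[] input) {
        return p.ends(input, 0).contains(input.length);
    }

    public static void main(String[] args) {
        // The question's "12*5", read as: 1, then zero or more 2s, then 5.
        IntPattern p = seq(lit(1), star(lit(2)), lit(5));
        System.out.println(matches(p, new int[] {1, 2, 2, 2, 2, 5})); // true
        System.out.println(matches(p, new int[] {1, 2, 3, 5}));       // false
    }
}
```

Because the pattern is assembled from method calls instead of being parsed from a string, there is no question of where one integer ends and the next begins, and range(3, 5) plays the role of [3-5].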
I doubt it, mostly because it is so ambiguous. Just looking at the example that you provided, do you mean to match this:
or this:
Sure, you could possibly improve the syntax slightly to fix this, but you would likely end up with a very messy syntax.
It would be just too complicated, and I'm sure that there are much better ways of doing it (list comprehensions, LINQ, etc).
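One concrete form this ambiguity can take, assumed here for illustration and not taken from the answer itself: once elements may have more than one digit, a character-style pattern such as "12*5" no longer says where one integer ends and the next begins. The sketch below (AmbiguityDemo and concat are hypothetical names) shows two different arrays that collapse to the same digit string and therefore both match.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical demo class; the names are made up for illustration.
final class AmbiguityDemo {
    private AmbiguityDemo() {}

    // Naive encoding: concatenate the decimal digits with no separator.
    static String concat(int[] xs) {
        return Arrays.stream(xs)
                .mapToObj(Integer::toString)
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 2, 2, 2, 5};
        int[] b = {12, 22, 25};
        System.out.println(concat(a));                 // 122225
        System.out.println(concat(b));                 // 122225
        System.out.println(concat(a).matches("12*5")); // true
        System.out.println(concat(b).matches("12*5")); // true
    }
}
```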
You can use something like marge(), where marge just builds a string (character sequence) from all the members of the array.
Assume that you are trying to match "122225" against the regular expression "12*5".
Generating the string from the integers using snprintf in C/C++, .toString() in Java, etc. should be clean and simple.
I would not recommend looking for a special algorithm or tool for this.
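A minimal sketch of this string-conversion approach in Java, under a few assumptions that are not in the answer. The names IntListRegex and encode are hypothetical. The string is built element by element, because calling toString() directly on a Java int[] does not produce its elements. Each element is followed by a comma so that multi-digit values keep their boundaries, which means the pattern has to be written against that encoded form.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are made up for illustration.
final class IntListRegex {
    private IntListRegex() {}

    // Encode every element followed by a comma: {1, 2, 5} -> "1,2,5,"
    static String encode(int[] xs) {
        StringBuilder sb = new StringBuilder();
        for (int x : xs) {
            sb.append(x).append(',');
        }
        return sb.toString();
    }

    // Whole-input match of an ordinary java.util.regex pattern against the encoded list.
    static boolean matches(String regex, int[] xs) {
        return Pattern.matches(regex, encode(xs));
    }

    public static void main(String[] args) {
        // "1, then zero or more 2s, then 5", written against the encoded form.
        String regex = "1,(2,)*5,";
        System.out.println(matches(regex, new int[] {1, 2, 2, 2, 2, 5})); // true
        System.out.println(matches(regex, new int[] {12, 22, 25}));       // false
    }
}
```

The trade-off is that the pattern is less readable than "12*5", but it reuses the standard regex engine unchanged.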