Check whether two regular expressions match the same strings in Java

Posted 2024-12-17 13:28:37 · 115 words · 7 views · 0 comments

I have two regular expressions (simple example: "[0-9]+" and "[0123456789]+"). I'd like to see if they match exactly the same inputs. Is there a built-in function for doing this check in Java? If not, is there a relatively easy algorithm for doing the check? Thanks!

Comments (2)

差↓一点笑了 2024-12-24 13:28:37

There actually is an algorithmic way to check two regexes for equivalence, although it's complicated. Here's how:

  1. Convert both regexes into their equivalent NFAs. This is a well-known, well-defined process.
  2. Convert both NFAs to DFAs via the powerset construction.
  3. Regular languages are closed under intersection, union, and complementation, and each operation is straightforward to carry out on DFAs, so construct the XOR of the two DFAs. (This is somewhat an abuse of notation, but if the automata are A and B, construct AB'+A'B.)
  4. The resulting machine accepts exactly the difference between the original regexes (any string matched by one but not the other). Now just run graph reachability from the DFA's start state to its accepting states. If no accepting state is reachable, the regexes are equal; if one is, they are not!
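
Steps 3 and 4 can be sketched in Java without materializing the XOR machine: walk pairs of states of the two DFAs and look for a reachable pair where exactly one side accepts. This is only a minimal sketch under stated assumptions. The `DfaEquivalence` class, its `Dfa` type, the two-symbol alphabet (0 = a digit, 1 = any other character), and the hand-built transition tables are all hypothetical and chosen for illustration. Steps 1 and 2 (parsing the regex and the powerset construction) are not shown, and Java regex features such as backreferences fall outside what a DFA can express.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: complete DFAs over a small integer alphabet; state 0 is the start state. */
final class DfaEquivalence {

    /** next[state][symbol] is the successor state; accepting[state] marks final states. */
    static final class Dfa {
        final int[][] next;
        final boolean[] accepting;

        Dfa(int[][] next, boolean[] accepting) {
            this.next = next;
            this.accepting = accepting;
        }
    }

    /**
     * Breadth-first search over the product automaton. A reachable pair (p, q) in which
     * exactly one of p, q is accepting is an accepting state of the XOR machine,
     * so the two languages differ.
     */
    static boolean equivalent(Dfa a, Dfa b, int alphabetSize) {
        boolean[][] seen = new boolean[a.next.length][b.next.length];
        Deque<int[]> queue = new ArrayDeque<>();
        seen[0][0] = true;
        queue.add(new int[] {0, 0});
        while (!queue.isEmpty()) {
            int[] pair = queue.remove();
            int p = pair[0];
            int q = pair[1];
            if (a.accepting[p] != b.accepting[q]) {
                return false; // some string leads here and only one DFA accepts it
            }
            for (int symbol = 0; symbol < alphabetSize; symbol++) {
                int np = a.next[p][symbol];
                int nq = b.next[q][symbol];
                if (!seen[np][nq]) {
                    seen[np][nq] = true;
                    queue.add(new int[] {np, nq});
                }
            }
        }
        return true; // no distinguishing pair is reachable
    }

    // "One or more digits". States: 0 = start, 1 = saw digits, 2 = dead state.
    static Dfa oneOrMoreDigits() {
        return new Dfa(new int[][] {{1, 2}, {1, 2}, {2, 2}},
                new boolean[] {false, true, false});
    }

    // The same language with a redundant second accepting state; state 3 is the dead state.
    static Dfa oneOrMoreDigitsRedundant() {
        return new Dfa(new int[][] {{1, 3}, {2, 3}, {1, 3}, {3, 3}},
                new boolean[] {false, true, true, false});
    }

    // "Zero or more digits": also accepts the empty string. State 1 is the dead state.
    static Dfa zeroOrMoreDigits() {
        return new Dfa(new int[][] {{0, 1}, {1, 1}}, new boolean[] {true, false});
    }
}
```

With these tables, `equivalent(oneOrMoreDigits(), oneOrMoreDigitsRedundant(), 2)` returns `true`, while comparing `oneOrMoreDigits()` with `zeroOrMoreDigits()` returns `false` because the start pair already disagrees on the empty string.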

叫嚣ゝ 2024-12-24 13:28:37

First, your two example regexes are exactly the same. Second, I cannot imagine a built-in function that does what you want. Think about it: you actually want to match the regexes against several inputs. Which inputs? Random strings? In that case, the chance that a random string contains only digits is very low.

Let me change your question a little. Here is my version:

*I have 2 regular expressions and want to verify that they function equally.*

This question makes sense. In this case I can write a series of unit tests using one of the popular unit test frameworks (e.g. JUnit or TestNG) and run the same tests against both regexes, expecting the same results every time. But I have to write the strings myself. For example:

  • empty string
  • string with letters only
  • string with numbers only
  • string with special characters
  • string with Unicode characters
  • mixture of the previous

etc., etc.
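
A minimal sketch of that testing idea in plain Java, using only `java.util.regex.Pattern` so it runs without JUnit or TestNG. The `RegexAgreement` class, its method names, and the sample strings are hypothetical, picked to follow the categories listed above. In a real test suite each check would more likely be an assertion inside a test method.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper that compares two regexes on hand-written sample strings. */
final class RegexAgreement {

    /** Returns the first sample on which the two regexes disagree, or null if they agree on all. */
    static String firstDisagreement(String regexA, String regexB, List<String> samples) {
        Pattern a = Pattern.compile(regexA);
        Pattern b = Pattern.compile(regexB);
        for (String s : samples) {
            // matches() tests the whole input, not just a substring
            if (a.matcher(s).matches() != b.matcher(s).matches()) {
                return s;
            }
        }
        return null;
    }

    /** Hand-picked samples, one per category; extend as needed. */
    static List<String> samples() {
        return Arrays.asList(
                "",              // empty string
                "abc",           // letters only
                "12345",         // digits only
                "!@#",           // special characters
                "\u0663\u0664",  // Unicode: two Arabic-Indic digits
                "a1!\u00e9");    // mixture of the previous
    }

    public static void main(String[] args) {
        String diff = firstDisagreement("[0-9]+", "[0123456789]+", samples());
        System.out.println(diff == null ? "no disagreement found" : "differs on: \"" + diff + "\"");
    }
}
```

Only a disagreement is conclusive here. Agreement on every sample shows that the regexes behave alike on those strings, not that they are equivalent on all inputs.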
