String pattern matching problem
Imagine we have a long string containing the substrings 'cat' and 'dog' as well as other random characters, e.g.:
cat x dog cat x cat x dog x dog x cat x dog x cat
Here 'x' represents any random sequence of characters (but not 'cat' or 'dog').
What I want to do is find every 'cat' that is followed by any characters except 'dog' and then by 'cat'. I want to remove that first instance of 'cat' in each case.
In this case, I would want to remove the bracketed [cat] because there is no 'dog' after it before the next 'cat':
cat x dog [cat] x cat x dog x dog x cat x dog x cat
To end up with:
cat x dog x cat x dog x dog x cat x dog x cat
How can this be done?
I thought of somehow using a regular expression like (n)(?=(n)), as VonC recommended here:
(cat)(?=(.*cat))
to match all of the pairs of 'cat' in the string. But I am still not sure how I could use this to remove each cat that is not followed by 'dog' before 'cat'.
The real problem I am tackling is in Java. But I am really just looking for a general pseudocode/regex solution.
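To make the limitation of that lookahead concrete, here is a minimal Java sketch (the class and method names are hypothetical, chosen only for illustration). It counts the matches of (cat)(?=(.*cat)) and shows that the pattern matches every 'cat' that has any later 'cat', whether or not a 'dog' sits in between, so on its own it cannot single out the bracketed one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Counts the matches of (cat)(?=(.*cat)) in the input.
    // The lookahead only requires that some later "cat" exists;
    // it places no restriction on what lies in between.
    static int countCatsWithLaterCat(String input) {
        Matcher m = Pattern.compile("(cat)(?=(.*cat))").matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "cat x dog cat x cat x dog x dog x cat x dog x cat";
        System.out.println(countCatsWithLaterCat(s));
    }
}
```

On the sample string, every 'cat' except the last one matches, including those that are followed by 'dog' first.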
Comments (1)
Is there any particular reason you want to do this with just one RE call? I'm not sure if that's actually possible in one RE.
If I had to do this, I'd probably go in two passes. First mark each instance of 'cat' and 'dog' in the string, then write some code to identify which cats need to be removed, and do that in another pass.
Pseudocode follows:
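One possible reading of the two-pass idea, sketched in Java since that is the asker's target language. This is an illustration under assumptions, not the answerer's own code: the class name `CatDogFilter` and method name `removeCatsBeforeCat` are hypothetical, and the surrounding whitespace of a removed 'cat' is deliberately left untouched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CatDogFilter {
    static String removeCatsBeforeCat(String input) {
        // Pass 1: mark every "cat" and "dog" with its start offset.
        List<Integer> starts = new ArrayList<>();
        List<Boolean> isCat = new ArrayList<>();
        Matcher m = Pattern.compile("cat|dog").matcher(input);
        while (m.find()) {
            starts.add(m.start());
            isCat.add(m.group().equals("cat"));
        }

        // Pass 2: a "cat" is removed when the very next marked token
        // is also a "cat", i.e. no "dog" appears between the two.
        StringBuilder out = new StringBuilder();
        int copiedUpTo = 0;
        for (int i = 0; i + 1 < starts.size(); i++) {
            if (isCat.get(i) && isCat.get(i + 1)) {
                int start = starts.get(i);
                out.append(input, copiedUpTo, start); // keep text before this cat
                copiedUpTo = start + "cat".length();  // skip the cat itself
            }
        }
        out.append(input, copiedUpTo, input.length()); // keep the tail
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "cat x dog cat x cat x dog x dog x cat x dog x cat";
        System.out.println(removeCatsBeforeCat(s));
    }
}
```

Because only the three characters of the removed 'cat' are dropped, the spaces on either side of it remain in the output; trimming them would be an extra step if the exact spacing in the question's expected result matters.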