Regex to match x words in brackets
I am trying to remove a bracketed section from a string if the brackets contain 4 or more words. I have been scratching my head and cannot get anywhere with it.
preg_replace('#\([word]{4,}\)#', '', $str); # pseudo code
Sample string:
Robert Alner Fund Standard Open NH Flat Race (Supported by The Andrew Stewart Charitable Foundation)
To match (more than x words in brackets) and remove:
(Supported by The Andrew Stewart Charitable Foundation)
I have two sources of data, and am using:
similar_text($str1, $str2, $percent)
to compare them, and the longish strings in brackets are unique to one source.
Comments (4)
Well, you're close...
Basically, the inner sub-pattern
(\b\w+\b[^\w)]*)
matches a word boundary (a position that is not between two word characters), followed by at least one word character (a letter, digit, or underscore), followed by another word boundary, and finally followed by 0 or more characters that are neither word characters nor a closing parenthesis.
Testing with:
Gives:
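A minimal sketch of such a test, assuming the full pattern wraps the sub-pattern above in escaped parentheses with a {4,} quantifier (mirroring the question's pseudo code). The leading \s*, the second sample string, and the variable names are additions for illustration only:

```php
<?php
// Assumed full pattern: the sub-pattern repeated 4+ times between literal parentheses.
// The leading \s* (an addition) also removes the whitespace before the bracket.
$regex = '#\s*\((\b\w+\b[^\w)]*){4,}\)#';

$long  = 'Robert Alner Fund Standard Open NH Flat Race (Supported by The Andrew Stewart Charitable Foundation)';
$short = 'Standard Open NH Flat Race (Class 4)'; // hypothetical string with a short bracket

$longResult  = preg_replace($regex, '', $long);
$shortResult = preg_replace($regex, '', $short);

echo $longResult, PHP_EOL;  // Robert Alner Fund Standard Open NH Flat Race
echo $shortResult, PHP_EOL; // Standard Open NH Flat Race (Class 4)
```

Because \w+ must end on a word boundary, a single long word cannot be counted as several words, so the two-word bracket is left alone.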
You don't need preg_replace() for this. Just count the spaces with substr_count(), then use str_replace().
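A rough sketch of that approach, under some assumptions the answer does not state: the string holds at most one bracketed section, words are separated by single spaces, and the bracketed part is located with strpos() and substr(). The function name stripLongBrackets and its $minWords parameter are hypothetical:

```php
<?php
// Hypothetical helper: removes the first "(...)" section if it holds $minWords or more words.
function stripLongBrackets(string $str, int $minWords = 4): string
{
    $open  = strpos($str, '(');
    $close = ($open === false) ? false : strpos($str, ')', $open);
    if ($open === false || $close === false) {
        return $str; // no complete bracketed section
    }

    // Bracketed section including both parentheses.
    $bracketed = substr($str, $open, $close - $open + 1);

    // N words separated by single spaces contain N - 1 spaces.
    if (substr_count($bracketed, ' ') + 1 >= $minWords) {
        $str = rtrim(str_replace($bracketed, '', $str));
    }
    return $str;
}

echo stripLongBrackets('Robert Alner Fund Standard Open NH Flat Race (Supported by The Andrew Stewart Charitable Foundation)'), PHP_EOL;
```

This avoids regular expressions entirely, at the cost of handling only the simple one-bracket case.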
The syntax […] has a special meaning. […] is a so-called character class and matches one of the listed characters. So [word] matches a single one of the characters w, o, r, d.
Now if you want to match words, you should first define what a word is. If a word is a sequence of characters other than whitespace (\S represents any non-whitespace character), you could do this:
This matches any sequence of four or more words (sequences of non-whitespace characters) that are separated by whitespace characters (\s).
And four or more words in brackets:
Note that \S matches anything except whitespace, which means even the surrounding brackets. So you might want to change \S to [^\s)]:
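The three steps described above might look like the following sketch. The exact patterns are assumptions reconstructed from the description (one word, then three or more "whitespace plus word" groups), and the variable names and the short test string are made up for illustration:

```php
<?php
// Assumed patterns: one word followed by 3+ further "whitespace + word" groups.
$anyFourWords   = '/\S+(?:\s+\S+){3,}/';
$inBrackets     = '/\(\S+(?:\s+\S+){3,}\)/';
$inBracketsSafe = '/\([^\s)]+(?:\s+[^\s)]+){3,}\)/';

// \S also matches parentheses, so the second pattern can span two short brackets.
$tricky      = '(a b) c d (e f)';
$looseMatch  = preg_match($inBrackets, $tricky);     // 1: matches the whole string
$strictMatch = preg_match($inBracketsSafe, $tricky); // 0: no bracket holds four words

$sample = 'Robert Alner Fund Standard Open NH Flat Race (Supported by The Andrew Stewart Charitable Foundation)';
$clean  = rtrim(preg_replace($inBracketsSafe, '', $sample));

echo $clean, PHP_EOL; // Robert Alner Fund Standard Open NH Flat Race
```

The rtrim() call is an addition: the pattern removes only the bracketed part, so the space in front of it would otherwise remain.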
I'm no expert, but this might work.
Here's a pattern string:
And here's a replacement string in PHP to take everything except the leading and trailing escaped parentheses.
Here's the preg_replace function.
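As an illustration only, the three pieces could look like the sketch below. The pattern and replacement are hypothetical, not the ones originally posted. Note that this variant keeps the bracketed text and drops only the parentheses, which is how the description of the replacement string reads:

```php
<?php
// Hypothetical pattern: group 1 captures four or more words between escaped parentheses.
$pattern = '/\(((?:\w+[^\w)]+){3,}\w+)\)/';

// Replacement: keep only the captured inner text, dropping the parentheses.
$replacement = '$1';

$str    = 'Robert Alner Fund Standard Open NH Flat Race (Supported by The Andrew Stewart Charitable Foundation)';
$result = preg_replace($pattern, $replacement, $str);

echo $result, PHP_EOL;
// Robert Alner Fund Standard Open NH Flat Race Supported by The Andrew Stewart Charitable Foundation
```

To delete the bracketed text as well, as the question asks, the replacement would be an empty string instead of '$1'.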