PHP RegEx, replace all junk symbols

Posted on 2024-11-08 16:02:37 · word count 349 · 0 views · 0 comments

I can't get my head around a solid RegEx for doing this, still very new at all this RegEx magic. I had some limited success, but I feel like there is a simpler, more efficient way.

I would like to purify a string of all non-alphanumeric characters, and turn all those invalid subsets into one single underscore, but trim them at the edges. For example, the string <<+ćThis?//String_..! should be converted to This_String

Any thoughts on doing this all in one RegEx? I did it with regular str_replace, and then regexed the multi-underscores out of the way, and then trimmed the last underscores from the edges, but it seems like overkill and like something RegEx could do in one go. Kind of going for max speed/efficiency here, even if it is milliseconds I'm dealing with.

Comments (3)

就是爱搞怪 2024-11-15 16:02:37
$string = trim(preg_replace('<\W+>', "_", $string), "_");

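The uppercase \W escape here matches "non-word" characters, meaning everything except letters, digits, and the underscore. Underscores already present in the input are therefore left alone, so input like a_.b becomes a__b; use [\W_]+ instead if those should collapse as well. To remove the leftover outer underscores I would still use trim.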
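To make the underscore behaviour of \W concrete, here is a minimal sketch comparing the two character classes. The helper names collapse_non_word and collapse_non_word_or_underscore are made up for illustration and are not part of the answer above; the sketch also assumes the default C locale and no /u modifier.

```php
<?php
// Hypothetical helpers for illustration only.

// Collapses runs matched by \W; underscores already in the input are kept.
function collapse_non_word(string $input): string
{
    return trim(preg_replace('/\W+/', '_', $input), '_');
}

// Collapses runs of non-word characters and underscores together.
function collapse_non_word_or_underscore(string $input): string
{
    return trim(preg_replace('/[\W_]+/', '_', $input), '_');
}

echo collapse_non_word('foo_-_bar'), PHP_EOL;               // foo___bar
echo collapse_non_word_or_underscore('foo_-_bar'), PHP_EOL; // foo_bar
```

Both variants give This_String for the string in the question, because the only original underscore there sits at the edge of the trailing junk and is trimmed away.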

新雨望断虹 2024-11-15 16:02:37

Yes, you could do this:

$myReplacedString = preg_replace("/[^a-zA-Z0-9]+/", "_", $myString);

Then you would trim leading and trailing underscores, maybe by doing this:

$myReplacedString = preg_replace("/^_+|_+$/", "", $myReplacedString);

It's not one regex, but it's cleaner than str_replace and a bunch of regex.
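If a single call is the goal, the two steps above could be folded into one preg_replace by passing arrays of patterns and replacements, which are applied in order. This is only a sketch; the function name sanitize_two_step is an assumption made for this example.

```php
<?php
// Hypothetical wrapper; the name is not from the answer above.
function sanitize_two_step(string $input): string
{
    return preg_replace(
        [
            '/[^a-zA-Z0-9]+/', // step 1: each run of junk becomes one underscore
            '/^_+|_+$/',       // step 2: strip underscores at either edge
        ],
        ['_', ''],
        $input
    );
}

echo sanitize_two_step('<<+ćThis?//String_..!'), PHP_EOL; // This_String
```

It is still two patterns under the hood, so this mainly tidies the call site rather than reducing the work done.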

挖个坑埋了你 2024-11-15 16:02:37
$output = preg_replace('/([^0-9a-z])/i', ' ', '<<+ćThis?//String_..!');
$output = preg_replace('!\s+!', '_', trim($output));
echo $output; // prints: This_String
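For the "one regex" goal raised in the question, another possible approach is to split on runs of non-alphanumeric characters and join the surviving pieces with underscores, which makes the trimming step unnecessary. A rough sketch, with sanitize_split being a name invented for this example:

```php
<?php
// Hypothetical helper; not taken from any of the answers.
function sanitize_split(string $input): string
{
    // PREG_SPLIT_NO_EMPTY drops the empty pieces that leading or
    // trailing junk would otherwise produce.
    $parts = preg_split('/[^a-zA-Z0-9]+/', $input, -1, PREG_SPLIT_NO_EMPTY);

    return implode('_', $parts);
}

echo sanitize_split('<<+ćThis?//String_..!'), PHP_EOL; // This_String
```

Whether this is actually faster than the preg_replace variants would have to be measured on real input.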