Regex to allow only alphanumerics, commas, hyphens, underscores, and semicolons

Posted 01-06 06:59

I've already got a bit of working code but I need someone to help explain why it works if they can!

I am using PHP to strip every character from a string that is not a-z, A-Z, 0-9, a comma, a semicolon, an underscore, or a hyphen (the result should ultimately be either a single username or a comma/semicolon-separated list of usernames).

The following works:

$data = preg_replace('/[^,;a-zA-Z0-9_-]/s', '', $data);

But the following does not:

$data = preg_replace('/[^a-zA-Z0-9_-,;]/s', '', $data);

Why does this only work when the comma and semicolon are at the start? Putting them at the end seems to break things (this is what I tried initially when I came across /[^a-zA-Z0-9_-]/s).

As an aside, I am also using the following to trim any trailing semicolons (plural) or commas (plural). Can someone suggest a more efficient and/or elegant way to do this?

if(preg_match('/;$/', $data))
{
    $data = rtrim($data, ';' );
}
if(preg_match('/,$/', $data))
{
    $data = rtrim($data, ',' );
}

Thanks for any help :)

Comments (3)

逐鹿 2025-01-13 06:59:22

It's not the comma and semicolon causing your problem; it's the hyphen. Look at the parts of your character class and consider what they mean:

0-9 # Anything from '0' to '9', meaning 0, 1, 2, ... 9
A-Z # Anything from 'A' to 'Z', meaning A, B, C, ... Z
_-, # Anything from '_' to ',', meaning...uh...hmmm.

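There's no valid progression from _ to ,: in ASCII, _ (0x5F) comes after , (0x2C), so _-, is a range whose start is greater than its end. PCRE refuses to compile such a class ("range out of order in character class"), which is why the whole preg_replace call fails. In character classes, if you want a hyphen to be interpreted literally, it needs to be at the very beginning of the class (right after the ^ in a negated class), at the very end, or escaped with a backslash. So any of these will work: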

[^,;a-zA-Z0-9_-]
[^-,;a-zA-Z0-9_]
[^a-zA-Z0-9_\-,;]
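
To make the point about hyphen placement concrete, here is a minimal sketch that runs the three working classes and the failing one against the same string. The sample input is made up for illustration, and the /s modifier is omitted because it only affects how . matches, and these patterns contain no dot.

```php
<?php
// Hypothetical input for illustration only.
$input = "alice_1, bob-2;carol!";

// Three equivalent negated classes: hyphen last, hyphen first, hyphen escaped.
$patterns = [
    '/[^,;a-zA-Z0-9_-]/',
    '/[^-,;a-zA-Z0-9_]/',
    '/[^a-zA-Z0-9_\-,;]/',
];

$results = [];
foreach ($patterns as $pattern) {
    // Each call strips the space and the "!" and keeps everything else.
    $results[] = preg_replace($pattern, '', $input);
}

// The failing pattern: "_-," is read as a range from "_" (0x5F) down to
// "," (0x2C), which PCRE rejects when compiling the pattern.
// The @ hides the resulting warning; preg_replace() then returns null.
$broken = @preg_replace('/[^a-zA-Z0-9_-,;]/', '', $input);

var_dump($results, $broken);
```

Note that assigning the null result back to $data, as in the question, would silently discard the original input, so it may be worth checking the return value of preg_replace() before overwriting anything.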

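As for trimming off the end, you can do all of this in one regex replace. Note the + after [,;], so that a whole run of trailing separators is removed and not just the last one:

$data = preg_replace('/[^,;a-zA-Z0-9_-]|[,;]+$/s', '', $data);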
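
One caveat with a combined pattern is worth sketching. preg_replace() matches against the original string, so separators that only become trailing after the disallowed characters are stripped are not caught in a single pass. The example below uses a made-up input and a variant of the pattern with [,;]+$ so that whole runs are matched. It is an illustration of the behaviour, not a claim about what the answer intended.

```php
<?php
// Hypothetical input: the separators end up trailing only after cleanup.
$input = "alice, bob;; !";

// One pass: the ";;" is tested against the ORIGINAL string, where it is
// still followed by " !", so the [,;]+$ alternative never matches it.
$onePass = preg_replace('/[^,;a-zA-Z0-9_-]|[,;]+$/', '', $input);

// Two passes: strip disallowed characters first, then trim whatever
// separators are now at the end of the cleaned string.
$clean   = preg_replace('/[^,;a-zA-Z0-9_-]/', '', $input);
$twoPass = preg_replace('/[,;]+$/', '', $clean);

var_dump($onePass, $twoPass);
```

If the input can contain junk after the last separator, the two-pass form is probably the safer choice.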

羁绊已千年 2025-01-13 06:59:22

I believe it's the placement of the hyphen that matters: it has to be at the start or end of the character class to be a literal hyphen; otherwise it's being used to define a range.

潦草背影 2025-01-13 06:59:22

You can escape the hyphen as \- and then put it anywhere in the character class.

As for the trailing semicolons and commas, try /[,;]+$/. It should match any run of commas and semicolons at the end, however many there are.
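
As a rough comparison, the sketch below applies that pattern, a plain rtrim() with both characters in its character list, and the two if blocks from the question to the same made-up input. The variable names are illustrative only.

```php
<?php
// Hypothetical input ending in a mixed run of separators.
$input = 'alice,bob;,;,';

// Regex form: remove one or more commas/semicolons anchored at the end.
$viaRegex = preg_replace('/[,;]+$/', '', $input);

// rtrim() accepts a list of characters to strip, so no regex is needed.
$viaRtrim = rtrim($input, ',;');

// The approach from the question: each character is handled once, in a
// fixed order, so a mixed run is only partially trimmed.
$viaIfs = $input;
if (preg_match('/;$/', $viaIfs)) {
    $viaIfs = rtrim($viaIfs, ';');
}
if (preg_match('/,$/', $viaIfs)) {
    $viaIfs = rtrim($viaIfs, ',');
}

var_dump($viaRegex, $viaRtrim, $viaIfs);
```

For this job rtrim($data, ',;') is arguably the simplest option, since it avoids a regex altogether and makes the preg_match() guards unnecessary.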
