How to create a character class for the following set

Posted 2024-11-04 16:28:38, 159 characters, 4 views, 0 comments

+ - * / % < > = ! & ^ | ? :

I've tried:

[+-*/%<>=!&^|?:]

But I think some of them need to be escaped. How can I tell which ones?

Comments (4)

泪意 2024-11-11 16:28:38

You'll need to escape the -, because otherwise it is interpreted as a character range from + to *, which isn't valid (+ comes after * in character order):

[+\-*/%<>=!&^|?:]

The other metacharacters here, as far as I know, are taken literally inside a character class.
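
To make the difference concrete, here is a minimal sketch that checks both versions against java.util.regex (the engine Scala strings use for matches). The object name EscapeMinusDemo and its helper methods are made up for illustration, and the behaviour shown is an assumption that holds for the Java flavor only; other engines may treat the reversed range differently.

```scala
import java.util.regex.Pattern
import scala.util.Try

object EscapeMinusDemo {
  // The fourteen operator characters from the question.
  val operators: String = "+-*/%<>=!&^|?:"

  // True if the pattern string compiles under java.util.regex.
  def compiles(regex: String): Boolean =
    Try(Pattern.compile(regex)).isSuccess

  // True if every operator, taken as a one-character string, matches the class.
  def matchesAll(regex: String): Boolean =
    operators.forall(c => c.toString.matches(regex))

  def main(args: Array[String]): Unit = {
    val unescaped = "[+-*/%<>=!&^|?:]"   // "+-*" is read as a range from '+' to '*'
    val escaped   = "[+\\-*/%<>=!&^|?:]" // "\-" is a literal minus
    println(s"unescaped compiles: ${compiles(unescaped)}")
    println(s"escaped compiles:   ${compiles(escaped)}")
    println(s"escaped matches all operators: ${matchesAll(escaped)}")
  }
}
```

With the Java engine, the unescaped class should be rejected at compile time as an illegal range, while the escaped one compiles and matches all fourteen characters.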

明媚如初 2024-11-11 16:28:38

You don't need any escaping if you put the minus as the first or last character in the class, because then it can't mean 'through' as in [a-z]. Similarly, the caret only means 'not in this class' if it is the first character of the class: [^a-z] := not a-z.
Quantifiers (+?*) don't make any sense inside a class, so the characters used to represent them stand for themselves there.
The other characters have no special meaning on their own inside a character class.

Quick demo in Scala:

// Each operator, as a one-character string, is matched against the class with the minus placed first.
for (c <- "-+*/%<>=!&^|?:") yield c.toString.matches("[-+*/%<>=!&^|?:]")
res1: scala.collection.immutable.IndexedSeq[Boolean] =
  Vector(true, true, true, true, true, true, true, true, true, true, true, true, true, true)
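
The positional rules for the caret and the minus can be checked the same way. This is only a sketch: CaretPositionDemo and inClass are hypothetical names, and the results assume the java.util.regex engine.

```scala
object CaretPositionDemo {
  // True if the single character c matches the given character class.
  def inClass(c: Char, regex: String): Boolean = c.toString.matches(regex)

  def main(args: Array[String]): Unit = {
    println(inClass('^', "[a^z]")) // caret not first, so it is a literal: true
    println(inClass('b', "[^az]")) // caret first, so the class is negated: true
    println(inClass('a', "[^az]")) // 'a' is excluded by the negation: false
    println(inClass('-', "[az-]")) // minus last, so it is a literal: true
    println(inClass('m', "[a-z]")) // minus between two characters is a range: true
    println(inClass('-', "[a-z]")) // the range itself does not contain '-': false
  }
}
```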

标点 2024-11-11 16:28:38

Instead of explicitly enumerating the symbols to match, you can try one of the predefined character classes, such as \p{Punct} (matches US-ASCII punctuation) or \W (matches non-word characters, i.e. [^A-Za-z0-9_]).

You'd use something like \p{Punct}* or \W*. They may match a somewhat broader set than your explicit class, but that might not be a bad thing.
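
How much broader these classes are is easy to see with a few probes. The sketch below assumes the default java.util.regex behaviour (no UNICODE_CHARACTER_CLASS flag); BroaderClassesDemo and inClass are illustrative names, not part of any library.

```scala
object BroaderClassesDemo {
  // The fourteen operator characters from the question.
  val operators: String = "+-*/%<>=!&^|?:"

  // The explicit class, with the minus placed first so it needs no escape.
  val explicit: String = "[-+*/%<>=!&^|?:]"

  // True if the single character c matches the given regex.
  def inClass(c: Char, regex: String): Boolean = c.toString.matches(regex)

  def main(args: Array[String]): Unit = {
    println(operators.forall(inClass(_, "\\p{Punct}"))) // all operators are ASCII punctuation: true
    println(operators.forall(inClass(_, "\\W")))        // all operators are non-word characters: true
    println(inClass(',', explicit))                     // a comma is not in the explicit class: false
    println(inClass(',', "\\p{Punct}"))                 // but it is ASCII punctuation: true
    println(inClass(' ', "\\W"))                        // \W even accepts whitespace: true
    println(inClass(' ', "\\p{Punct}"))                 // \p{Punct} does not: false
  }
}
```

So \p{Punct} also lets through characters such as commas, and \W additionally accepts whitespace, which matters if the goal is to recognize exactly these operators.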

也只是曾经 2024-11-11 16:28:38

By reading the manual :-) + matches one or more of the preceding element, * matches any number of them, | is a logical OR, and ^ negates a character class.
You can also escape all of them to be sure!

EDIT

I see, I should have also read the manual first :-)
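
The "escape all of them" suggestion can be sketched as follows. It relies on the assumption that the engine is java.util.regex, where a backslash before any non-alphabetic character is allowed and simply makes it literal; it would not be safe for letters. EscapeEverythingDemo and escapedClass are hypothetical names.

```scala
object EscapeEverythingDemo {
  // The fourteen operator characters from the question.
  val operators: String = "+-*/%<>=!&^|?:"

  // Prefix every character with a backslash and wrap the result in brackets.
  // Assumption: chars contains only non-alphabetic characters.
  def escapedClass(chars: String): String =
    chars.map(c => "\\" + c).mkString("[", "", "]")

  def main(args: Array[String]): Unit = {
    val cls = escapedClass(operators)
    println(cls) // [\+\-\*\/\%\<\>\=\!\&\^\|\?\:]
    println(operators.forall(c => c.toString.matches(cls))) // true
  }
}
```

The result is noisier than escaping only the minus, but it avoids having to remember which characters are special at which position.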
