Regex for matching capitals

Posted 2024-10-26 21:48:03

  def normalized?
    matches = match(/[^A-Z]*/)
    return matches.size == 0
  end

This is my function operating on a string, checking whether the string contains only uppercase letters. It works fine for ruling out non-matches, but when I call it on a string like "ABC" it reports no match, because apparently matches.size is 1 and not zero. There seems to be an empty element in it.

Can anybody explain why?

Comments (8)

各自安好 2024-11-02 21:48:03

Your regex is wrong - if you want it to match ONLY uppercase strings, use /^[A-Z]+$/.
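One Ruby-specific caveat, offered as a side note rather than as part of the answer above: `^` and `$` anchor at line boundaries in Ruby, so a multi-line string can still pass this check. A minimal sketch follows; `only_upper_line?` and `only_upper_string?` are hypothetical helper names that do not appear in the thread.

```ruby
# Hypothetical helpers for illustration; neither name comes from the thread.

# ^ and $ match at the start and end of any line within the string.
def only_upper_line?(s)
  !!(s =~ /^[A-Z]+$/)
end

# \A and \z match only at the start and end of the whole string.
def only_upper_string?(s)
  !!(s =~ /\A[A-Z]+\z/)
end

puts only_upper_line?("ABC")         # true
puts only_upper_string?("ABC")       # true
puts only_upper_line?("ABC\nabc")    # true, because the first line is all uppercase
puts only_upper_string?("ABC\nabc")  # false
puts only_upper_string?("")          # false, because + requires at least one character
```

Whether the stricter `\A...\z` form is wanted depends on whether the input can contain newlines at all.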

你是我的挚爱i 2024-11-02 21:48:03

Your regular expression is incorrect. /[^A-Z]*/ means "match zero or more characters that are not between A and Z, anywhere in the string". The string ABC has zero characters that are not between A and Z, so it matches the regular expression.

Change your regular expression to /^[^A-Z]+$/. This means "match one or more characters that are not between A and Z, and make sure every character between the beginning and end of the string is not between A and Z". Then the string ABC will not match, and you can check matches[0].size or similar, as per sepp2k's answer.

时常饿 2024-11-02 21:48:03

MatchData#size returns the number of capturing groups in the regex plus one, so that md[i] will access a valid group iff i < md.size. So the value returned by size only depends on the regex, not the matched string, and will never be 0.

You want matches.to_s.size or matches[0].size.
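The behaviour described above can be checked with a short sketch. The variable names are only for illustration. The last lines also show a limitation that is worth keeping in mind: with the original pattern, the first match on a mixed-case string such as "AbC" is also empty, so the length check alone may not be enough.

```ruby
# No capture groups: size is 1 (the whole match only), whatever the input is.
md = "ABC".match(/[^A-Z]*/)
puts md.size         # 1
puts md[0].inspect   # "" (the match itself is empty)
puts md[0].size      # 0
puts md.to_s.size    # 0

# Two capture groups: size is 3 (whole match plus two groups).
md2 = "ABC".match(/([A-Z])([A-Z])/)
puts md2.size        # 3

# The engine succeeds with an empty match at position 0, before ever reaching "b".
mixed = "AbC".match(/[^A-Z]*/)
puts mixed[0].inspect  # ""
```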

倾城月光淡如水﹏ 2024-11-02 21:48:03
ruby-1.9.2-p180>   def normalized? s
ruby-1.9.2-p180?>    s.match(/^[[:upper:]]+$/) ? true : false
ruby-1.9.2-p180?>  end
 => nil 
ruby-1.9.2-p180>  normalized? "asdf"
 => false 
ruby-1.9.2-p180>  normalized? "ASDF"
 => true 
人事已非 2024-11-02 21:48:03

This question needs a clearer answer. tchrist addressed it in a comment, and I wish he had posted it as an answer. The "regex for matching capitals" is:

/\p{Uppercase}/

As tchrist mentions, this "is distinct from the general category \p{Uppercase_Letter} aka \p{Lu}. That’s because there exist non-Letters that count as Uppercase".
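To illustrate why a Unicode-aware class can matter, here is a small sketch that compares `[A-Z]` with the narrower general category `\p{Lu}` named in the quote and with the POSIX bracket `[[:upper:]]`. It assumes a UTF-8 string and a reasonably recent Ruby; the sample word is made up for the example.

```ruby
# "\u00C9COLE" is "ÉCOLE": the first letter is a capital outside the ASCII range.
word = "\u00C9COLE"

ascii_only  = !!(word =~ /\A[A-Z]+\z/)        # [A-Z] covers ASCII capitals only
unicode_lu  = !!(word =~ /\A\p{Lu}+\z/)       # general category Uppercase_Letter
posix_upper = !!(word =~ /\A[[:upper:]]+\z/)  # POSIX bracket, Unicode-aware in Ruby

puts ascii_only   # false
puts unicode_lu   # true
puts posix_upper  # true
```

If the input is known to be plain ASCII, `[A-Z]` behaves the same as the other two.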

陈独秀 2024-11-02 21:48:03

The * in your regular expression means that it matches any number of non-uppercase characters, including zero, so it matches every string. The fix is to remove the *; the regex then fails to match a string containing only uppercase characters. (You would need a different test if zero-length strings are not permitted, though.)
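A minimal sketch of that fix, including the zero-length caveat. The helper names `all_upper?` and `all_upper_nonempty?` are hypothetical and chosen only for this example.

```ruby
# With the star, the pattern succeeds with an empty match; without it, there is no match.
with_star    = "ABC".match(/[^A-Z]*/)
without_star = "ABC".match(/[^A-Z]/)
puts with_star[0].inspect  # ""
puts without_star.inspect  # nil

# Hypothetical helper: true when no character outside A-Z is found.
def all_upper?(s)
  s.match(/[^A-Z]/).nil?
end

# Hypothetical variant that also rejects the empty string.
def all_upper_nonempty?(s)
  !s.empty? && all_upper?(s)
end

puts all_upper?("ABC")          # true
puts all_upper?("AbC")          # false
puts all_upper?("")             # true, the caveat mentioned above
puts all_upper_nonempty?("")    # false
```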

谢绝鈎搭 2024-11-02 21:48:03

If you want to know whether the input string consists entirely of English uppercase letters, i.e. A-Z, then you must remove the Kleene star, because with it the pattern produces a zero-length match before and after every uppercase character in the input. The expression !s[/[^A-Z]/] tells you whether there is no match of a non-A-to-Z character:

irb(main):001:0> def normalized? s
irb(main):002:1>     return !s[/[^A-Z]/]
irb(main):003:1> end
=> nil
irb(main):004:0> normalized? "ABC"
=> true
irb(main):005:0> normalized? "AbC"
=> false
irb(main):006:0> normalized? ""
=> true
irb(main):007:0> normalized? "abc"
=> false
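The zero-length matches mentioned above can be made visible with String#scan, which collects every match in turn. This is only an illustrative sketch; the variable names are not from the answer.

```ruby
# With the star, the pattern matches the empty string at every position of "ABC",
# including the position after the last character.
empty_matches = "ABC".scan(/[^A-Z]*/)
p empty_matches   # ["", "", "", ""]

# Without the star, scan returns only the characters outside A-Z.
offenders = "AbC".scan(/[^A-Z]/)
p offenders       # ["b"]
```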
南烟 2024-11-02 21:48:03

Only one of the regular expressions tested below matches a string if and only if it is non-empty and consists entirely of capitals:

def onlyupper(s)
  (s =~ /^[A-Z]+$/) != nil
end

Truth table:

/[^A-Z]*/:
 Testing  'asdf'     matched  'asdf'     length  4
 Testing  'HHH'      matched  ''         length  0
 Testing  ''         matched  ''         length  0
 Testing  '-=AAA'    matched  '-='       length  2
--------
/[^A-Z]+/:
 Testing  'asdf'     matched  'asdf'     length  4
 Testing  'HHH'      matched  nil
 Testing  ''         matched  nil
 Testing  '-=AAA'    matched  '-='       length  2
--------
/^[^A-Z]*$/:
 Testing  'asdf'     matched  'asdf'     length  4
 Testing  'HHH'      matched  nil
 Testing  ''         matched  ''         length  0
 Testing  '-=AAA'    matched  nil
--------
/^[^A-Z]+$/:
 Testing  'asdf'     matched  'asdf'     length  4
 Testing  'HHH'      matched  nil
 Testing  ''         matched  nil
 Testing  '-=AAA'    matched  nil
--------
/^[A-Z]*$/:
 Testing  'asdf'     matched  nil
 Testing  'HHH'      matched  'HHH'      length  3
 Testing  ''         matched  ''         length  0
 Testing  '-=AAA'    matched  nil
--------
/^[A-Z]+$/:
 Testing  'asdf'     matched  nil
 Testing  'HHH'      matched  'HHH'      length  3
 Testing  ''         matched  nil
 Testing  '-=AAA'    matched  nil
--------
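A table like the one above can be regenerated with a short script. This is a sketch of one way to do it, not necessarily how the original table was produced; `PATTERNS`, `INPUTS` and `matched_text` are names introduced for the example.

```ruby
# The six patterns and four inputs from the truth table.
PATTERNS = [
  /[^A-Z]*/, /[^A-Z]+/, /^[^A-Z]*$/,
  /^[^A-Z]+$/, /^[A-Z]*$/, /^[A-Z]+$/
]
INPUTS = ['asdf', 'HHH', '', '-=AAA']

# Returns the text of the first match, or nil when the pattern does not match.
def matched_text(pattern, input)
  m = pattern.match(input)
  m && m[0]
end

PATTERNS.each do |pattern|
  puts "#{pattern.inspect}:"
  INPUTS.each do |input|
    text = matched_text(pattern, input)
    if text
      puts format(" Testing  %-9s matched  %-9s length  %d",
                  "'#{input}'", "'#{text}'", text.length)
    else
      puts format(" Testing  %-9s matched  nil", "'#{input}'")
    end
  end
  puts "--------"
end
```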