Regex for matching capitals
def normalized?
  matches = match(/[^A-Z]*/)
  return matches.size == 0
end
This is my function operating on a string, checking whether the string contains only uppercase letters. It works fine for ruling out non-matches, but when I call it on a string like "ABC" it says no match, because apparently matches.size is 1 and not zero. There seems to be an empty element in it or so.
Can anybody explain why?
Comments (8)
Your regex is wrong. If you want it to match ONLY uppercase strings, use /^[A-Z]+$/.
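A minimal sketch of that suggestion as a standalone predicate. The method name all_upper? is hypothetical and not from the thread. The sketch also swaps ^ and $ for \A and \z, since in Ruby ^ and $ anchor at line boundaries, so a multi-line string can slip past the original pattern.

```ruby
# Hypothetical helper (the name is not from the thread): true only when the
# whole string consists of one or more ASCII uppercase letters.
def all_upper?(str)
  # \A and \z anchor to the start and end of the string itself.
  str.match?(/\A[A-Z]+\z/)
end

puts all_upper?("ABC")   # true
puts all_upper?("ABc")   # false
puts all_upper?("")      # false, because + requires at least one letter

# ^ and $ anchor to lines, so this multi-line input still matches:
puts "ABC\nabc".match?(/^[A-Z]+$/)   # true
puts all_upper?("ABC\nabc")          # false
```

If empty strings should count as normalized, the + would have to become *; which of the two is right depends on the caller.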
Your regular expression is incorrect. /[^A-Z]*/ means "match zero or more characters that are not between A and Z, anywhere in the string". The string ABC has zero characters that are not between A and Z, so it matches the regular expression.
Change your regular expression to /^[^A-Z]+$/. This means "match one or more characters that are not between A and Z, and make sure every character between the beginning and end of the string is not between A and Z". Then the string ABC will not match, and then you can check matches[0].size or whatever, as per sepp2k's answer.
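To see both patterns in action, here is a small sketch. The strings other than "ABC" are made up for illustration. Note that match returns nil when nothing matches, so matches[0] is only safe after a nil check.

```ruby
# The unanchored pattern with * succeeds on any string, because it is allowed
# to match zero characters at position 0.
m = "ABC".match(/[^A-Z]*/)
p m.nil?   # false: there is a match
p m[0]     # "": the matched text is the empty string

# The anchored pattern with + only matches strings made up entirely of
# characters outside A-Z.
p "abc".match(/^[^A-Z]+$/)[0]   # "abc"
p "ABC".match(/^[^A-Z]+$/)      # nil

# A mixed string (made-up example) does not match either, so "no match"
# on its own does not prove that every character is uppercase.
p "ABc".match(/^[^A-Z]+$/)      # nil
```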
MatchData#size returns the number of capturing groups in the regex plus one, so that md[i] will access a valid group iff i < md.size. So the value returned by size depends only on the regex, not on the matched string, and will never be 0.
You want matches.to_s.size or matches[0].size.
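A short sketch of the difference, using the pattern from the question. The second pattern, /(a)(b)/, is a made-up example with two capture groups.

```ruby
md = "ABC".match(/[^A-Z]*/)
p md.size        # 1: no capture groups, plus one for the whole match
p md[0].size     # 0: the matched text itself is empty
p md.to_s.size   # 0: the same text obtained through to_s

# With two capture groups, size is 3 whatever the input string looks like.
p "ab".match(/(a)(b)/).size       # 3
p "xxabxx".match(/(a)(b)/).size   # 3
```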
This question needs a clearer answer. As tchrist commented (I wish he had posted it as an answer), the "Regex for matching capitals" is to use:
As tchrist mentions, it "is distinct from the general category \p{Uppercase_Letter} aka \p{Lu}. That's because there exist non-Letters that count as Uppercase".
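The sketch below does not try to reconstruct the expression that answer recommends. It only illustrates the general category it is contrasted with, \p{Lu}, next to the ASCII-only class [A-Z]. The sample strings are made up, and it assumes UTF-8 strings.

```ruby
ascii_only = /\A[A-Z]+\z/
unicode_lu = /\A\p{Lu}+\z/

accented = "\u00C9COLE"   # "ÉCOLE", made-up sample with a non-ASCII capital

p "ABC".match?(ascii_only)      # true
p accented.match?(ascii_only)   # false: É is outside A-Z
p accented.match?(unicode_lu)   # true: É is in general category Lu

# U+2167 ROMAN NUMERAL EIGHT is a letter number (category Nl), not Lu,
# so \p{Lu} does not match it.
p "\u2167".match?(/\p{Lu}/)     # false
```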
The * in your regular expression means that it matches any number of non-uppercase characters, including zero. So it always matches everything. The fix is to remove the *; then it will fail to match a string containing only uppercase characters. (Although you would need a different test if zero-length strings are not permitted.)
If you want to know that the input string consists entirely of English uppercase letters, i.e. A-Z, then you must remove the Kleene star, as it will match before and after every single character in any input string (zero-length match). The statement !s[/[^A-Z]/] tells you whether there is no match of non-A-to-Z characters.
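A minimal sketch of that check. The wrapper name no_non_upper? and the sample strings are hypothetical.

```ruby
# Hypothetical wrapper around the expression from the answer.
def no_non_upper?(s)
  # s[regex] returns the first matching substring, or nil when nothing matches.
  !s[/[^A-Z]/]
end

p no_non_upper?("ABC")   # true
p no_non_upper?("ABc")   # false
p no_non_upper?("")      # true: an empty string has no offending character

# With the star kept, the pattern finds only zero-length matches in "ABC":
p "ABC".scan(/[^A-Z]*/).all?(&:empty?)   # true
```

Because the empty string passes this test, a separate length check is needed when zero-length input should be rejected.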
A single regular expression is enough to define a string that consists only, and entirely, of capitals:
def onlyupper(s)
  (s =~ /^[A-Z]+$/) != nil
end
Truth table:
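One way to produce such a table is to run the method over a handful of inputs. The inputs below are made up for illustration, and the method is repeated so the sketch runs on its own.

```ruby
def onlyupper(s)
  (s =~ /^[A-Z]+$/) != nil
end

# Made-up sample inputs; each row prints the input and the result.
["ABC", "ABc", "abc", "A1", ""].each do |input|
  puts "#{input.inspect.ljust(6)} #{onlyupper(input)}"
end
```

Only "ABC" yields true here; the mixed-case, lowercase, alphanumeric and empty inputs all yield false.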