How do I write a regular expression that checks whether a string contains only alphabetic characters [a-z] or [A-Z]?

Posted on 2024-07-24 04:50:06

I'm trying to create a regex to verify that a given string only has alpha characters a-z or A-Z. The string can be up to 25 letters long. (I'm not sure if regex can check length of strings)

Examples:
1. "abcdef" = true;
2. "a2bdef" = false;
3. "333" = false;
4. "j" = true;
5. "aaaaaaaaaaaaaaaaaaaaaaaaaa" = false; //26 letters

Here is what I have so far... can't figure out what's wrong with it though

Regex alphaPattern = new Regex("[^a-z]|[^A-Z]");

I would think that would mean that the string could contain only upper or lower case letters from a-z, but when I match it to a string with all letters it returns false...

Also, any suggestions regarding efficiency of using regex vs. other verifying methods would be greatly appreciated.

Comments (6)

相对绾红妆 2024-07-31 04:50:06
Regex lettersOnly = new Regex("^[a-zA-Z]{1,25}$");
  • ^ means "begin matching at start of string"
  • [a-zA-Z] means "match lower case and upper case letters a-z"
  • {1,25} means "match the previous item (the character class, see above) 1 to 25 times"
  • $ means "only match if cursor is at end of string"
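
As a quick check, the pattern above can be run against the five examples from the question. This is only a minimal sketch: the class name LettersOnlyDemo, the helper IsLettersOnly, and the sample array are illustrative and not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LettersOnlyDemo
{
    // Pattern from the answer above: 1 to 25 ASCII letters, anchored at both ends.
    private static readonly Regex LettersOnly = new Regex("^[a-zA-Z]{1,25}$");

    // Hypothetical helper name, used only for this illustration.
    public static bool IsLettersOnly(string input)
    {
        return LettersOnly.IsMatch(input);
    }

    public static void Main()
    {
        // The five cases from the question; the last one is 26 letters long.
        string[] samples = { "abcdef", "a2bdef", "333", "j", new string('a', 26) };
        foreach (string s in samples)
        {
            Console.WriteLine($"\"{s}\" -> {IsLettersOnly(s)}");
        }
    }
}
```

One caveat worth knowing: in .NET, $ also matches just before a single trailing newline at the very end of the input, so "abc\n" would pass this check. If that matters, \z can be used in place of $ to require the true end of the string.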
潦草背影 2024-07-31 04:50:06

I'm trying to create a regex to verify that a given string only has alpha
characters a-z or A-Z.

Easily done, as many of the others have indicated, using what are known as "character classes". Essentially, these allow us to specify a range of values to use for matching:
(NOTE: for simplification, I am assuming implicit ^ and $ anchors, which are explained later in this post)

[a-z] Match any single lower-case letter.
ex: a matches, 8 doesn't match

[A-Z] Match any single upper-case letter.
ex: A matches, a doesn't match

[0-9] Match any single digit zero to nine
ex: 8 matches, a doesn't match

[aeiou] Match only on a or e or i or o or u.
ex: o matches, z doesn't match

[a-zA-Z] Match any single lower-case OR upper-case letter.
ex: A matches, a matches, 3 doesn't match

These can, naturally, be negated as well:
[^a-z] Match anything that is NOT a lower-case letter
ex: 5 matches, A matches, a doesn't match

[^A-Z] Match anything that is NOT an upper-case letter
ex: 5 matches, A doesn't match, a matches

[^0-9] Match anything that is NOT a number
ex: 5 doesn't match, A matches, a matches

[^Aa69] Match anything as long as it is not A or a or 6 or 9
ex: 5 matches, A doesn't match, a doesn't match, 3 matches

To see some common character classes, go to:
http://www.regular-expressions.info/reference.html
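
The character-class examples above can be tried out in C# roughly as follows. This is a sketch, not the answerer's code: the class CharClassDemo and the helper IsSingle are made-up names, and the anchors are written out explicitly because Regex.IsMatch does not add them for you.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Hypothetical helper: true when the whole input is exactly one character
    // accepted by the given character class.
    public static bool IsSingle(string charClass, string input)
    {
        return Regex.IsMatch(input, "^" + charClass + "$");
    }

    public static void Main()
    {
        Console.WriteLine(IsSingle("[a-z]", "a"));     // True
        Console.WriteLine(IsSingle("[a-z]", "8"));     // False
        Console.WriteLine(IsSingle("[aeiou]", "o"));   // True
        Console.WriteLine(IsSingle("[a-zA-Z]", "3"));  // False
        Console.WriteLine(IsSingle("[^A-Z]", "A"));    // False
        Console.WriteLine(IsSingle("[^A-Z]", "5"));    // True
        Console.WriteLine(IsSingle("[^Aa69]", "3"));   // True
    }
}
```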

The string can be up to 25 letters long.
(I'm not sure if regex can check length of strings)

You can absolutely check "length", but not in the way you might imagine. Strictly speaking, {} measures repetition, NOT length:

a{2} Match two a's together.
ex: a doesn't match, aa matches, aca doesn't match

4{3} Match three 4's together.
ex: 4 doesn't match, 44 doesn't match, 444 matches, 4434 doesn't match

Repetition has values we can set to have lower and upper limits:

a{2,} Match on two or more a's together.
ex: a doesn't match, aa matches, aaa matches, aba doesn't match, aaaaaaaaa matches

a{2,5} Match on two to five a's together.
ex: a doesn't match, aa matches, aaa matches, aba doesn't match, aaaaaaaaa doesn't match

Repetition extends to character classes, so:
[a-z]{5} Match any five lower-case characters together.
ex: bubba matches, Bubba doesn't match, BUBBA doesn't match, asdjo matches

[A-Z]{2,5} Match two to five upper-case characters together.
ex: bubba doesn't match, Bubba doesn't match, BUBBA matches, BUBBETTE doesn't match

[0-9]{4,8} Match four to eight numbers together.
ex: bubba doesn't match, 15835 matches, 44 doesn't match, 3456876353456 doesn't match

[a3g]{2} Match any two characters in a row, each of which is an a OR 3 OR g. The quantifier repeats the class, not the particular character that matched first.
ex: aa matches, ba doesn't match, 33 matches, 38 doesn't match, a3 matches
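
For readers who want to experiment with the quantifiers, here is a small C# sketch. The names QuantifierDemo and WholeMatch are illustrative only. Because the examples in this post assume implicit anchors, the helper wraps each pattern in \A(?:...)\z so that the whole input must match. Note that a quantifier repeats the character class rather than the specific character matched, so [a3g]{2} accepts a3.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: requires the pattern to cover the entire input.
    public static bool WholeMatch(string pattern, string input)
    {
        // \A and \z pin the match to the very start and very end of the string.
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z");
    }

    public static void Main()
    {
        Console.WriteLine(WholeMatch("a{2,5}", "aaa"));           // True
        Console.WriteLine(WholeMatch("a{2,5}", "aaaaaaaaa"));     // False
        Console.WriteLine(WholeMatch("[A-Z]{2,5}", "BUBBA"));     // True
        Console.WriteLine(WholeMatch("[A-Z]{2,5}", "BUBBETTE"));  // False
        Console.WriteLine(WholeMatch("[0-9]{4,8}", "15835"));     // True
        Console.WriteLine(WholeMatch("[a3g]{2}", "a3"));          // True
        Console.WriteLine(WholeMatch("[a3g]{2}", "38"));          // False
    }
}
```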

Now let's look at your regex:
[^a-z]|[^A-Z]
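Translation: Match any single character that is NOT a lowercase letter, OR any single character that is NOT an uppercase letter. Since no character is both a lowercase and an uppercase letter, every character satisfies at least one side of the OR.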

To fix it so it meets your needs, we would rewrite it like this:
Step 1: Remove the negation
[a-z]|[A-Z]
Translation: Find any lowercase letter OR uppercase letter.

Step 2: While not strictly needed, let's clean up the OR logic a bit
[a-zA-Z]
Translation: Find any lowercase letter OR uppercase letter. Same as above but now using only a single set of [].

Step 3: Now let's indicate "length"
[a-zA-Z]{1,25}
Translation: Find any lowercase letter OR uppercase letter repeated one to twenty-five times.

This is where things get funky. You might think you were done here and you may well be depending on the technology you are using.

Strictly speaking the regex [a-zA-Z]{1,25} will match one to twenty-five upper or lower-case letters ANYWHERE on a line:

[a-zA-Z]{1,25}
a matches, aZgD matches, BUBBA matches, 243242hello242552 MATCHES
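
In .NET specifically, this "anywhere" behaviour is easy to observe. The sketch below (UnanchoredDemo and FirstLetterRun are made-up names for illustration) shows that the unanchored pattern succeeds on a string full of digits and even on a 26-letter string, because it is satisfied by a run of letters somewhere inside the input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnanchoredDemo
{
    private const string Pattern = "[a-zA-Z]{1,25}";

    // Hypothetical helper: returns the first run of letters the pattern finds,
    // or an empty string when there is none.
    public static string FirstLetterRun(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        // The engine scans forward until the pattern fits somewhere.
        Console.WriteLine(Regex.IsMatch("243242hello242552", Pattern));  // True
        Console.WriteLine(FirstLetterRun("243242hello242552"));          // hello

        // A 26-letter string also "matches": the first 25 letters are enough.
        string tooLong = new string('a', 26);
        Console.WriteLine(Regex.IsMatch(tooLong, Pattern));              // True
        Console.WriteLine(FirstLetterRun(tooLong).Length);               // 25
    }
}
```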

In fact, every example I have given so far will do the same. If that is what you want then you are in good shape but based on your question, I'm guessing you ONLY want one to twenty-five upper or lower-case letters on the entire line. For that we turn to anchors. Anchors allow us to specify those pesky details:

^ beginning of a line
(I know, we just used this for negation earlier, don't get me started)

$ end of a line

We can use them like this:

^a{3} From the beginning of the line match a three times together
ex: aaa matches, 123aaa doesn't match, aaa123 matches

a{3}$ Match a three times together at the end of a line
ex: aaa matches, 123aaa matches, aaa123 doesn't match

^a{3}$ Match a three times together for the ENTIRE line
ex: aaa matches, 123aaa doesn't match, aaa123 doesn't match

Notice that aaa matches in all cases because it has three a's at the beginning and end of the line technically speaking.
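
The three anchor variants can be compared side by side with a short C# sketch. AnchorDemo and Matches are names chosen here for illustration; the patterns and inputs are the ones from the examples above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Thin wrapper so the argument order reads "pattern, then input".
    public static bool Matches(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        string[] patterns = { "^a{3}", "a{3}$", "^a{3}$" };
        string[] inputs = { "aaa", "123aaa", "aaa123" };

        // Print one line per pattern/input combination.
        foreach (string p in patterns)
        {
            foreach (string s in inputs)
            {
                Console.WriteLine($"{p,-8} {s,-7} {Matches(p, s)}");
            }
        }
    }
}
```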

So the final, technically correct solution for finding a "word" that is "up to twenty-five characters long" on a line would be:

^[a-zA-Z]{1,25}$

The funky part is that some technologies implicitly put anchors in the regex for you and some don't. You just have to test your regex or read the docs to see if you have implicit anchors.

那些过往 2024-07-31 04:50:06
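// Requires: using System.Text.RegularExpressions;

/// <summary>
/// Checks that the string contains only the letters a-z and A-Z and is between 1 and 25 characters long
/// </summary>
/// <param name="value">String to be matched</param>
/// <returns>True if matches, false otherwise</returns>
public static bool IsValidString(string value)
{
    // Assumption: a null input is treated as invalid. Without this guard,
    // Regex.IsMatch throws ArgumentNullException for null input.
    if (value == null) return false;

    string pattern = @"^[a-zA-Z]{1,25}$";
    return Regex.IsMatch(value, pattern);
}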
清晰传感 2024-07-31 04:50:06

The string can be up to 25 letters long.
(I'm not sure if regex can check length of strings)

Regexes certainly can check the length of a string, as can be seen from the answers posted by others.

However, when you are validating a user input (say, a username), I would advise doing that check separately.

The problem is that a regex can only tell you whether a string matched or not. It won't tell you why it didn't match. Was the text too long, or did it contain disallowed characters? You can't tell. It's far from friendly when a program says: "The supplied username contained invalid characters or was too long". Instead, you should provide separate error messages for the different situations.
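
One way to follow this advice is sketched below. Everything here is an assumption made for illustration: the UsernameValidator class, the ValidationResult enum, and the order of the checks are not from the original post. The length is checked with plain string operations, and the regex is left to answer only the question "are all characters letters?".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical result type: one value per distinct failure reason.
public enum ValidationResult
{
    Ok,
    Empty,
    TooLong,
    InvalidCharacters
}

public static class UsernameValidator
{
    private const int MaxLength = 25;

    public static ValidationResult Validate(string value)
    {
        if (string.IsNullOrEmpty(value))
            return ValidationResult.Empty;

        // Length is checked separately so it can get its own error message.
        if (value.Length > MaxLength)
            return ValidationResult.TooLong;

        // The regex now only decides whether every character is a letter.
        if (!Regex.IsMatch(value, @"\A[a-zA-Z]+\z"))
            return ValidationResult.InvalidCharacters;

        return ValidationResult.Ok;
    }

    public static void Main()
    {
        Console.WriteLine(Validate("abcdef"));             // Ok
        Console.WriteLine(Validate("a2bdef"));             // InvalidCharacters
        Console.WriteLine(Validate(new string('a', 26)));  // TooLong
    }
}
```

The caller can then map each enum value to its own message. An input that is both too long and contains invalid characters is reported as TooLong here, simply because that check runs first.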

时光沙漏 2024-07-31 04:50:06

The regular expression you are using is an alternation of [^a-z] and [^A-Z]. An expression of the form [^…] matches any character other than those described in the character set.

So overall, your expression matches any single character that is either not in a-z or not in A-Z.

What you need instead is a regular expression that matches a-zA-Z only:

[a-zA-Z]

And to specify the length of that, anchor the expression with the start (^) and end ($) of the string and describe the length with the {n,m} quantifier, meaning at least n but not more than m repetitions:

^[a-zA-Z]{0,25}$
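
Note that a lower bound of 0 means the empty string is accepted as well, which differs from the {1,25} used in other answers. Whether that is desirable depends on how "up to 25 letters" is read. A small sketch of the difference (EmptyInputDemo and AllowsEmpty are illustrative names):

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyInputDemo
{
    // Hypothetical helper: does the given pattern accept the empty string?
    public static bool AllowsEmpty(string pattern)
    {
        return Regex.IsMatch("", pattern);
    }

    public static void Main()
    {
        Console.WriteLine(AllowsEmpty("^[a-zA-Z]{0,25}$"));  // True
        Console.WriteLine(AllowsEmpty("^[a-zA-Z]{1,25}$"));  // False
    }
}
```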
悲歌长辞 2024-07-31 04:50:06

Do I understand correctly that it can only contain either uppercase or lowercase letters?

new Regex("^([a-z]{1,25}|[A-Z]{1,25})$")

A regular expression seems to be the right thing to use for this case.

By the way, the caret ("^") at the first place inside a character class means "not", so your "[^a-z]|[^A-Z]" would mean "not any lowercase letter, or not any uppercase letter" (disregarding that a-z are not all letters).
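
To make the difference from the other answers concrete, here is a sketch of how this "one case only" pattern behaves. SingleCaseDemo and IsSingleCase are names invented for the illustration. It accepts all-lowercase or all-uppercase input but rejects mixed case, so it only fits if that stricter reading of the question is the intended one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SingleCaseDemo
{
    // Pattern from the answer above: either 1-25 lowercase or 1-25 uppercase letters.
    private static readonly Regex SingleCase =
        new Regex("^([a-z]{1,25}|[A-Z]{1,25})$");

    public static bool IsSingleCase(string input)
    {
        return SingleCase.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsSingleCase("abcdef"));  // True
        Console.WriteLine(IsSingleCase("ABCDEF"));  // True
        Console.WriteLine(IsSingleCase("AbCdEf"));  // False
    }
}
```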
