Regex: UK landline and mobile phone numbers
I've been struggling to find a suitable solution.
I need a regex that will match all UK phone numbers, both landlines and mobiles.
So far this one appears to cover most of the UK numbers:
^0\d{2,4}[ -]{1}[\d]{3}[\d -]{1}[\d -]{1}[\d]{1,4}$
However, this regex does not match mobile numbers, or phone numbers written as a single solid block such as 01234567890.
Could anyone help me create the required regex?
Comments (7)
The [\d -]{1} in the question's pattern is blatantly incorrect: it matches a digit OR a space OR a hyphen.
01000 is not a valid UK area code. 123456 is not a valid local number.
It is important that test data be real area codes and real number ranges.
The above pattern is garbage for many different reasons.
[7,8] matches 7 or comma or 8. You don't need to match a comma.
London numbers also begin with 3 not just 7 or 8.
London 020 numbers aren't the only 2+8 format numbers; see also 023, 024, 028 and 029.
[1-9]{1} simplifies to [1-9]
[ ]? simplifies to \s?
Having found the initial 0 once, why keep searching for it again and again?
^(0....|0....|0....|0....)$ simplifies to ^0(....|....|....|....)$
Seriously. ([1]|[2]|[3]|[7]){1} simplifies to [1237] here.
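Two of the points above are easy to check directly. This is only a small JavaScript sketch; the test strings are my own and are not taken from the patterns being criticised.

```javascript
// [7,8] is a character class holding three characters: "7", "," and "8".
const commaAccepted = /^[7,8]$/.test(",");
console.log(commaAccepted); // true: the comma is accepted

// A group of single-character alternatives behaves like a character class.
const verbose = /^([1]|[2]|[3]|[7]){1}$/;
const simple = /^[1237]$/;

// Compare both patterns on every digit plus a comma.
const agree = [..."0123456789,"].every(
  (ch) => verbose.test(ch) === simple.test(ch)
);
console.log(agree); // true: both patterns accept exactly 1, 2, 3 and 7
```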
UK phone numbers use a variety of formats: 2+8, 3+7, 3+6, 4+6, 4+5, 5+5, 5+4. Some users don't know which format goes with which number range and might use the wrong one on input. Let them do that; you're interested in the DIGITS.
Step 1: Check the input format looks valid
Make sure that the input looks like a UK phone number. Accept various dial prefixes, +44, 011 44, 00 44 with or without parentheses, hyphens or spaces; or national format with a leading 0. Let the user use any format they want for the remainder of the number: (020) 3555 7788 or 00 (44) 203 555 7788 or 02035-557-788 even if it is the wrong format for that particular number. Don't worry about unbalanced parentheses. The important part of the input is making sure it's the correct number of digits. Punctuation and spaces don't matter.
The above pattern matches optional opening parentheses, followed by 00 or 011 and optional closing parentheses, followed by an optional space or hyphen, followed by optional opening parentheses. Alternatively, the initial opening parentheses are followed by a literal + without a following space or hyphen. Any of the previous two options are then followed by 44 with optional closing parentheses, followed by optional space or hyphen, followed by optional 0 in optional parentheses, followed by optional space or hyphen, followed by optional opening parentheses (international format). Alternatively, the pattern matches optional initial opening parentheses followed by the 0 trunk code (national format).
The previous part is then followed by the NDC (area code) and the subscriber phone number in 2+8, 3+7, 3+6, 4+6, 4+5, 5+5 or 5+4 format with or without spaces and/or hyphens. This also includes provision for optional closing parentheses and/or optional space or hyphen after where the user thinks the area code ends and the local subscriber number begins. The pattern allows any format to be used with any GB number. The display format must be corrected by later logic if the wrong format for this number has been used by the user on input.
The pattern ends with an optional extension number arranged as an optional space or hyphen followed by x, ext and optional period, or #, followed by the extension number digits. The entire pattern does not bother to check for balanced parentheses as these will be removed from the number in the next step.
At this point you don't care whether the number begins 01 or 07 or something else. You don't care whether it's a valid area code. Later steps will deal with those issues.
Step 2: Extract the NSN so it can be checked in more detail for length and range
After checking the input looks like a GB telephone number using the pattern above, the next step is to extract the NSN part so that it can be checked in greater detail for validity and then formatted in the right way for the applicable number range.
Use the above pattern to extract the '44' from $1 to know that international format was used, otherwise assume national format if $1 is null.
Extract the optional extension number details from $3 and store them for later use.
Extract the NSN (including spaces, hyphens and parentheses) from $2.
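As a rough JavaScript sketch of Steps 1 and 2 taken together: the pattern below is rebuilt from the prose description of the dial prefixes and the three capture groups, so treat it as an assumption rather than the exact pattern. The name parseGbInput and the 9-or-10-digit length check are illustrative choices of mine.

```javascript
// Sketch only: a loose GB input pattern rebuilt from the description above.
// Group 1: "44" when an international prefix (+44, 00 44, 011 44) was used.
// Group 2: the NSN, still containing spaces, hyphens and parentheses.
// Group 3: an optional extension such as "x123" or "ext. 45".
const GB_INPUT =
  /^\(?(?:(?:0(?:0|11)\)?[\s-]?\(?|\+)(44)\)?[\s-]?\(?(?:0\)?[\s-]?\(?)?|0)([1-9]\d{1,4}\)?[\s\d-]+?)(?:[\s-]?((?:x|ext\.?\s?|#)\d+))?$/i;

function parseGbInput(input) {
  const match = GB_INPUT.exec(input.trim());
  if (match === null) return null;

  // Only the digits matter; punctuation and spacing are discarded.
  const nsn = match[2].replace(/\D/g, "");

  // Assumption: NSNs are 9 or 10 digits. A few special short numbers
  // would need their own handling and are ignored in this sketch.
  if (nsn.length < 9 || nsn.length > 10) return null;

  return {
    international: match[1] === "44",
    nsn,
    extension: match[3] || null,
  };
}

console.log(parseGbInput("(020) 3555 7788"));
console.log(parseGbInput("00 (44) 203 555 7788"));
console.log(parseGbInput("02035-557-788 x123"));
```

All three inputs yield the same NSN, 2035557788, even though only the first is written in the conventional layout for that range.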
Step 3: Validate the NSN
Remove the spaces, hyphens and parentheses from $2 and use further RegEx patterns to check the length and range and identify the number type.
These patterns will be much simpler, since they will not have to deal with various dial prefixes or country codes.
The pattern to match valid mobile numbers is therefore as simple as
Premium rate is
There will be a number of other patterns for each number type: landlines, business rate, non-geographic, VoIP, etc.
By breaking the problem into several steps, a very wide range of input formats can be allowed, and the number range and length for the NSN checked in very great detail.
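To give a feel for Step 3, here is a coarse JavaScript sketch. The ranges are deliberately simplified assumptions of mine (for example, mobiles are approximated as 07 numbers other than 070 and 076) and are far less detailed than the per-range patterns referred to above. The names NSN_TYPES and classifyNsn are hypothetical.

```javascript
// Sketch only: coarse NSN classification with simplified, assumed ranges.
const NSN_TYPES = [
  { type: "mobile", pattern: /^7[1-57-9]\d{8}$/ },          // 07xxx, roughly
  { type: "premium rate", pattern: /^9\d{9}$/ },            // 09xx
  { type: "geographic", pattern: /^(?:1\d{8,9}|2\d{9})$/ }, // 01xxx and 02x
  { type: "non-geographic", pattern: /^3\d{9}$/ },          // 03xx
];

function classifyNsn(rawNsn) {
  // Remove spaces, hyphens and parentheses before testing.
  const nsn = rawNsn.replace(/[\s()-]/g, "");
  const hit = NSN_TYPES.find(({ pattern }) => pattern.test(nsn));
  return hit ? hit.type : null;
}

console.log(classifyNsn("7700 900123"));   // mobile
console.log(classifyNsn("20) 3555 7788")); // geographic
console.log(classifyNsn("12345"));         // null
```

Because the prefix handling has already been done, each pattern only needs to describe the digits of the NSN itself.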
Step 4: Store the number
Once the NSN has been extracted and validated, store the number with country code and all the other digits with no spaces or punctuation, e.g. 442035557788.
Step 5: Format the number for display
Another set of simple rules can be used to format the number with the requisite +44 or 0 added at the beginning.
The rule for numbers beginning 03 is
formatted as
and for numbers beginning 02 is
formatted as
The full list is quite long. I could copy and paste it all into this thread, but it would be hard to maintain that information in multiple places over time. For the present the complete list can be found at: http://aa-asterisk.org.uk/index.php/Regular_Expressions_for_Validating_and_Formatting_GB_Telephone_Numbers
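Steps 4 and 5 could look something like this in JavaScript. It is a minimal sketch that covers only the 02 and 03 ranges mentioned above, using the usual 2+8 and 3+7 layouts for them; toStored and formatForDisplay are hypothetical helper names, and any other range simply returns null.

```javascript
// Sketch only: storage plus display formatting for the 02 and 03 ranges.
function toStored(nsn) {
  return "44" + nsn; // country code + NSN, digits only
}

const DISPLAY_RULES = [
  /^(2\d)(\d{4})(\d{4})$/,    // 02x numbers: 2+8 layout
  /^(3\d{2})(\d{3})(\d{4})$/, // 03xx numbers: 3+7 layout
];

function formatForDisplay(stored, international = false) {
  const nsn = stored.replace(/^44/, "");
  for (const rule of DISPLAY_RULES) {
    const m = rule.exec(nsn);
    if (m) {
      const prefix = international ? "+44 " : "0";
      return `${prefix}${m[1]} ${m[2]} ${m[3]}`;
    }
  }
  return null; // range not covered by this sketch
}

console.log(formatForDisplay(toStored("2035557788")));       // 020 3555 7788
console.log(formatForDisplay(toStored("2035557788"), true)); // +44 20 3555 7788
console.log(formatForDisplay(toStored("3001234567")));       // 0300 123 4567
```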
Given that people sometimes write their numbers with spaces in random places, you might be better off ignoring the spaces altogether. You could then use a regex as simple as this:
^0(\d ?){10}$
This matches:
But it would also match:
So you may not like it, but it's certainly simpler.
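To see the trade-off in practice, here is a quick JavaScript check. The sample strings are my own illustrations, not the original examples.

```javascript
const LOOSE = /^0(\d ?){10}$/;

const samples = [
  "01234567890",          // solid block
  "01234 567890",         // conventional spacing
  "0121 234 5678",        // different spacing, same digit count
  "01 2 3 4 5 6 7 8 9 0", // spaces almost everywhere
  "0 1234567890",         // space directly after the leading 0
  "0123456789",           // one digit short
];

const results = samples.map((s) => LOOSE.test(s));
samples.forEach((s, i) => console.log(JSON.stringify(s), results[i]));
```

The first four strings match and the last two do not: the pattern only allows a space after each of the ten digits that follow the leading 0, and it insists on exactly eleven digits in total.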
Would this regex do?
Notice how the spaces and dashes are optional and can be part of it. It is also now divided into two capture groups called area_code and tel_no, to break it down and make it easier to extract.
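A sketch of the named-group idea in JavaScript follows. The pattern is a hypothetical stand-in written for illustration; only the group names area_code and tel_no come from the answer.

```javascript
// Sketch only: optional space or hyphen between and inside the two parts.
const UK_NUMBER =
  /^(?<area_code>0\d{2,4})[ -]?(?<tel_no>\d{3,4}[ -]?\d{3,4})$/;

const m = UK_NUMBER.exec("01234 567890");
if (m) {
  console.log(m.groups.area_code); // 01234
  console.log(m.groups.tel_no);    // 567890
}
```

When the number is written as a solid block, a pattern like this has to guess where the area code ends, so the split it reports is not reliable for every range.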
Strip all whitespace and non-numeric characters and then do the test. It'll be much, much easier than trying to account for all the possible options around brackets, spaces, etc.
Try the following:
Starts with 0 or +44 (for international) - I'm sure you could add 0044 if you wanted.
It then has a 1, 2, 3 or 7.
It then has either 8 or 9 digits.
If you want to be even smarter, the following may be a useful reference: http://en.wikipedia.org/wiki/Telephone_numbers_in_the_United_Kingdom
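The rule described above can be sketched in JavaScript like this. The pattern is my reading of the three bullet points, and I've assumed a leading + should survive the stripping step so that +44 can still be recognised; looksLikeUkNumber is a hypothetical name.

```javascript
// Sketch only: strip formatting, keep a leading "+", then apply the rule.
function looksLikeUkNumber(input) {
  // Remove every non-digit except a "+" in first position.
  const cleaned = input.trim().replace(/(?!^\+)\D/g, "");
  // 0 or +44, then 1, 2, 3 or 7, then 8 or 9 more digits.
  return /^(?:0|\+44)[1237]\d{8,9}$/.test(cleaned);
}

console.log(looksLikeUkNumber("020 7946 0958"));    // true
console.log(looksLikeUkNumber("+44 20 7946 0958")); // true
console.log(looksLikeUkNumber("0800 123 4567"));    // false
```

Note that 08 numbers are rejected because the rule only allows 1, 2, 3 or 7 after the prefix.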
It's not a single regex, but there's sample code from Braemoor Software that is simple to follow and fairly thorough.
The JS version is probably easiest to read. It strips out spaces and hyphens (which I realise you said you can't do) then applies a number of positive and negative regexp checks.
Start by stripping the non-numerics, excepting a + as the first character.
(Javascript)
The regex below allows, after the international indicator +, any combination of between 7 and 15 digits (the ITU maximum) UNLESS the code is +44 (UK). Otherwise if the string either begins with +44, +440 or 0, it is followed by 2 or 7 and then by nine of any digit, or it is followed by 1, then any digit except 0, then either seven or eight of any digit. (So 0203 is valid, 0703 is valid but 0103 is not valid). There is currently no such code as 025 (or in London 0205), but those could one day be allocated.
Its primary purpose is to identify a correct starting digit for a non-corporate number, followed by the correct number of digits to follow. It doesn't deduce if the subscriber's local number is 5, 6, 7 or 8 digits. It does not enforce the prohibition on initial '1' or '0' in the subscriber number, about which I can't find any information as to whether those old rules are still enforced. UK phone rules are not enforced on properly formatted international phone numbers from outside the UK.
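A JavaScript sketch of the approach described above follows. Both the stripping step and the pattern are rebuilt from the prose, so they are assumptions and may differ from the original code; normalise and isPlausible are hypothetical names.

```javascript
// Sketch only, rebuilt from the description above.
// Either: "+" then 7 to 15 digits, provided the code is not 44.
// Or: +44, +440 or 0, then 2 or 7 plus nine digits,
//     or 1, a non-zero digit, and seven or eight more digits.
const PHONE =
  /^(?:\+(?!44)\d{7,15}|(?:\+440?|0)(?:[27]\d{9}|1[1-9]\d{7,8}))$/;

function normalise(input) {
  // Drop every non-digit except a "+" in first position.
  return input.trim().replace(/(?!^\+)\D/g, "");
}

function isPlausible(input) {
  return PHONE.test(normalise(input));
}

console.log(isPlausible("0203 555 7788"));       // true
console.log(isPlausible("0703 555 7788"));       // true
console.log(isPlausible("0103 555 7788"));       // false
console.log(isPlausible("+44 (0)20 7946 0958")); // true
console.log(isPlausible("+1 212 555 0123"));     // true
```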
After a long search for valid regexen to cover UK cases, I found that the best way (if you're using client-side JavaScript) to validate UK phone numbers is to use libphonenumber-js along with custom config to reduce bundle size:
If you're using NodeJS, generate UK metadata by running:
then import and use the metadata with libphonenumber-js/core:
CodeSandbox Example