Python, regular expression postcode search

Posted on 2024-07-10 08:18:32

I am trying to use regular expressions to find a UK postcode within a string.

I have got the regular expression working inside RegexBuddy, see below:

\b[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}\b

I have a bunch of addresses and want to grab the postcode from them, example below:

123 Some Road Name
Town, City
County
PA23 6NH

How would I go about this in Python? I am aware of the re module for Python but I am struggling to get it working.

Cheers

Eef

Comments (4)

故事和酒 2024-07-17 08:18:32

To test your pattern, I repeated your address three times with the postcodes PA23 6NH, PA2 6NH and PA2Q 6NH, then ran both your regex and the one from Wikipedia against it:

import re

# Three copies of the sample address, each with a different postcode
s = ("123 Some Road Name\nTown, City\nCounty\nPA23 6NH\n"
     "123 Some Road Name\nTown, City\nCounty\nPA2 6NH\n"
     "123 Some Road Name\nTown, City\nCounty\nPA2Q 6NH")

# custom regex from the question
print(re.findall(r'\b[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}\b', s))

# regex from http://en.wikipedia.org/wiki/UK_postcodes#Validation
print(re.findall(r'[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}', s))

The result is:

['PA23 6NH', 'PA2 6NH', 'PA2Q 6NH']
['PA23 6NH', 'PA2 6NH', 'PA2Q 6NH']

Both regexes give the same result on this input.
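The two patterns agree on the test string above, but they are not equivalent in general. Here is a minimal sketch of one difference: the question's pattern leaves C, I, K, M, O and V out of the final two letters, while the looser pattern accepts any letter there. The names QUESTION_RE and WIKIPEDIA_RE and the string "AB1 2CV" are made up for illustration.

```python
import re

# Hypothetical names; the patterns are the two compared in this answer.
QUESTION_RE = re.compile(r"\b[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}\b")
WIKIPEDIA_RE = re.compile(r"[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][A-Z]{2}")

# Made-up test strings. "AB1 2CV" ends in C and V, which the question's
# character class excludes but the looser [A-Z]{2} accepts.
candidates = ["PA23 6NH", "AB1 2CV"]

for text in candidates:
    print(text, QUESTION_RE.findall(text), WIKIPEDIA_RE.findall(text))
```

So which one you want probably depends on whether you are validating postcodes or just pulling out anything postcode-shaped.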

久随 2024-07-17 08:18:32

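Try:

import re
re.findall(r"[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}", x)

where x is the string holding your address. You don't need the \b for addresses like your example; it only matters when the postcode-like text sits inside a longer run of letters or digits.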
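For what it's worth, here is a small sketch of when \b changes the outcome. The names CORE, WITHOUT_BOUNDARY and WITH_BOUNDARY and the string "REFXPA23 6NHZ" are invented for the example; the pattern itself is the one from the question.

```python
import re

# Hypothetical names; CORE is the question's pattern without the \b anchors.
CORE = r"[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}"
WITHOUT_BOUNDARY = re.compile(CORE)
WITH_BOUNDARY = re.compile(r"\b" + CORE + r"\b")

clean = "County\nPA23 6NH"
embedded = "REFXPA23 6NHZ"  # made-up string: postcode-like text inside longer tokens

# On a clean address both variants find the postcode.
print(WITHOUT_BOUNDARY.findall(clean))
print(WITH_BOUNDARY.findall(clean))

# Inside longer tokens, only the variant without \b reports a match.
print(WITHOUT_BOUNDARY.findall(embedded))
print(WITH_BOUNDARY.findall(embedded))
```

If your addresses are always cleanly separated by spaces or newlines, either variant should behave the same.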

破晓 2024-07-17 08:18:32
#!/usr/bin/env python

import re

ADDRESS="""123 Some Road Name
Town, City
County
PA23 6NH"""

reobj = re.compile(r'(\b[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}\b)')
matchobj = reobj.search(ADDRESS)
if matchobj:
    print(matchobj.group(1))

Example output:

[user@host]$ python uk_postcode.py 
PA23 6NH
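Since the question mentions a bunch of addresses, the same compiled pattern could be wrapped in a small function and applied to each one. This is only a sketch: extract_postcode, POSTCODE_RE and the second address are hypothetical, and returning None for a missing postcode is just one possible choice.

```python
import re

# Compile once and reuse for every address (pattern taken from the question).
POSTCODE_RE = re.compile(r"\b[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][ABD-HJLNP-UW-Z]{2}\b")


def extract_postcode(address):
    """Return the first postcode-like match in address, or None if there is none."""
    match = POSTCODE_RE.search(address)
    return match.group(0) if match else None


addresses = [
    "123 Some Road Name\nTown, City\nCounty\nPA23 6NH",
    "45 Another Street\nTown, City\nCounty",  # made-up address with no postcode
]

results = [extract_postcode(a) for a in addresses]
print(results)
```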
烟柳画桥 2024-07-17 08:18:32

Here is a solution that also handles postcodes written like PA236NH, PA2Q6NH, pa23 6nh or pa2q6NH:

import re

def get_postcode(text):
    # Find tokens shaped like either half of a postcode: letters + digits
    # (+ optional letter), or digits + letters
    pattern = r'\b[a-zA-Z]+\d+[a-zA-Z]?\b|\b\d+[a-zA-Z]+\b'
    postcode = re.findall(pattern, text.strip())

    # The postcode may be written without a space
    if len(postcode) < 2:
        # Match it as a single token: letters, digits, optional letter, digits, letters
        postcode = re.findall(r'\b[a-zA-Z]+\d+[a-zA-Z]?\d+[a-zA-Z]+\b', text.strip())

    postcode = ' '.join(postcode)
    return postcode
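An alternative sketch, not the approach used in this answer: keep the character classes from the question's pattern, make the space optional, match case-insensitively, and then rewrite each hit in upper case with a single space. LOOSE_POSTCODE_RE and normalise_postcodes are hypothetical names, and the optional space and the case-insensitive flag are assumptions layered on top of the original pattern.

```python
import re

# Question's pattern split into two groups, with the space made optional
# and matching made case-insensitive (both are assumptions for this sketch).
LOOSE_POSTCODE_RE = re.compile(
    r"\b([A-Z]{1,2}[0-9][A-Z0-9]?) ?([0-9][ABD-HJLNP-UW-Z]{2})\b",
    re.IGNORECASE,
)


def normalise_postcodes(text):
    """Return every postcode-like match, upper-cased and with one space."""
    return [
        f"{m.group(1).upper()} {m.group(2).upper()}"
        for m in LOOSE_POSTCODE_RE.finditer(text)
    ]


samples = ["PA236NH", "PA2Q6NH", "pa23 6nh", "pa2q6NH"]
for sample in samples:
    print(sample, normalise_postcodes(sample))
```

Being looser, this will probably pick up more false positives in free text than the strict pattern, so it is worth checking against your real data.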