Extracting international street addresses / phone numbers from free-form text

Posted 2024-07-20 07:30:50

Hey, folks. I'm looking for some regular expressions to help grab street addresses and phone numbers from free-form text (a la Gmail).

Given some text: "John, I went to the store today, and it was awesome! Did you hear that they moved to 500 Green St.? ... Give me a call at +14252425424 when you get a chance."

I'd like to be able to pull out:

500 Green St. (recognized as a street address)

+14252425424 (recognized as a phone number)

What makes this problem easier is that I don't care about parsing text that gets pulled out. That is, I don't care that Green is the name of the road or that 425 is the area code. I just want to grab strings that "look like" addresses or telephone numbers.

Unfortunately, this needs to work internationally, as best as possible.

Anyone have any leads? Thanks!

Comments (3)

篱下浅笙歌 2024-07-27 07:30:50

Phone numbers are easy as long as you have a list of all country codes and number formats. For street addresses I have no idea; the only advice I can give you is to validate each set of words at addressdoctor.com.
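For the phone-number half, a loose "looks like a phone number" pattern might be enough as a first pass before checking against any country-code list. The sketch below is only an illustration: the name `PHONE_CANDIDATE`, the 7 to 15 digit range and the allowed separators are my assumptions, not a validated international format list, and it will also pick up other digit runs such as dates.

```python
import re

# Hypothetical, deliberately loose pattern (an assumption, not a standard):
# an optional "+", then 7 to 15 digits, where each digit may be preceded by
# up to two separator characters (whitespace, dot, hyphen, parentheses).
PHONE_CANDIDATE = re.compile(
    r"""
    (?<![\w+])                   # not glued to a preceding word character or "+"
    \+?                          # optional international prefix
    \(?\d                        # first digit, optionally after "("
    (?:[\s.\-()]{0,2}\d){6,14}   # 6 to 14 further digits with optional separators
    (?!\w)                       # not glued to a following word character
    """,
    re.VERBOSE,
)


def find_phone_candidates(text):
    """Return every substring of text that looks like a phone number."""
    return [m.group(0) for m in PHONE_CANDIDATE.finditer(text)]


if __name__ == "__main__":
    sample = (
        "John, I went to the store today, and it was awesome! Did you hear "
        "that they moved to 500 Green St.? ... Give me a call at "
        "+14252425424 when you get a chance."
    )
    print(find_phone_candidates(sample))
```

The house number 500 is skipped because it has too few digits. Candidates found this way could then be filtered against a country-code list if you have one.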

美人骨 2024-07-27 07:30:50

You can give RecogniContact (-> address-parser.com) a try. It recognizes both postal addresses and phone numbers.

回忆凄美了谁 2024-07-27 07:30:50

Take a look at Chapter 7 of Dive Into Python. It covers both phone numbers and street addresses, and I believe you can use it as a starting point. The international part seems tough. I suggest you build a first draft, try it on several locales, then iterate and improve.
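To make the "first draft, then try several locales" loop concrete, here is a rough sketch. Everything in it is an assumption for illustration: the street-type word list is tiny and English-only, and the sample sentences per locale are made up. The point is the small harness that shows where the draft fails, not the pattern itself.

```python
import re

# First-draft pattern (hypothetical): house number, one to three capitalised
# words, then a street-type word from a small English-only list.
ADDRESS_CANDIDATE = re.compile(
    r"\b\d{1,5}\s+(?:[A-Z][\w'-]*\s+){1,3}"
    r"(?:St|Street|Ave|Avenue|Rd|Road|Blvd|Lane|Drive)\b\.?"
)

# Made-up sample sentences per locale, paired with the string we hope to get.
SAMPLES = {
    "en_US": ("Did you hear that they moved to 500 Green St.? Call me.",
              "500 Green St."),
    "en_GB": ("The office is at 221 Baker Street in London.",
              "221 Baker Street"),
    "de_DE": ("Wir treffen uns in der Hauptstraße 5 in Berlin.",
              "Hauptstraße 5"),
    "fr_FR": ("Rendez-vous au 12 rue de la Paix demain.",
              "12 rue de la Paix"),
}


def find_address_candidates(text):
    """Return every substring of text that looks like a street address."""
    return [m.group(0) for m in ADDRESS_CANDIDATE.finditer(text)]


def evaluate(samples):
    """Map each locale to True if the expected address was extracted."""
    return {
        locale: expected in find_address_candidates(text)
        for locale, (text, expected) in samples.items()
    }


if __name__ == "__main__":
    for locale, ok in evaluate(SAMPLES).items():
        print(locale, "ok" if ok else "MISSED")
```

With these samples the draft finds the two English addresses and misses the German and French ones (street name before the number, lowercase street word after the number), which tells you what the next iteration has to handle.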
