Conditional matching in regular expressions

Posted on 2024-12-23 12:33:33 · 268 words · 2 views · 0 comments

Just a note upfront: I'm a bit of a regex newbie. Perhaps a good answer to this question would involve linking me to a resource that explains how these sorts of conditions work :)

Let's say that I have a street name, like 23rd St or 5th St. I'd like to get rid of the trailing "th", "rd", "nd", and "st". How can this be done?

Right now I have the expression: (st|nd|rd|th). The problem with this is that it will also match street names that contain "st", "nd", "rd", or "th". So what I really need is a conditional match that requires at least one digit immediately before the suffix (i.e., it should match in 1st but not in street).

Thank you!

Comments (4)

在梵高的星空下 2024-12-30 12:33:33

It sounds like you just want to match the ordinal suffix (st|nd|rd|th), yes?

If your regex engine supports it, you could use a lookbehind assertion.

/(?<=\d)(st|nd|rd|th)/

That matches (st|nd|rd|th) only if it is preceded by a digit \d. The lookbehind does not consume the digit, so the digit is not part of the match.
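
Since the question seems to be about Ruby, here is a minimal sketch of the lookbehind approach in that language. The method name strip_ordinals is made up for illustration, and the trailing \b and the /i flag are additions that are not part of the pattern above; they are there so the suffix has to end a word and so "23RD" is handled too.

```ruby
# Hypothetical helper: removes an ordinal suffix only when a digit
# comes immediately before it. The digit is checked by the lookbehind
# but is not part of the match, so replacing with '' leaves it alone.
def strip_ordinals(text)
  # \b and /i are assumptions added on top of the original pattern.
  text.gsub(/(?<=\d)(?:st|nd|rd|th)\b/i, '')
end

puts strip_ordinals('23rd St')           # 23 St
puts strip_ordinals('South 2nd Street')  # South 2 Street
puts strip_ordinals('Stand Street')      # Stand Street
```

Because the match contains only the suffix, no capture group or backreference is needed in the replacement.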

小兔几 2024-12-30 12:33:33

What you really want are anchors.

Try and replace globally:

\b(\d+)(?:st|nd|rd|th)\b

with the first group.

Explanation:

  • \b --> matches a position between a word character (digit, letter, or underscore) and a non-word character (anything else), in either order; the start or end of the input next to a word character also counts;
  • (\d+) --> matches one or more digits and captures them in the first group ($1);
  • (?:st|nd|rd|th) --> matches any of st, nd, rd, or th without capturing it ((?:...) is a non-capturing group);
  • \b --> see above.

Demonstration using perl:

$ perl -pe 's/\b(\d+)(?:st|nd|rd|th)\b/$1/g' <<EOF
> Mark, 23rd street, New Hampshire
> I live on the 7th avenue
> No match here...
> azoiu32rdzeriuoiu
> EOF
Mark, 23 street, New Hampshire
I live on the 7 avenue
No match here...
azoiu32rdzeriuoiu
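
The same substitution can be sketched in Ruby, assuming that is the language the question is about. The method name strip_ordinals_anchored is hypothetical; the pattern is the one given in this answer, and '\1' is Ruby's way of referring to the first group in a gsub replacement string.

```ruby
# Hypothetical helper: replaces "<digits><suffix>" with just the digits,
# but only when the whole token is delimited by word boundaries.
def strip_ordinals_anchored(text)
  # Single quotes keep the backslash, so gsub sees the backreference \1.
  text.gsub(/\b(\d+)(?:st|nd|rd|th)\b/, '\1')
end

puts strip_ordinals_anchored('Mark, 23rd street, New Hampshire')
# Mark, 23 street, New Hampshire
puts strip_ordinals_anchored('azoiu32rdzeriuoiu')
# azoiu32rdzeriuoiu
```

The last input is left unchanged because there is no word boundary before the 3 or after the d.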

溺孤伤于心 2024-12-30 12:33:33

Try using this regex:

(\d+)(?:st|nd|rd|th)

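I don't know Ruby. In PHP I would use something like:

preg_replace('/(\d+)(?:st|nd|rd|th)\b/', '$1', 'South 2nd Street');

to remove the suffix. The \b ends the match at the word boundary, so the space after the ordinal is kept and the result is "South 2 Street".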

深海里的那抹蓝 2024-12-30 12:33:33

To remove the ordinal:

 /(\d+)(?:st|nd|rd|th)\b/$1/

You must capture the number so you can replace the match with it. Whether you capture the ordinal suffix as well does not matter unless you want to output it somewhere else.

http://www.regular-expressions.info/javascriptexample.html
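
As a rough illustration of the "output it somewhere else" case, the Ruby sketch below captures both the number and the suffix. The method name split_ordinal and the shape of the returned hash are assumptions made for this example only.

```ruby
# Hypothetical helper: finds the first ordinal in the text and returns
# its parts, or nil when the text contains no ordinal.
def split_ordinal(text)
  m = text.match(/(\d+)(st|nd|rd|th)\b/)
  return nil unless m

  {
    number: m[1].to_i,                          # captured digits
    suffix: m[2],                               # captured ordinal suffix
    stripped: m.pre_match + m[1] + m.post_match # text without the suffix
  }
end

parts = split_ordinal('South 2nd Street')
puts parts[:number]    # 2
puts parts[:suffix]    # nd
puts parts[:stripped]  # South 2 Street
```

If only the stripped text is needed, the second group can be made non-capturing again, as in the pattern above.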
