Get a substring from an item in Yahoo Pipes

Posted on 2024-10-18 05:44:16

I have the following situation:

item.
   content => "This is a 48593 test"
   title => "the title"

item.
   content => "This is a 48593 test 3255252"
   title => "the title"

item.
   content => "This 35542 is a 48593 test"
   title => "the title"

item.
   content => "i havent exactly 5 digits 34567654"
   title => "the title"

These are my current items in the Pipes console.

Now I want to replace "content" with the last match of a number that has exactly 5 digits.
Wanted result:

item.
   content => "48593"
   title => "the title"

item.
   content => "48593"
   title => "the title"

item.
   content => "48593"
   title => "the title"

item.
   content => ""
   title => "the title"

Is there a way to do this in Pipes 2?

Please comment if something is unclear.

Comments (2)

一场春暖 2024-10-25 05:44:16

Use the regex module like this:

In item.content replace (.*) with X $1

In item.content replace .*\b(\d{5})\b.* with $1

In item.content replace X .* with nothing (leave field empty)

Here's an example pipe

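Some explanations:

  • \d{5} matches exactly five digits.
  • \b matches a word boundary on each side, so a five-digit run inside a longer number is not matched.
  • The X added at the beginning marks every string. Where the regular expression matches, the whole string (marker included) is replaced by the number; where it does not match, the marker survives and the last step deletes the string.
  • Finding the last number rather than the first is the default behavior, because the leading .* is greedy.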
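
The three replacement steps above can be tried outside Pipes with a small Python sketch. This is only an emulation using Python's `re.sub`, not the Pipes Regex module itself; the function name `last_five_digit_number` is made up for illustration, and the `^`/`$` anchors are an assumption added so that each substitution applies exactly once to the whole string.

```python
import re


def last_five_digit_number(content: str) -> str:
    """Emulate the three Regex-module rules on a single content string."""
    # Step 1: prefix every string with the marker "X ".
    marked = re.sub(r"^(.*)$", r"X \1", content, count=1)
    # Step 2: if a standalone five-digit number exists, replace the whole
    # string (marker included) with it. The greedy leading .* makes the
    # regex settle on the last such number.
    extracted = re.sub(r"^.*\b(\d{5})\b.*$", r"\1", marked, count=1)
    # Step 3: strings that still carry the marker had no match; empty them.
    return re.sub(r"^X .*$", "", extracted, count=1)


if __name__ == "__main__":
    samples = (
        "This is a 48593 test",
        "This is a 48593 test 3255252",
        "This 35542 is a 48593 test",
        "i havent exactly 5 digits 34567654",
    )
    for text in samples:
        print(repr(last_five_digit_number(text)))
```

The sketch assumes the content has no line breaks; multi-line content would need the `re.DOTALL` flag.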
记忆之渊 2024-10-25 05:44:16

Sorry, I don't know anything other than Python.

But since your problem interested me, and regexes are more or less the same in all languages, I propose my solution in Python:

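import re

# Group 1 captures the last run of exactly five digits (no digit on either side).
# The \Z alternative lets the search succeed, with group 1 unset, when no such run exists.
pat = re.compile(r"(?:.*((?<!\d)(?:\d{5})(?!\d))|\Z).*")

gh = ("This is a 48593 test",
      "This is a 48593 test 3255252",
      "This 35542 is a 48593 test",
      "i havent exactly 5 digits 34567654")

for x in gh:
    print(x)
    print('AAA' + pat.search(x).groups("")[0] + 'ZZZ')
    print()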

Results:

This is a 48593 test
AAA48593ZZZ

This is a 48593 test 3255252
AAA48593ZZZ

This 35542 is a 48593 test
AAA48593ZZZ

i havent exactly 5 digits 34567654
AAAZZZ

The 'AAA' and 'ZZZ' serve no purpose other than to show that the 4th result is "" (an empty string).

The "" in groups("") is the default value returned for a group that did not take part in the match.

Otherwise the 4th result would be None:

import re

pat = re.compile(r"(?:.*((?<!\d)(?:\d{5})(?!\d))|\Z).*")

gh = ("This is a 48593 test",
      "This is a 48593 test 3255252",
      "This 35542 is a 48593 test",
      "i havent exactly 5 digits 34567654")

for x in gh:
    print(x)
    # Without a default, an unset group is reported as None
    print(pat.search(x).groups()[0])
    print()

results in

This is a 48593 test
48593

This is a 48593 test 3255252
48593

This 35542 is a 48593 test
48593

i havent exactly 5 digits 34567654
None
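
For comparison, here is a shorter Python sketch of the same idea. It is a different technique from the single-pattern search above: it collects every standalone five-digit run with `re.findall` and keeps the last one. The names `FIVE_DIGITS` and `last_five_digit_run` are made up for illustration.

```python
import re

# A run of exactly five digits: no digit immediately before or after it.
FIVE_DIGITS = re.compile(r"(?<!\d)\d{5}(?!\d)")


def last_five_digit_run(text: str) -> str:
    """Return the last standalone five-digit run in text, or "" if none."""
    matches = FIVE_DIGITS.findall(text)
    return matches[-1] if matches else ""


if __name__ == "__main__":
    for sample in ("This 35542 is a 48593 test",
                   "i havent exactly 5 digits 34567654"):
        print(repr(last_five_digit_run(sample)))
```

Unlike the single-pattern version, this does not rely on greedy matching to pick the last number; the choice is made explicitly with `matches[-1]`.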