Need regular expression help to modify an XML file

Posted 2024-10-13 03:06:18

I'm trying to modify an XML file which contains elements holding opening times for branches of a business. The file is inconsistent: for some branches an element has just an opening time and a closing time, while for others it has an opening time, a closing time for lunch, a post-lunch opening time and a closing time.

Examples of both types below:

<monday>10.00,17.00</monday>
<monday>09.00,12.30,13.30,17.00</monday>

I want to reformat these strings to a better format such as the ones below:

<monday>
  <open>10.00</open>
  <lunch></lunch>
  <close>17.00</close>
</monday>

<monday>
  <open>09.00</open>
  <lunch>12.30 - 13.30</lunch>
  <close>17.00</close>
</monday>

I've been trying to use BBEdit regular expressions on my Mac to make the changes, but I'm having difficulty. Specifically, I think it's because I'm not sure how to get the regular expression to replace only a subset of the text I tell it to match on. For example, in pseudocode I want the regular expression to do this:

replace <monday>time1,time2</monday>
with <monday><open>time1</open><lunch></lunch><close>time2</close></monday>

replace <monday>time1,time2,time3,time4</monday>
with <monday><open>time1</open><lunch>time2 - time3</lunch><close>time4</close></monday>

I'm not too familiar with regular expressions, so I'm sure I'm making some errors, but so far I've been trying the following:

replace >#+\.#+,#+\.#+<
with ><open>#+\.#+<open><lunch></lunch><close>#+.\#+<

I understand this isn't going to work anyway, because I'm telling the regex to replace the numbers it matches via #+ with the literal string '#+' and so on.

How can I achieve what I want, by regex or other means? And how do I tell the regular expression to use an expression for matching but only replace a subset of the characters it matches?


Comments (1)

我很坚强 2024-10-20 03:06:18

Well, I figured it out quicker than I expected. I used the following find string:

(<[a-z]+day>)([0-9]+\.[0-9]+),([0-9]+\.[0-9]+)(</[a-z]+day>)

...and the following replace string:

\1<open>\2</open><lunch></lunch><close>\3</close>\4

to match lines such as the following:

<monday>10.00,17.00</monday>

which resulted in the following output:

<monday><open>10.00</open><lunch></lunch><close>17.00</close></monday>
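
The find/replace pair above only covers the two-time form. As a rough sketch of how both forms might be handled in one pass, here is the same capture-group idea in Python rather than BBEdit. Python is my own choice here (nothing in the thread uses it), and the names `PATTERN`, `reformat` and `convert` are purely illustrative. It assumes the times always look like digits, a dot, digits, and that each element holds exactly two or four of them.

```python
import re

# Group 1: the day name (monday, tuesday, ...), reused to match the closing tag.
# Group 2: opening time. Groups 3 and 4: optional lunch close / reopen times.
# Group 5: closing time.
PATTERN = re.compile(
    r"<([a-z]+day)>"
    r"(\d+\.\d+),"
    r"(?:(\d+\.\d+),(\d+\.\d+),)?"
    r"(\d+\.\d+)"
    r"</\1>"
)


def reformat(match):
    """Build the replacement text from the captured groups of one match."""
    day, open_time, lunch_start, lunch_end, close_time = match.groups()
    # The lunch groups are None when the element only has two times.
    lunch = f"{lunch_start} - {lunch_end}" if lunch_start else ""
    return (
        f"<{day}>\n"
        f"  <open>{open_time}</open>\n"
        f"  <lunch>{lunch}</lunch>\n"
        f"  <close>{close_time}</close>\n"
        f"</{day}>"
    )


def convert(text):
    """Rewrite every matching day element in text; leave everything else alone."""
    return PATTERN.sub(reformat, text)


if __name__ == "__main__":
    sample = (
        "<monday>10.00,17.00</monday>\n"
        "<monday>09.00,12.30,13.30,17.00</monday>"
    )
    print(convert(sample))
```

If you would rather stay in BBEdit, the four-time form should work the same way as the two-time one with two more capture groups: find `(<[a-z]+day>)([0-9]+\.[0-9]+),([0-9]+\.[0-9]+),([0-9]+\.[0-9]+),([0-9]+\.[0-9]+)(</[a-z]+day>)` and replace with `\1<open>\2</open><lunch>\3 - \4</lunch><close>\5</close>\6`. I haven't run that against the real file, so try it on a copy first.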