Replacing HTML tag content with sed

Posted on 2024-12-01 12:06:26

I'm trying to replace the content of some HTML tags in an HTML page using sed in a bash script. For some reason I'm not getting the proper result: it's not replacing anything. It has to be something very simple/stupid I'm overlooking. Anyone care to help me out?

HTML to search/replace in:

Unlocked <span id="unlockedCount"></span>/<span id="totalCount"></span> achievements for <span id="totalPoints"></span> points.

sed command used:

cat index.html | sed -i -e "s/\<span id\=\"unlockedCount\"\>([0-9]\{0,\})\<\/span\>/${unlockedCount}/g" index.html 

The point of this is to parse the HTML page and update the figures according to some external data. On the first run the contents of the tags will be empty; after that they will be filled.


EDIT:

I ended up using a combination of the answers which resulted in the following code:

sed -i -e 's|<span id="unlockedCount">\([0-9]\{0,\}\)</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g' index.html

Many thanks to @Sorpigal, @tripleee, @classic for the help!

Comments (3)

可是我不能没有你 2024-12-08 12:06:26

Try this:

sed -i -e "s/\(<span id=\"unlockedCount\">\)\(<\/span>\)/\1${unlockedCount}\2/g" index.html

半透明的墙 2024-12-08 12:06:26

What you say you want to do is not what you're telling sed to do.

You want to insert a number into a tag, or replace the number if one is already present. What you're trying to tell sed to do is to replace a span tag and its contents (empty or a number) with the value of a shell variable.
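
To make the difference concrete, here is a small sketch that runs both kinds of substitution over the sample line from the question. It reads from stdin instead of using -i, so no file is modified, and the value 5 for unlockedCount is only an assumption for illustration.

```shell
#!/bin/sh
# Sample line from the question; the count value is an illustrative assumption.
html='Unlocked <span id="unlockedCount"></span>/<span id="totalCount"></span> achievements for <span id="totalPoints"></span> points.'
unlockedCount=5

# Replacement is only the value: the whole matched <span>...</span> is replaced.
tag_lost=$(printf '%s\n' "$html" |
  sed 's|<span id="unlockedCount">[0-9]*</span>|'"${unlockedCount}"'|')

# Replacement repeats the tags: only the number between them changes.
tag_kept=$(printf '%s\n' "$html" |
  sed 's|<span id="unlockedCount">[0-9]*</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|')

printf '%s\n%s\n' "$tag_lost" "$tag_kept"
```

The first result has lost the span, so a second run would find nothing to match; the second keeps it and can be rerun with new data.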

You're also employing a lot of complex, annoying and error-prone escape sequences which are just not necessary.

Here's what you want:

sed -r -i -e 's|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g' index.html

Note the differences:

  • Added -r to turn on extended expressions without which your capture pattern would not work.
  • Used | instead of / as the delimiter for the substitution so that escaping / would not be necessary.
  • Single-quoted the sed expression so that escaping things inside it from the shell would not be necessary.
  • Included the matched span tag in the replacement section so that it would not get deleted.
  • In order to expand the unlockedCount variable, closed the single-quoted expression, then later re-opened it.
  • Omitted cat | which was useless here.
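
Put together, the points above might look like the following sketch, which updates a throwaway copy of the page twice: once while the span is empty and once after it has been filled. The helper name set_count, the temporary file and the values 5 and 12 are assumptions for illustration. It uses -E (the more portable spelling of GNU's -r) and writes through a temp file rather than -i, because the -i syntax differs between GNU and BSD sed.

```shell
#!/bin/sh
# set_count is a hypothetical helper, not part of the original answer.
# Usage: set_count FILE VALUE
set_count() {
  # Single-quoted sed script, closed and reopened around the double-quoted value.
  sed -E 's|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"$2"'</span>|g' \
    "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

page=$(mktemp)
printf '%s\n' 'Unlocked <span id="unlockedCount"></span>/<span id="totalCount"></span> achievements.' > "$page"

set_count "$page" 5      # first run: the span is still empty
first_run=$(cat "$page")
set_count "$page" 12     # later run: the span already holds a number
second_run=$(cat "$page")

printf '%s\n%s\n' "$first_run" "$second_run"
rm -f "$page"
```

This only behaves as shown for plain numeric values; a value containing |, & or a backslash would have to be escaped before being spliced into the sed script.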

I also used double quotes around the shell variable expansion because this is good practice, although it is not really necessary if the value contains no spaces.

It was not, strictly speaking, necessary for me to add -r. Plain old sed will work if you say \([0-9]\{0,\}\), but the idea here was to simplify.
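
As a quick check of that equivalence, the sketch below applies the basic and the extended spelling of the same pattern to one line, and then the extended spelling without the flag. The input line and the bracketed replacement are made up for illustration, and -E is used here as the portable equivalent of -r.

```shell
#!/bin/sh
line='<span id="unlockedCount">42</span>'

# Basic regex (sed's default): grouping and interval braces need backslashes.
bre=$(printf '%s\n' "$line" |
  sed 's|<span id="unlockedCount">\([0-9]\{0,\}\)</span>|[\1]|')

# Extended regex (-E, or -r in GNU sed): the same operators are written bare.
ere=$(printf '%s\n' "$line" |
  sed -E 's|<span id="unlockedCount">([0-9]{0,})</span>|[\1]|')

# Bare parentheses without -E are literal characters, so nothing matches.
literal=$(printf '%s\n' "$line" |
  sed 's|<span id="unlockedCount">([0-9]{0,})</span>|X|')

printf '%s\n%s\n%s\n' "$bre" "$ere" "$literal"
```

The first two commands capture the same digits; the third leaves the line untouched, which is the "not replacing anything" symptom from the question.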

2024-12-08 12:06:26
sed -i -e 's%<span id="unlockedCount">([0-9]*)</span>%'"${unlockedCount}%g" index.html

I removed the Useless Use of Cat, took out a bunch of unnecessary backslashes, added single quotes around the regex to protect it from shell expansion, and fixed the repetition operator. You might still need to backslash the grouping parentheses; my sed, at least, wants \(...\).

Note the use of single and double quotes next to each other. Single quotes protect against shell expansion, so you can't use them around "${unlockedCount}" where you do want the shell to interpolate the variable.
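
One way to see what the adjacent quotes do is to build the sed script as a plain string and print it instead of running it. This is only a sketch: the value 7 and the variable names script and unexpanded are assumptions for illustration.

```shell
#!/bin/sh
unlockedCount=7   # illustrative value

# Adjacent quoted strings are joined into a single word before sed sees them.
script='s%<span id="unlockedCount">[0-9]*</span>%'"${unlockedCount}"'%g'

# Entirely single-quoted: the variable reference stays literal.
unexpanded='s%<span id="unlockedCount">[0-9]*</span>%${unlockedCount}%g'

printf '%s\n%s\n' "$script" "$unexpanded"
```

The first string contains 7 where the variable was; the second still contains the text ${unlockedCount}, which sed would insert verbatim.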
