How to change the inner text of HTML tags without removing the tags

Posted on 2025-01-16 21:15:14

Assume I have a "semi" HTML string like

some_string = "sometext<body>someText<h1>Text</h1>Worldt<p>And some text here<br>Text.</p></body>HereAlsoText"

I need to replace all the text in the string while keeping all HTML tags (including br):

"UPDATED<body>UPDATED<h1>UPDATED</h1>UPDATED<p>UPDATED<br>UPDATED</p></body>UPDATED"

The following code works for most tags, but it cannot handle the <br> tag or the text before and after the HTML (outside of the body tag, in this case):

import re
from bs4 import BeautifulSoup

soup = BeautifulSoup(some_string, "html.parser")

# Find all tags
tags = soup.find_all()
# Replace the text of every tag whose only child is a single string
for tag in tags:
    if tag.string and tag.name != 'br':
        tag.string.replace_with("TEST")

# For tags with mixed content, try to rebuild the contents:
# keep child tags as they are and replace bare text
for parent_tag in tags:
    if not parent_tag.string:
        parent_tag.string = ''.join(
            "TEST" if not re.match(r'<[^>]+>', str(t)) else str(t)
            for t in parent_tag.contents)

Appreciate your help. Thanks!

Answer by 尝蛊, 2025-01-23 21:15:14

Keep it simpler: just select all the text nodes and replace their text, as you have already tried in your example:

# string=True matches every text node (older code used the text= alias)
for e in soup.find_all(string=True):
    e.replace_with('UPDATE')
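
For context, a likely reason the attempt in the question loses tags is that assigning to `tag.string` swaps all of a tag's children for one plain text node, so any markup in the assigned value ends up escaped. The snippet below is only a minimal sketch of that behaviour; the input string and the names `demo`, `escaped` and `has_br_tag` are made up for illustration.

```python
from bs4 import BeautifulSoup

demo = BeautifulSoup('<p>a<br>b</p>', 'html.parser')

# Assigning to .string replaces every child of <p> with a single text node,
# so the "<br>" below is stored as literal text, not as a tag.
demo.p.string = 'TEST<br>TEST'

escaped = str(demo)                       # markup characters come out escaped
has_br_tag = demo.find('br') is not None  # the real <br> tag is gone

print(escaped)
print(has_br_tag)
```

The printed markup shows `&lt;br&gt;` inside the paragraph, and no `br` tag is left in the tree. Replacing each text node individually avoids this, because the tags themselves are never touched.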

Example

from bs4 import BeautifulSoup

some_string = 'sometext<body>someText<h1>Text</h1>Worldt<p>And some text here<br>Text.</p></body>HereAlsoText'

soup = BeautifulSoup(some_string, 'html.parser')

# Replace every text node in place; the tags themselves are left untouched
for e in soup.find_all(string=True):
    e.replace_with('UPDATE')

print(soup)

Output

UPDATE<body>UPDATE<h1>UPDATE</h1>UPDATE<p>UPDATE<br/>UPDATE</p></body>UPDATE
