Removing a specific tag from the main soup in BeautifulSoup4 (Python)

Posted on 2025-01-19 11:08:56, 948 characters, 2 views, 0 comments

This is what I have tried: see soup.div.decompose(); I also tried soup.elements.div.decompose(). The page uses content from DataTables, and this is my first time using it, so if there is a better way to achieve what I'm doing, please tell me! Thanks in advance!

import bs4

with open('MapPage.html', 'r', encoding="utf8") as f:
    txt = f.read()
    soup = bs4.BeautifulSoup(txt,"html5lib")

elements = soup.find_all('tr')
elements.pop(0)

def DeleteData(msgID):
    for div in elements:
        ID = div.find('a').contents[0]
        if int(msgID)==int(ID):
            soup.div.decompose()
            return
    print('Failed to delete data from', msgID)

I'm hoping I'll then be able to write the soup back to 'MapPage.html'. Instead, the error AttributeError: 'NoneType' object has no attribute 'decompose' is produced.
This is the output when printing div:
(Link to html file)

Comments (1)

小嗲 2025-01-26 11:08:56

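If I understand correctly, you want to decompose() the <tr> whose <a> contains a specific value.

The main issue is that you call soup.div.decompose(), which tries to decompose() the first <div> of the soup object rather than the row held in your loop variable. Because soup.div returns None when the document contains no <div>, the call raises the AttributeError you see.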
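A minimal sketch of why the error occurs, assuming (as in the sample table further down) that the page contains no <div> at all. Attribute access such as soup.div searches the whole document for the first tag of that name and returns None when nothing matches. The markup here is a made-up stand-in, not the asker's file.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup: a table with one row and no <div> anywhere.
html = "<table><tr><td><a href='#'>1</a></td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

first_div = soup.div   # looks up the first <div> in the whole document
print(first_div)       # None, so first_div.decompose() would raise AttributeError

row = soup.find("tr")  # the Tag object for the row itself
row.decompose()        # removes that row from the tree it belongs to
remaining_rows = soup.find_all("tr")
print(remaining_rows)  # []
```

Calling decompose() on the Tag you already hold modifies the soup it came from, so there is no need to reach it through soup again.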

Simply use:

div.decompose()

or, even better, rename the variable to something that is not a tag name:

e.decompose()

Example

from bs4 import BeautifulSoup

html = '''
<html><body>
    <h2>Welcome to our collection of community made maps!</h2>
    <table id="example" class="cell-border" style="width:100%">
        <thead>
            <tr><th>ID</th><th>Author</th><th>Content</th><th>Thumbnail</th><th>Download</th><th>Rating</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">939257309387980851</a></td>
                <td>Matter</td><td>Cervinia Source</td><td><img src="https://media.discordapp.net/attachments/932881912714895390/939257307290796062/unknown.png" alt="Cervina Thumb" width="300" height="auto"></td><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">Download</a></td><td>5</td>
            </tr>
            <tr><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">939257309387980852</a></td><td>Tea</td><td>Chamonix</td><td><img src="https://media.discordapp.net/attachments/932881912714895390/939257307290796062/unknown.png" alt="Cervina Thumb" width="300" height="auto"></td><td><a href="https://discord.com/channels/932741876174454914/932881912714895390/939257309387980851">Download</a></td><td>5</td></tr>
        </tbody>
    </table>
</body></html>
'''
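soup = BeautifulSoup(html, 'html.parser')
# Select only data rows; the header row holds <th> cells, not <td>.
elements = soup.select('tr:has(td)')

def DeleteData(msgID):
    for e in elements:
        # The first <a> in each row contains the message ID as its text.
        ID = e.find('a').contents[0]
        if int(msgID)==int(ID):
            # Remove the matching row from the soup.
            e.decompose()
            return
    # Reached only when no row matched.
    print('Failed to delete data from', msgID)

DeleteData(939257309387980851)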
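The question also mentions writing the soup back to 'MapPage.html' afterwards. A possible sketch of that step follows, under some assumptions: the helper name delete_row, the shortened stand-in markup, and the output path are all hypothetical, and the file is written to the system temp directory so the sketch does not overwrite anything. Converting the soup with str() yields the modified HTML as text.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup

def delete_row(soup, msg_id):
    """Remove the first data row whose first link text equals msg_id.

    Hypothetical helper: returns True if a row was removed, False otherwise.
    """
    for row in soup.select("tr:has(td)"):
        link = row.find("a")
        # Skip rows without a link instead of raising AttributeError.
        if link is not None and link.get_text(strip=True) == str(msg_id):
            row.decompose()
            return True
    return False

# Hypothetical stand-in markup; in the question it would be read from MapPage.html.
html = """
<table>
  <thead><tr><th>ID</th><th>Author</th></tr></thead>
  <tbody>
    <tr><td><a href="#">939257309387980851</a></td><td>Matter</td></tr>
    <tr><td><a href="#">939257309387980852</a></td><td>Tea</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

deleted = delete_row(soup, 939257309387980851)  # row exists
missing = delete_row(soup, 123)                 # no such row

# Assumed output location for this sketch; the question would target 'MapPage.html'.
out_path = Path(tempfile.gettempdir()) / "MapPage_out.html"
out_path.write_text(str(soup), encoding="utf8")
```

Returning a boolean instead of printing lets the caller decide how to report a missing ID, and comparing the link text as a string avoids a ValueError when a row's first link is not numeric.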