Updating HTML with BeautifulSoup

Posted on 2025-01-24 03:53:31

    import re

    from bs4 import BeautifulSoup

    with open("test.html") as fp:
        soup = BeautifulSoup(fp, "html.parser")
        table = soup.find("table", {"class": "wrapped relative-table confluenceTable"})

        for row in table.findAll("tr"):
            # First text node in the row that contains "CTX"
            cells = row.find(text=re.compile("CTX.*"))
            print(cells)
            # First text node in the row that contains "ES"
            cell2s = row.find(text=re.compile("ES.*"))
            cell2s.string.replace_with("ABC50")

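Below is the HTML code. At the top is the table tag, then the tbody tag, followed by two tr tags. I need to update the ESW2 value in the td cells of both rows.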

 <table class="wrapped relative-table confluenceTable" style="width: <tbody> <tr>
              <td class="confluenceTd">angular/cli</td>
              <td class="confluenceTd"><a href="http://angular.io" class="external-link" rel="nofollow">angular.io</a></td>
               <td colspan="1" class="confluenceTd"><br></td>
               <td class="confluenceTd">FOSS</td>
               <td colspan="1" class="confluenceTd">'11.2.14</td>
               <td class="confluenceTd">Bazaar + EVMS 22.0&#xa0;</td>
               <td class="confluenceTd"><a href="https://github.com" class="external-link" rel="nofollow">14/CTX1026329</a>&#xa0;R1A</td>
                <td colspan="1" class="confluenceTd"><br></td>
                <td class="confluenceTd">ESW2</td>
                <td colspan="1" class="confluenceTd"><br></td>
                <td class="confluenceTd"><br></td>
</tr> 
<tr>
                <td class="confluenceTd">angular/cli</td>
                <td class="confluenceTd"><a href="http://angular.io" class="external-link" rel="nofollow">angular.io</a></td>
                <td colspan="1" class="confluenceTd"><br></td>
                <td class="confluenceTd">FOSS</td>
                <td colspan="1" class="confluenceTd">'11.2.14</td>
                <td class="confluenceTd">Bazaar + EVMS 22.0&#xa0;</td>
                <td class="confluenceTd"><a href="https://bazaar.internal.ericsson.com/b-view-component.php?componentid=988161" class="external-link" rel="nofollow">14/CTX1026329</a>&#xa0;R1A</td>
                <td colspan="1" class="confluenceTd"><br></td>
                <td class="confluenceTd">ESW2</td>
                <td colspan="1" class="confluenceTd"><br></td>
</tr>
</tbody>

Comments (1)

若无相欠,怎会相见 2025-01-31 03:53:31

Use .find_all() to find all the <td> tags whose text matches "ES.*", then call .string.replace_with() on each of them.

import re

from bs4 import BeautifulSoup

with open("test.html", "r") as fp:
    html = fp.read()

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", {"class": "wrapped relative-table confluenceTable"})

# string= is the current name of the older text= argument (bs4 4.4.0+)
for each in table.find_all("td", string=re.compile("ES.*")):
    each.string.replace_with("ESW3")

print(table)

Output:

<table class="wrapped relative-table confluenceTable" confluencetd"="" style="width: <tbody> <tr>
              <td class=">angular/cli
<td class="confluenceTd"><a class="external-link" href="http://angular.io" rel="nofollow">angular.io</a></td>
<td class="confluenceTd" colspan="1"><br/></td>
<td class="confluenceTd">FOSS</td>
<td class="confluenceTd" colspan="1">'11.2.14</td>
<td class="confluenceTd">Bazaar + EVMS 22.0 </td>
<td class="confluenceTd"><a class="external-link" href="https://github.com" rel="nofollow">14/CTX1026329</a> R1A</td>
<td class="confluenceTd" colspan="1"><br/></td>
<td class="confluenceTd">ESW3</td>
<td class="confluenceTd" colspan="1"><br/></td>
<td class="confluenceTd"><br/></td>

<tr>
<td class="confluenceTd">angular/cli</td>
<td class="confluenceTd"><a class="external-link" href="http://angular.io" rel="nofollow">angular.io</a></td>
<td class="confluenceTd" colspan="1"><br/></td>
<td class="confluenceTd">FOSS</td>
<td class="confluenceTd" colspan="1">'11.2.14</td>
<td class="confluenceTd">Bazaar + EVMS 22.0 </td>
<td class="confluenceTd"><a class="external-link" href="https://bazaar.internal.ericsson.com/b-view-component.php?componentid=988161" rel="nofollow">14/CTX1026329</a> R1A</td>
<td class="confluenceTd" colspan="1"><br/></td>
<td class="confluenceTd">ESW3</td>
<td class="confluenceTd" colspan="1"><br/></td>
</tr>
</table>
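
If you want this as a reusable step, here is a minimal sketch under a few assumptions. The helper name replace_cell_text and the inline sample are hypothetical, and the sample is a cut-down, well-formed version of the table from the question. It uses fullmatch, so only cells whose entire text matches are changed, which is stricter than the search-style matching in the snippet above. Cells with no single text child (for example the ones holding only <br>) are skipped.

```python
import re

from bs4 import BeautifulSoup


def replace_cell_text(html, pattern, new_text, table_class="confluenceTable"):
    """Return (updated_html, count) after replacing the text of matching <td> cells.

    Hypothetical helper for illustration; it is not part of BeautifulSoup.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=table_class)
    if table is None:
        return html, 0  # no such table: leave the input untouched

    regex = re.compile(pattern)
    count = 0
    for cell in table.find_all("td"):
        # cell.string is None when the cell has no single text child
        # (for example a link followed by text, or only a <br>), so skip those.
        if cell.string is not None and regex.fullmatch(cell.string.strip()):
            cell.string.replace_with(new_text)
            count += 1
    return str(soup), count


# Cut-down sample, assumed for this sketch only
sample = """
<table class="wrapped relative-table confluenceTable"><tbody>
<tr><td class="confluenceTd">FOSS</td><td class="confluenceTd">ESW2</td></tr>
<tr><td class="confluenceTd">FOSS</td><td class="confluenceTd">ESW2</td></tr>
</tbody></table>
"""

updated, changed = replace_cell_text(sample, r"ES.*", "ESW3")
print(changed)  # prints 2
print(updated)
```

Nothing above touches the file on disk. To persist the change you would still need to write the returned string back yourself, for example with open("test.html", "w", encoding="utf-8").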
