
HTML Python lxml

How can I use lxml to return the href attribute at an XPath built from a string?

Posted 2025-02-09 18:03:24 · 1156 characters · 2 views · 0 comments


I have working code that prints the text of the elements matched by

    '//*[@id="all_TorontoBlueJayspitching"]/div/table/tbody/tr/th/a/text()'

from the site https://www.baseball-reference.com/boxes/CHA/CHA202206200.shtml

using this script:

    import requests
    from lxml import html

    boxScore = "CHA/CHA202206200"
    url = "https://www.baseball-reference.com/boxes/" + boxScore + ".shtml"
    page = requests.get(url)

    # Drop every line that contains an HTML comment marker before parsing.
    tree = html.fromstring(b''.join(line for line in page.content.splitlines() if b'<!--' not in line and b'-->' not in line))

    # Team names from the scorebox, e.g. "Toronto Blue Jays".
    getTeams = tree.xpath('//*[@class="scorebox"]/div/div/strong/a/text()')

    for team in getTeams:
        team = team.replace(" ", "")
        stringy = '"all_' + team + 'pitching"'
        stringx = '//*[@id=' + stringy + ']/div/table/tbody/tr/th/a/text()'

        tambellini = tree.xpath(stringx)
        print(tambellini)

The problem is that I do not want to print this text; I want to print one of the paths, that is, the link target. In other words, I am trying to get to

    '//*[@id="all_TorontoBlueJayspitching"]/div/table/tbody/tr/th/a'

and then read the href value of that a element (which in this case is href="/players/b/berrijo01.shtml").

Any guidance here would be helpful. I know how to print an element's text, but I don't know how to access the path itself as a variable. Thank you.
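As a side note on how the script builds its XPath, the id does not have to be spliced into the expression with hand-written quotes; lxml accepts XPath variables as keyword arguments to xpath(). The sketch below is only an illustration under assumptions: the inline SAMPLE markup, the "all_ChicagoWhiteSoxpitching" id, and the helper name pitching_hrefs are hypothetical stand-ins, not the real page or part of the original script.

```python
from lxml import html

# Hypothetical, heavily simplified stand-in for the box-score page.
SAMPLE = """
<div>
  <div id="all_TorontoBlueJayspitching">
    <a href="/players/b/berrijo01.shtml">Pitcher A</a>
  </div>
  <div id="all_ChicagoWhiteSoxpitching">
    <a href="/players/l/lynnla01.shtml">Pitcher B</a>
  </div>
</div>
"""

tree = html.fromstring(SAMPLE)


def pitching_hrefs(tree, team):
    """Return the link targets inside the pitching block for one team.

    `team` is the display name, e.g. "Toronto Blue Jays" (assumed format).
    """
    table_id = "all_" + team.replace(" ", "") + "pitching"
    # $table_id is an XPath variable; lxml fills it from the keyword argument,
    # so no manual quoting of the id is needed.
    matches = tree.xpath('//*[@id=$table_id]//a/@href', table_id=table_id)
    return [str(m) for m in matches]


for team in ("Toronto Blue Jays", "Chicago White Sox"):
    print(team, pitching_hrefs(tree, team))
```

Passing the id as a variable avoids the missing-quote mistake that is easy to make when the expression is assembled by string concatenation.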


优雅的叶子 2025-02-16 18:03:24


Change stringx to

    stringx = '//*[@id=' + stringy + ']/div/table/tbody/tr/th/a/@href'

This should output

    [
      '/players/l/lynnla01.shtml',
      '/players/l/lopezre01.shtml',
      '/players/g/graveke01.shtml',
      '/players/k/kellyjo05.shtml'
    ]
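The difference between selecting text(), the a elements themselves, and @href can be checked offline. The following is a minimal sketch, assuming a small inline fragment in place of the real page; SAMPLE, the link texts, and the example01 player id are hypothetical, while the page URL and the berrijo01 href come from the question.

```python
from urllib.parse import urljoin

from lxml import html

# Hypothetical fragment that mimics the nesting used in the question's XPath.
SAMPLE = """
<div id="all_TorontoBlueJayspitching">
  <div>
    <table>
      <tbody>
        <tr><th><a href="/players/b/berrijo01.shtml">Pitcher A</a></th></tr>
        <tr><th><a href="/players/x/example01.shtml">Pitcher B</a></th></tr>
      </tbody>
    </table>
  </div>
</div>
"""

PAGE_URL = "https://www.baseball-reference.com/boxes/CHA/CHA202206200.shtml"
BASE_XPATH = '//*[@id="all_TorontoBlueJayspitching"]/div/table/tbody/tr/th/a'

tree = html.fromstring(SAMPLE)

# Option 1: let XPath return the attribute values directly.
hrefs = [str(h) for h in tree.xpath(BASE_XPATH + "/@href")]

# Option 2: select the <a> elements, then read text and attribute in Python.
links = tree.xpath(BASE_XPATH)
pairs = [(a.text, a.get("href")) for a in links]

# The hrefs are site-relative; urljoin resolves them against the page URL.
absolute = [urljoin(PAGE_URL, h) for h in hrefs]

print(hrefs)
print(pairs)
print(absolute)
```

Option 2 is convenient when both the link text and its target are needed for the same row, since each element carries both.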


About the author: ゃ人海孤独症
