Unable to retrieve href attribute using Python and Selenium
I'm very new to this and have spent hours trying various methods I've read here. Apologies if I'm making some silly mistake.
I want to create a database of my LEGO sets, pulling images and info from brickset.com.
I'm using:
anchors = driver.find_elements_by_xpath('//*[@id="ui-tabs-2"]/ul/li[1]/a')
anchors = [a.get_attribute('href') for a in anchors]
print (anchors) returns:
anchors = driver.find_elements_by_xpath('//*[@id="ui-tabs-2"]/ul/li[1]/a')
What I'm trying to target:
<div id="ui-tabs-2" class="ui-tabs-panel ui-widget-content ui-corner-bottom" aria-live="polite" aria-labelledby="ui-id-4" role="tabpanel" aria-expanded="true" aria-hidden="false" style="display: block;">
<ul class="moreimages">
<li>
<a href="https://images.brickset.com/sets/AdditionalImages/21054-1/21054_alt10.jpg" class="highslide plain " onclick="return hs.expand(this)">
<img src="https://images.brickset.com/sets/AdditionalImages/21054-1/tn_21054_alt10_jpg.jpg" title="" onerror="this.src='/assets/images/spacer2.png'" loading="lazy">
</a><div class="highslide-caption">
I'm losing my mind trying to figure this out.
Update
Still not getting the href attributes. To add more detail, I'm trying to get the images under the "images" tab on this URL:
https://brickset.com/sets/21330-1/Home-Alone
Here is the problematic code:
anchors = driver.find_elements(By.XPATH, '//*[@id="ui-tabs-2"]/ul/li/a')
links = [anchors.get_attribute('href') for a in anchors]
print('Found ' + str(len(anchors)) + ' links to images')
I've also tried:
#anchors = driver.find_elements_by_css_selector("a[href*='21330']")
This only returned one href, even though there should be about a dozen.
Thank you all for the assistance!
Comments (3)
You shouldn't be using the same name for multiple variables.
As per the first line of code, anchors is the list of WebElements. Ideally, to create another list with the href attributes you should use another name, e.g. hrefs, and call get_attribute('href') on each element a rather than on the list anchors. The same can be written as a list comprehension in a single line.
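A minimal sketch of the renaming described above, not the answerer's exact code. The helper names extract_hrefs and image_links are hypothetical, the XPath is the one from the question, and Selenium is imported lazily inside image_links (assuming Selenium 4 is installed) so the list-building step can be tried without a browser.

```python
def extract_hrefs(anchors):
    """Return the href of every element in `anchors` as a new list."""
    # get_attribute is called on each element `a`, never on the list itself.
    return [a.get_attribute("href") for a in anchors]


def image_links(driver, xpath='//*[@id="ui-tabs-2"]/ul/li/a'):
    """Find the anchors with Selenium and return their hrefs."""
    # Assumes Selenium 4 is installed; imported here so that
    # extract_hrefs stays usable without it.
    from selenium.webdriver.common.by import By

    anchors = driver.find_elements(By.XPATH, xpath)  # list of WebElements
    hrefs = extract_hrefs(anchors)                   # list of strings
    print("Found " + str(len(hrefs)) + " links to images")
    return hrefs
```

With a live session, this would be called as hrefs = image_links(driver) once the page has been loaded with driver.get(...).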
First thing, driver.find_elements_by_xpath is deprecated; use driver.find_elements(By.XPATH, 'locator') instead.
Now, if you'd like to get the hrefs of all the links on the page, drop the index from the XPath. Notice that I'm not using [1] to get a single element, but rather all elements.
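To see the effect of the [1] index in isolation, here is a small sketch that runs both XPath shapes against a made-up fragment using the standard library's xml.etree.ElementTree. That module supports only a subset of XPath, which is why the expressions start with .// and the fragment is well-formed XML. The fragment, its example.com URLs, and the variable names are assumptions for illustration; in Selenium the equivalent expressions would be passed to driver.find_elements(By.XPATH, ...).

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the "images" tab markup.
FRAGMENT = """
<html><body>
  <div id="ui-tabs-2">
    <ul class="moreimages">
      <li><a href="https://example.com/img1.jpg">1</a></li>
      <li><a href="https://example.com/img2.jpg">2</a></li>
      <li><a href="https://example.com/img3.jpg">3</a></li>
    </ul>
  </div>
</body></html>
"""

root = ET.fromstring(FRAGMENT.strip())

# li[1] restricts the match to the first <li>, so only one anchor comes back.
first_only = root.findall(".//*[@id='ui-tabs-2']/ul/li[1]/a")

# Without the index, every <li> under the list is matched.
all_anchors = root.findall(".//*[@id='ui-tabs-2']/ul/li/a")

first_hrefs = [a.get("href") for a in first_only]
all_hrefs = [a.get("href") for a in all_anchors]

print(len(first_hrefs), len(all_hrefs))
```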
You might want to try this.
NOTE: I'm not using selenium here. This should output:
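The code and output for this answer are not included above. As a rough sketch of what a Selenium-free approach might look like, the example below parses an HTML string with the standard library's html.parser and collects the hrefs inside ul class="moreimages". It assumes the page's HTML has already been obtained as a string; how it is fetched, and whether the image list is present in the initial response, is not covered. The class and function names are hypothetical, and the sample fragment is adapted from the question with closing tags added.

```python
from html.parser import HTMLParser


class MoreImagesParser(HTMLParser):
    """Collect hrefs of <a> tags found inside <ul class="moreimages">."""

    def __init__(self):
        super().__init__()
        self._ul_depth = 0  # greater than 0 while inside the target list
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "ul":
            classes = (attributes.get("class") or "").split()
            # Enter the target list, or track a nested list inside it.
            if self._ul_depth or "moreimages" in classes:
                self._ul_depth += 1
        elif tag == "a" and self._ul_depth and attributes.get("href"):
            self.hrefs.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "ul" and self._ul_depth:
            self._ul_depth -= 1


def extract_image_links(html):
    """Return the image links from an HTML string, in document order."""
    parser = MoreImagesParser()
    parser.feed(html)
    return parser.hrefs


# Fragment adapted from the question; the closing tags and the first
# (outside) link are added here for illustration.
SAMPLE = """
<a href="https://brickset.com/sets">not an image link</a>
<ul class="moreimages">
<li>
<a href="https://images.brickset.com/sets/AdditionalImages/21054-1/21054_alt10.jpg" class="highslide plain ">
<img src="https://images.brickset.com/sets/AdditionalImages/21054-1/tn_21054_alt10_jpg.jpg" title="">
</a><div class="highslide-caption"></div>
</li>
</ul>
"""

links = extract_image_links(SAMPLE)
print(links)
```

Only anchors inside the moreimages list are kept, so navigation links elsewhere on the page are ignored.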