XPath returns nothing

Posted on 2025-02-13 01:15:15

I am trying to scrape the phone number, but my XPath returns nothing. How can I solve this problem? This is the page link: https://aaos22.mapyourshow.com/8_0/exhibitor/exhibitor-details.cfm?exhid=999999999999

import scrapy
from scrapy.http import Request
from scrapy_selenium import SeleniumRequest


class TestSpider(scrapy.Spider):
    name = 'test'

    def start_requests(self):
        # Load the exhibitor gallery through Selenium.
        yield SeleniumRequest(
            url="https://aaos22.mapyourshow.com/8_0/explore/exhibitor-gallery.cfm?featured=false",
            wait_time=3,
            screenshot=True,
            callback=self.parse,
            dont_filter=True
        )

    def parse(self, response):
        # Collect the link to each exhibitor's detail page.
        books = response.xpath("//h3[@class='card-Title\nbreak-word\nf3\nmb1\nmt0']//a//@href").getall()

        for book in books:
            url = response.urljoin(book)
            # Detail pages are fetched with a plain Request (no Selenium).
            yield Request(url, callback=self.parse_book)

    def parse_book(self, response):
        # This is the XPath that returns nothing.
        phone = response.xpath("//li[@class='dib  ml3  mr3'][2]").get()
        print(phone)


夏末染殇 2025-02-20 01:15:17

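Assuming you get the concrete HTML, you could adjust your XPath: select the <ul> by its class, then its last <li>. Because the number is not inside the <span>, you have to select the text node that follows the <span> as its sibling: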

//ul[contains(@class,'showcase-web-phone')]/li[last()]/span/following-sibling::text()[1]
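
To see why the sibling step is needed, here is a minimal sketch run against a made-up HTML snippet. The markup, the "Web:" entry and the phone number are assumptions for illustration only, since the real page source is not shown in this thread. It uses lxml directly (the library underneath Scrapy's selectors, via parsel), so the same expression should behave the same way in response.xpath(...).

```python
from lxml import html

# Hypothetical markup: an assumption, not the real page source.
SAMPLE = """
<html><body>
<ul class="showcase-web-phone list">
  <li class="dib ml3 mr3"><span>Web:</span> example.com</li>
  <li class="dib ml3 mr3"><span>Phone:</span> 555-0100</li>
</ul>
</body></html>
"""

tree = html.document_fromstring(SAMPLE)
base = "//ul[contains(@class,'showcase-web-phone')]/li[last()]/span"

# Selecting the <span> text only yields the label, not the number.
label = tree.xpath(base + "/text()")

# The number is a text node that follows the <span>, so step to that sibling.
number = tree.xpath(base + "/following-sibling::text()[1]")

# lxml returns a list of strings; strip the surrounding whitespace.
phone = number[0].strip() if number else None
print(label, phone)
```

In a spider, the equivalent would be calling .get() on response.xpath(...) with the same expression and stripping the result.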


無心 2025-02-20 01:15:16

If you want to avoid relying on a position index, you can match the <li> by its "Phone:" label instead:

response.xpath("normalize-space(//*[starts-with(@class,'showcase-web-phone')]/li[./*[.='Phone:']]/span/following::text())").get()
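
A small sketch of how this label-based expression behaves, again on made-up markup (the snippets, the helper name extract_phone and the number are assumptions for illustration). It is written with lxml, which Scrapy's selectors build on, and shows two things: the match does not depend on the position of the <li>, and normalize-space() returns an empty string rather than None when no "Phone:" entry exists.

```python
from lxml import html

# Same expression as above, evaluated outside Scrapy.
PHONE_XPATH = (
    "normalize-space(//*[starts-with(@class,'showcase-web-phone')]"
    "/li[./*[.='Phone:']]/span/following::text())"
)

# Hypothetical markup: the phone entry comes first and has messy whitespace.
WITH_PHONE = """
<html><body>
<ul class="showcase-web-phone list">
  <li><span>Phone:</span>
      555-0100
  </li>
  <li><span>Web:</span> example.com</li>
</ul>
</body></html>
"""

# Hypothetical markup without any phone entry.
NO_PHONE = """
<html><body>
<ul class="showcase-web-phone list">
  <li><span>Web:</span> example.com</li>
</ul>
</body></html>
"""


def extract_phone(markup):
    """Return the trimmed phone text, or '' when no 'Phone:' label is found."""
    tree = html.document_fromstring(markup)
    # normalize-space() takes the first matching text node and trims it.
    return tree.xpath(PHONE_XPATH)


print(repr(extract_phone(WITH_PHONE)))
print(repr(extract_phone(NO_PHONE)))
```

Because the result can be an empty string, a check such as "if phone:" is a safer test for a missing number than comparing against None.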

