Selenium WebDriver cannot find a CSS selector even though it exists (Python)

Posted 2025-02-08 09:00:44

I am trying to scrape data from seetickets.us. I click on each organization and then on every event listed by that organization. The scraper correctly extracts the data from each event, but when I navigate back to the all-events page, the web driver can no longer find the CSS selector.

Here is the site structure:

https://ibb.co/WBjMDJf

Clicking on World Cafe Live takes me here:

https://ibb.co/cLbMP19

Clicking on any event takes me to a page with further information about that event.

When the driver comes back after extracting an event, it is not able to open the next event. I have also tried an explicit wait and time.sleep().

Here is my code:

    # Click on each event, extract its data, then return to the all-events page.
    def get_all_events_in_each_event(self):
        inner_events = self.get_all_inner_events()
        print(len(inner_events))
        for event in inner_events:
            self.click_inner_event(event)
            self.get_event_loc()
            self.get_talent()
            self.get_facebook()
            self.get_date()
            self.get_showtime_city()
            self.get_ticket_doors()
            self.back()
            try:
                WebDriverWait(self, 10).until(
                    EC.element_to_be_clickable((By.CLASS_NAME, "event-images-box")))
            except Exception as e:
                print("Wait Timed out")
                print(e)

    # Click on a single event on the all-events page.
    def click_inner_event(self, inner_event):
        link = inner_event.find_element_by_css_selector('div[class="event-info"]')
        link.click()

Here is the HTML of the all-events page:
https://ibb.co/wcKWc68

Kindly help me find what is wrong here.
Thanks

Comments (1)

雨轻弹 2025-02-15 09:00:44

As @Arundeep Chohan correctly pointed out, the web driver loses its element references when moving back and forth between pages, so I had to re-grab all the elements after each navigation.
The corrected code is:

    def get_all_events_in_each_event(self):
        inner_events = self.get_all_inner_events()

        for i in range(len(inner_events)):
            self.click_inner_event(inner_events[i])
            self.get_event_loc()
            self.get_talent()
            self.get_facebook()
            self.get_date()
            self.get_showtime_city()
            self.get_ticket_doors()
            self.back()
            # Re-grab the elements: the old references are no longer valid after going back.
            inner_events = self.get_all_inner_events()
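
For anyone who wants the same idea in a reusable form, here is a minimal sketch of the "re-locate by index" pattern. It is deliberately driver-agnostic: `visit_each_fresh`, `find_all`, `visit`, and `go_back` are names made up for this illustration, not part of Selenium or of the code above. It also guards against the list being shorter after the round trip, which the loop above does not check. In Selenium, a stale handle typically surfaces as a `StaleElementReferenceException`.

```python
from typing import Callable, List, Sequence, TypeVar

E = TypeVar("E")  # element handle type, e.g. a Selenium WebElement
R = TypeVar("R")  # whatever the per-element scrape returns


def visit_each_fresh(
    find_all: Callable[[], Sequence[E]],
    visit: Callable[[E], R],
    go_back: Callable[[], None],
) -> List[R]:
    """Visit every listed element, re-locating the list before each visit.

    Handles obtained before a navigation are treated as unusable afterwards,
    so only the integer position is carried across page loads.
    """
    results: List[R] = []
    index = 0
    while True:
        elements = find_all()  # fresh handles for the current page load
        if index >= len(elements):
            break  # also stops cleanly if the list shrank after going back
        results.append(visit(elements[index]))
        go_back()
        index += 1
    return results
```

With the class from the question, the wiring might look like `visit_each_fresh(self.get_all_inner_events, scrape_one, self.back)`, where `scrape_one` is a hypothetical helper that clicks the event and calls the `get_*` methods.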

Thanks, Arundeep, for the answer.
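
A common alternative, sketched here under an assumption the thread does not confirm: if each event element exposes a plain URL (for example through an `<a href>`), you can read all the URLs first and then open them one by one. Strings cannot go stale, so nothing needs to be re-grabbed. `collect_then_visit` and its parameters are hypothetical names for this illustration.

```python
from typing import Callable, List, Sequence, TypeVar

E = TypeVar("E")  # element handle type
R = TypeVar("R")  # scrape result type


def collect_then_visit(
    find_all: Callable[[], Sequence[E]],
    get_url: Callable[[E], str],
    open_url: Callable[[str], None],
    scrape: Callable[[], R],
) -> List[R]:
    """Read every URL up front, then open and scrape each page in turn."""
    # Extract plain strings while the element handles are still valid.
    urls = [get_url(element) for element in find_all()]
    results: List[R] = []
    for url in urls:
        open_url(url)  # direct navigation; no back button involved
        results.append(scrape())
    return results
```

With Selenium, `open_url` would usually be `driver.get`, and `get_url` could be something like `lambda el: el.find_element(By.TAG_NAME, "a").get_attribute("href")`, assuming the event markup really contains such a link.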
