Extracting href attributes with RSelenium

Posted on 2025-01-25 13:25:23

I am trying to get the store addresses of Apple stores in multiple countries using RSelenium.

library(RSelenium)
library(tidyverse)
library(netstat)

# start the server
rs_driver_object <- rsDriver(browser = "chrome",
                             chromever = "100.0.4896.60",
                             verbose = FALSE,
                             port = free_port())

# create a client object
remDr <- rs_driver_object$client

# maximise window size
remDr$maxWindowSize()

# navigate to the website
remDr$navigate("https://www.apple.com/uk/retail/storelist/")

# locate the search box
search_box <- remDr$findElement(using = "id", "dropdown")
country_name <- "United States" # a single country here; this can be looped over several countries

# type the country name into the search box and press Enter
search_box$sendKeysToElement(list(country_name, key = "enter"))
search_box$clickElement() # possibly unnecessary; clicking anyway

The page now shows the location of each store. Each store has a hyperlink that leads to the store's own page, which contains the full address I want to extract.

However, I am stuck on how to click on each individual store link in this last step.
I expected the following to give me the href for all the stores on the page:

store_address <- remDr$findElement(using = 'class', 'store-address')
store_address$getElementAttribute('href')

But it returns an empty list. How do I proceed from here?
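
One possible cause of the empty list: `findElement()` returns only the first match, and an element matched by class may not itself be an `<a>` tag, in which case it has no `href`. The sketch below shows how `findElements()` with a CSS selector could collect every link instead. The helper name `get_store_links` is hypothetical, and the default selector `.state a` is an assumption about the page markup (store links being `<a>` elements inside containers with class `state`), not something verified here.

```r
# Collect the href of every element matching a CSS selector.
# `remDr` is assumed to be an RSelenium client such as rs_driver_object$client.
# The default selector ".state a" is an assumption about the page markup.
get_store_links <- function(remDr, selector = ".state a") {
  # findElements() returns a list of all matches;
  # findElement() would return only the first one
  anchors <- remDr$findElements(using = "css selector", value = selector)

  # getElementAttribute() returns a list, which is empty when the
  # attribute is missing; map that case to NA so it can be dropped
  hrefs <- vapply(anchors, function(a) {
    value <- a$getElementAttribute("href")
    if (length(value) == 0) NA_character_ else as.character(value[[1]])
  }, character(1))

  hrefs[!is.na(hrefs)]
}
```

With the session from the question still open, this would be called as `links <- get_store_links(remDr)`. If the selector matches nothing, the result is an empty character vector rather than an error.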

Comments (1)

再浓的妆也掩不了殇 2025-02-01 13:25:23

After loading the page with the list of stores, we can do:

library(rvest)  # read_html(), html_nodes(), html_attr(); not attached by library(tidyverse)
link <- remDr$getPageSource()[[1]] %>%
  read_html() %>% html_nodes('.state') %>% html_nodes('a') %>%
  html_attr('href') %>% paste0('https://www.apple.com', .)

[1] "https://www.apple.com/retail/thesummit/"              "https://www.apple.com/retail/bridgestreet/"          
[3] "https://www.apple.com/retail/anchorage5thavenuemall/" "https://www.apple.com/retail/chandlerfashioncenter/" 
[5] "https://www.apple.com/retail/santanvillage/"          "https://www.apple.com/retail/arrowhead/"  
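
To see what each step of the pipeline above does, here is a small offline sketch that needs no browser. The HTML fragment is made up for illustration and only imitates the structure the selectors assume (links with relative `href` values inside `.state` containers); the real page may differ.

```r
library(rvest)

# Hypothetical fragment imitating the structure the selectors assume
html <- '
<div class="state">
  <a href="/retail/thesummit/">The Summit</a>
  <a href="/retail/bridgestreet/">Bridge Street</a>
</div>
<div class="other"><a href="/ignored/">Not a store</a></div>'

page <- read_html(html)

# Keep only anchors that sit inside a .state container
state_links <- html_nodes(html_nodes(page, ".state"), "a")

# The hrefs are relative paths, so the site root is prepended
hrefs <- html_attr(state_links, "href")
links <- paste0("https://www.apple.com", hrefs)
print(links)
```

The link inside the `.other` container is skipped, which is why the selection starts from `.state` rather than from every `a` on the page.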