Extracting repeated classes with rvest html_elements in R
How are you? I am trying to extract some information from this sports betting webpage using rvest. I asked a related question a few days ago and reached almost 100% of my goals. So far, thanks to you, I have successfully extracted the title, the score, and the time of the matches being played using the following code:
library(rvest)
library(tidyverse)

# Read the live menu page
page <- "https://www.supermatch.com.uy/live_recargar_menu/" %>%
  read_html()

# One column per piece of match information
data <- data.frame(
  Titulo = page %>%
    html_elements(".titulo") %>%
    html_text(),
  Marcador = page %>%
    html_elements(".marcador") %>%
    html_text(),
  Tiempo = page %>%
    html_elements(".marcador+ span") %>%
    html_text() %>%
    str_squish()
)
Now I want to get repeated values. For example, if the country of a match is "Brasil", I want the data frame to show Brasil as the country for every match in that category. So far I have only managed to extract all the countries individually, not attached to their matches. The same applies to the sport name and the tournament.
Can you help me with that? Thanks in advance.
Comments (1)
You could rewrite your code to use separate functions, each working with a different level of information. These can be called in a nested fashion, which makes the code easier to read.
Essentially, you use nested map_dfr() calls to produce a single data frame from functions that work with lists of nodes at different levels within the DOM.
You can think of it as an outer loop over sports, then an intermediate loop over countries, and an innermost loop over the events within a given sport and country.
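As a rough illustration of that nesting, here is a minimal sketch that runs against a small inline HTML snippet rather than the live page. The container classes .deporte, .pais and .evento, the name classes .nombre-deporte and .nombre-pais, and the helper names get_sport(), get_country() and get_event() are all assumptions made up for this example; the real page will use different selectors, which you would need to look up in the browser inspector. Only .titulo and .marcador come from the question.

```r
library(rvest)
library(tidyverse)

# Hypothetical markup standing in for the real page. Every class except
# .titulo and .marcador is an assumption made for this sketch.
page <- minimal_html('
  <div class="deporte"><span class="nombre-deporte">Futbol</span>
    <div class="pais"><span class="nombre-pais">Brasil</span>
      <div class="evento"><span class="titulo">A vs B</span><span class="marcador">1-0</span></div>
      <div class="evento"><span class="titulo">C vs D</span><span class="marcador">2-2</span></div>
    </div>
    <div class="pais"><span class="nombre-pais">Uruguay</span>
      <div class="evento"><span class="titulo">E vs F</span><span class="marcador">0-0</span></div>
    </div>
  </div>
  <div class="deporte"><span class="nombre-deporte">Tenis</span>
    <div class="pais"><span class="nombre-pais">Brasil</span>
      <div class="evento"><span class="titulo">G vs H</span><span class="marcador">6-4</span></div>
    </div>
  </div>
')

# Innermost level: one row per event, carrying the sport and country down
get_event <- function(event, sport, country) {
  tibble(
    Deporte  = sport,
    Pais     = country,
    Titulo   = event %>% html_element(".titulo") %>% html_text2(),
    Marcador = event %>% html_element(".marcador") %>% html_text2()
  )
}

# Intermediate level: read the country name once, then loop over its events
get_country <- function(country, sport) {
  country_name <- country %>% html_element(".nombre-pais") %>% html_text2()
  map_dfr(html_elements(country, ".evento"), get_event,
          sport = sport, country = country_name)
}

# Outer level: read the sport name once, then loop over its countries
get_sport <- function(sport) {
  sport_name <- sport %>% html_element(".nombre-deporte") %>% html_text2()
  map_dfr(html_elements(sport, ".pais"), get_country, sport = sport_name)
}

data <- map_dfr(html_elements(page, ".deporte"), get_sport)
print(data)
```

Because each helper only searches inside the node it receives, the sport and country names are repeated on every event row beneath them, which is the repeated-value layout the question asks for. Note that map_dfr() needs dplyr installed and is marked as superseded in recent purrr releases; map() followed by list_rbind() should work as a substitute.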