What is wrong with this URL loop?
The code works for a single URL, but for multiple URLs in a list it fails with an error. I'm new to R, please help.
library(rvest)
for (url in data_list) {
  webpage = read_html(url)
  extracted_urls = webpage %>%
    rvest::html_nodes("a") %>%
    rvest::html_attr("href")
  extracted_urls = extracted_urls[grep("roster", extracted_urls)]
  extracted_urls
}
Error: x must be a string of length 1
Edit
Links in OP's comment.
data_list <- c(
"ephsports.williams.edu",
"wilsonphoenix.com",
"wingatebulldogs.com",
"ycpspartans.com"
)
Comments (2)
Variables assigned inside a for loop are overwritten on each iteration. Here, extracted_urls gets clobbered every time, so only the result for the last URL survives. Creating a receiver object before the loop (try r <- list()) lets you add each iteration's result to it step by step, and the collected results remain available after the loop has finished.
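A minimal sketch of the receiver pattern described above. To keep it runnable offline, get_links() is a hypothetical stand-in for the read_html() / html_nodes() / html_attr() pipeline from the question, and the hrefs it returns are made up for illustration; only r <- list() comes from the comment itself.

```r
# Hypothetical stand-in for the scraping pipeline in the question.
# It returns made-up hrefs so the sketch runs without network access.
get_links <- function(url) {
  c(paste0("https://", url, "/sports/baseball/roster"),
    paste0("https://", url, "/sports/baseball/schedule"))
}

data_list <- c("ephsports.williams.edu", "wilsonphoenix.com")

r <- list()  # receiver object, created once before the loop

for (url in data_list) {
  links <- get_links(url)
  # Store this iteration's matches under the URL's own name,
  # so nothing from earlier iterations is overwritten.
  r[[url]] <- links[grep("roster", links)]
}

str(r)  # one element per URL, each holding its roster links
```

Indexing the list by url rather than appending by position makes it easy to see afterwards which site each set of links came from.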
As some of the URLs are not working, we can skip them using the possibly function from purrr. We then loop over data_list, the vector containing the URLs, and skip the ones that produce errors.
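One way this could look, as a sketch rather than the answerer's exact code. get_roster_links() is a hypothetical helper, and here it is a stand-in that fabricates hrefs and deliberately fails for one arbitrarily chosen entry, so the example runs offline; in real use its body would be the rvest pipeline from the question.

```r
library(purrr)

# Hypothetical stand-in for the scraping step. It raises an error for one
# arbitrarily chosen URL to mimic a site that cannot be read.
get_roster_links <- function(url) {
  if (url == "wilsonphoenix.com") stop("could not open ", url)
  links <- c(paste0("https://", url, "/roster"),
             paste0("https://", url, "/news"))
  links[grep("roster", links)]
}

data_list <- c(
  "ephsports.williams.edu",
  "wilsonphoenix.com",
  "wingatebulldogs.com",
  "ycpspartans.com"
)

# possibly() wraps the function so that an error yields `otherwise`
# instead of aborting the whole run.
safe_links <- possibly(get_roster_links, otherwise = NA_character_)

# set_names() names each element by its URL, so results can be looked up by site.
results <- map(set_names(data_list), safe_links)

str(results)
```

Entries that failed come back as NA, so they can be filtered out afterwards, for example with results[!is.na(results)].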