Downloading a file from the web via the R console
I'd like to download a log file with R via a download link, but all I get back is the unevaluated HTML.
This is what I tried, without success:
library(RCurl)

url = "http://statcounter.com/p7447608/csv/download_log_file?ufrom=1323783441&uto=1323860282"

# SSL certificate bundle shipped with RCurl:
CAINFO = paste(system.file(package="RCurl"), "/CurlSSL/ca-bundle.crt", sep = "")

curlH = getCurlHandle(
    header = FALSE,
    verbose = TRUE,
    netrc = TRUE,
    maxredirs = as.integer(20),
    followlocation = TRUE,
    userpwd = "me:mypassw",
    ssl.verifypeer = TRUE)

setwd(tempdir())
destfile = "log.csv"
x = getBinaryURL(url, curl = curlH,
                 cainfo = CAINFO)
writeBin(x, destfile)   # save the raw response to disk
shell.exec(destfile)    # open the saved file (Windows only)
Comments (2)
Here are two ways of downloading the file.
If you rename the downloaded file to log.html and open it, you can see that the login was invalid. That is why you get the HTML structure back. You need to add the login credentials to the URL.
You can find the name-value pairs in the HTML source of the login page.
The pair for the username is form_user=USERNAME and the pair for the password is form_pass=PASSWORD.
This is why the curl userpwd setting doesn't work: it supplies HTTP authentication credentials, not these named form fields, so the site never sees the login it expects.
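As a minimal sketch of what this could look like, the snippet below appends the two form fields to the download URL and then fetches it either with base R or with RCurl. It assumes the site accepts form_user and form_pass as query parameters, which has not been verified here. The helper names add_credentials, fetch_with_base and fetch_with_rcurl are hypothetical, and USERNAME and PASSWORD are placeholders.

```r
# Hypothetical helper: append the login fields named in the answer
# (form_user, form_pass) to a URL as query parameters.
add_credentials <- function(url, user, pass) {
  # Use "&" if the URL already has a query string, "?" otherwise
  sep <- if (grepl("?", url, fixed = TRUE)) "&" else "?"
  paste0(url, sep,
         "form_user=", URLencode(user, reserved = TRUE),
         "&form_pass=", URLencode(pass, reserved = TRUE))
}

base_url  <- "http://statcounter.com/p7447608/csv/download_log_file?ufrom=1323783441&uto=1323860282"
login_url <- add_credentials(base_url, "USERNAME", "PASSWORD")  # placeholders

# Way 1 (base R): download straight to a file.
fetch_with_base <- function(url, destfile = "log.csv") {
  download.file(url, destfile, mode = "wb")
  invisible(destfile)
}

# Way 2 (RCurl): fetch the raw bytes, then write them to disk.
fetch_with_rcurl <- function(url, destfile = "log.csv") {
  raw_bytes <- RCurl::getBinaryURL(url, followlocation = TRUE)
  writeBin(raw_bytes, destfile)
  invisible(destfile)
}
```

Calling fetch_with_base(login_url) or fetch_with_rcurl(login_url) with real credentials should then write log.csv to the working directory, provided the assumption about the query parameters holds.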
You don't seem to need SSL certificates etc., since the URL is http:, not https:. So maybe download.file(url, "log.csv") would work fine in this case? I'd first make sure the URL and its response are correct outside of R.
...I used Chrome to access the URL and got a downloaded file, "StatCounter-Log-7447608.csv". It contains a CSV header and HTML?!
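One way to run the same check from within R is to download the file and scan its first lines for HTML markup. This is a rough sketch only: looks_like_html and fetch_log are hypothetical helpers, and the list of tags is an assumption about what a login page would contain.

```r
# Hypothetical helper: TRUE if any of the first n lines of a file
# contains a typical HTML tag.
looks_like_html <- function(path, n = 50) {
  first_lines <- readLines(path, n = n, warn = FALSE)
  any(grepl("<!DOCTYPE|<html|<head|<body|<form", first_lines,
            ignore.case = TRUE))
}

# Download with base R, then warn if the result is not plain CSV.
fetch_log <- function(url, destfile = "log.csv") {
  download.file(url, destfile, mode = "wb")
  if (looks_like_html(destfile)) {
    warning("Response contains HTML; the login was probably not accepted.")
  }
  invisible(destfile)
}
```

A file like the one Chrome saved, with a CSV header followed by HTML, would trigger the warning.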