Modifying the GET request for full-text retrieval in R

Posted on 2025-02-07 19:41:53

I am using the R package europepmc (https://cran.r-project.org/web/packages/europepmc/europepmc.pdf) and its function epmc_ftxt to obtain the full texts of some articles given their PMC IDs. However, for many articles I keep getting the following error:

"Request failed [404]. Retrying in 1 seconds...
Error in epmc_ftxt("PMC2701033") :
Not Found (HTTP 404). Failed to retrieve full text.."

I guess this happens because the article does not belong to the Open Access subset. However, I checked and found that my university has a license to access that article. So my question is: how can I edit the GET request inside the function to tell epmc_ftxt that I am actually allowed to access that article?
The function's source code is below:

#' Fetch Europe PMC full texts
#'
#' This function loads full texts into R. Full texts are in XML format and are
#' only provided for the Open Access subset of Europe PMC.
#'
#' @param ext_id character, PMCID. 
#'   All full text publications have external IDs starting 'PMC_'
#'
#' @export
#' @return xml_document
#'
#' @examples
#'   \dontrun{
#'   epmc_ftxt("PMC3257301")
#'   epmc_ftxt("PMC3639880")
#'   }
epmc_ftxt <- function(ext_id = NULL) {
  if (!grepl("^PMC", ext_id))
    stop("Please provide a PMCID, i.e. ids starting with 'PMC'")
  # call api
  req <-
    httr::RETRY("GET",
                base_uri(),
                path = paste(rest_path(), ext_id,
                             "fullTextXML", sep = "/"))
  # check for http status
  httr::stop_for_status(req, "retrieve full text.")
  # load xml into r
  httr::content(req, as = "text", encoding = "utf-8") %>%
    xml2::read_xml()
} 
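
For reference, the place where the GET request could be modified can be sketched as follows. This is a minimal sketch, not the package's own code: `ftxt_path()` and `epmc_ftxt_custom()` are hypothetical names, and the two constants are assumed values of the package-internal helpers `base_uri()` and `rest_path()`. It forwards any extra arguments to `httr::RETRY()` and returns `NULL` on a 404 instead of stopping. Note that the documentation comment above states that full texts are only provided for the Open Access subset, so extra headers may well not unlock a non-Open-Access article.

```r
# Assumed values of the package-internal helpers base_uri() and rest_path().
epmc_base_uri <- "https://www.ebi.ac.uk"
epmc_rest_path <- "europepmc/webservices/rest"

# Hypothetical helper: validate one PMCID and build the request path.
ftxt_path <- function(ext_id) {
  if (!is.character(ext_id) || length(ext_id) != 1 || !grepl("^PMC", ext_id))
    stop("Please provide a PMCID, i.e. an ID starting with 'PMC'")
  paste(epmc_rest_path, ext_id, "fullTextXML", sep = "/")
}

# Hypothetical variant of epmc_ftxt(): arguments in `...` (for example
# httr::add_headers() or httr::timeout()) are forwarded to httr::RETRY(),
# and a 404 yields NULL with a warning instead of an error.
epmc_ftxt_custom <- function(ext_id, ...) {
  req <- httr::RETRY("GET", epmc_base_uri, path = ftxt_path(ext_id), ...,
                     terminate_on = 404)  # do not retry on "Not Found"
  if (httr::status_code(req) == 404) {
    warning("No full text available for ", ext_id, call. = FALSE)
    return(NULL)
  }
  # any other HTTP error still stops, as in the original function
  httr::stop_for_status(req, "retrieve full text.")
  xml2::read_xml(httr::content(req, as = "text", encoding = "utf-8"))
}
```

A call such as `epmc_ftxt_custom("PMC2701033", httr::add_headers(Accept = "application/xml"))` shows where request options would go. Returning `NULL` on a 404 lets a loop over many PMC IDs continue and skip the articles that have no full text available.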
