Ignoring escape characters (backslashes) in R strings

Posted on 2024-10-12 02:28:24

While running an R-plugin in SPSS, I receive a Windows path string as input e.g.

'C:\Users\mhermans\somefile.csv'

I would like to use that path in subsequent R code, but the backslashes need to be replaced with forward slashes; otherwise R interprets them as escape sequences (e.g. "\U used without hex digits" errors).

However, I have not been able to find a function that can replace the backslashes with forward slashes or double-escape them. All the functions I found assume those characters are already escaped.

So, is there something along the lines of:

>gsub('\\', '/', 'C:\Users\mhermans')
C:/Users/mhermans

Comments (4)

娇柔作态 2024-10-19 02:28:24

You can try to use the 'allowEscapes' argument in scan()

X <- scan(what = "character", allowEscapes = FALSE)
C:\Users\mhermans\somefile.csv

print(X)
[1] "C:\\Users\\mhermans\\somefile.csv"
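
For a non-interactive check of the same idea, here is a minimal sketch that reads the path from a temporary file instead of the console. The file name, the path with spaces, and the `sep = "\n"` choice are assumptions for illustration, not part of the answer above. It shows that `scan()` splits on whitespace by default, so a path containing spaces needs a line separator to stay in one piece.

```r
# Hypothetical input file standing in for what a user would type or paste.
tmp <- tempfile(fileext = ".txt")
# The R literal doubles the backslashes; the file itself holds single ones.
writeLines("C:\\Users\\m hermans\\some file.csv", tmp)

# Default whitespace separator: the path is split at each space.
parts <- scan(tmp, what = "character", allowEscapes = FALSE, quiet = TRUE)
length(parts)

# One element per line: spaces inside the path are preserved.
path <- scan(tmp, what = "character", sep = "\n",
             allowEscapes = FALSE, quiet = TRUE)
cat(path, "\n")

unlink(tmp)
```

`cat()` prints the stored characters, so the backslashes appear single there, while `print()` would show them doubled.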
久光 2024-10-19 02:28:24

As of version 4.0, introduced in April 2020, R provides a syntax for specifying raw strings. The string in the example can be written as:

path <- r"(C:\Users\mhermans\somefile.csv)"

From ?Quotes:

Raw character constants are also available using a syntax similar to the one used in C++: r"(...)" with ... any character sequence, except that it must not contain the closing sequence )". The delimiter pairs [] and {} can also be used, and R can be used in place of r. For additional flexibility, a number of dashes can be placed between the opening quote and the opening delimiter, as long as the same number of dashes appear between the closing delimiter and the closing quote.
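
The delimiter variants described in the quote can be sketched as follows, assuming R 4.0.0 or later; the variable names are illustrative only. Note that raw strings only help when the path is typed as a literal in source code. A path that arrives as input at run time already holds single backslashes and needs no escaping.

```r
# Requires R >= 4.0.0; earlier versions fail to parse raw string literals.
path <- r"(C:\Users\mhermans\somefile.csv)"

# The raw literal is the same string as the conventionally escaped one.
same <- identical(path, "C:\\Users\\mhermans\\somefile.csv")

# Dashes around the delimiters let the string contain the sequence )"
tricky <- r"-(say )" here)-"

# Square brackets and an upper-case R are accepted as well.
alt <- R"[C:\Users\mhermans]"

# Converting to forward slashes; fixed = TRUE avoids regex escaping.
forward <- gsub("\\", "/", path, fixed = TRUE)
cat(forward, "\n")
```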

一枫情书 2024-10-19 02:28:24

First you need to get it assigned to a name:

pathname <- 'C:\\Users\\mhermans\\somefile.csv'

Notice that in order to assign it to a name you needed to double all the backslashes, which gives a hint about how you could use regex. Actually, if you read the string in from a text file, R will do all the doubling for you. Mind you, it is not really doubling the backslashes: each one is stored as a single backslash, but it is displayed doubled and needs to be typed doubled at the console. Otherwise the R interpreter tries (and often fails) to turn it into a special character. To compound the problem, regex uses the backslash as an escape as well. So to match a backslash with grep, sub or gsub you need to quadruple the backslashes:

 gsub("\\\\", "/", pathname)
# [1] "C:/Users/mhermans/somefile.csv"

You needed to doubly "double" the backslashes. The R parser reduces the four backslashes to two, and in the resulting regex the first backslash of the pair signals to the regex engine that the next one is a literal.

Consider:

 nchar("\\A")
#  returns `[1] 2`
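
As a hedged aside, the quadrupling can be avoided when no regex is needed. The sketch below compares the regex form from this answer with two alternatives that are not part of the original answer: `gsub()` with `fixed = TRUE`, and `chartr()`.

```r
pathname <- "C:\\Users\\mhermans\\somefile.csv"

# Each doubled backslash in the literal is one stored character.
n <- nchar(pathname)

# Regex form: four backslashes in source become one literal backslash to match.
via_regex <- gsub("\\\\", "/", pathname)

# Fixed-string form: the pattern is taken literally, so two suffice.
via_fixed <- gsub("\\", "/", pathname, fixed = TRUE)

# Character translation: maps every backslash to a forward slash.
via_chartr <- chartr("\\", "/", pathname)

cat(via_regex, via_fixed, via_chartr, sep = "\n")
```

All three calls should produce the same forward-slash path; the fixed-string and `chartr()` forms simply skip the regex layer of escaping.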
人间不值得 2024-10-19 02:28:24

If file E:\Data\junk.txt contains the following text (without quotes): C:\Users\mhermans\somefile.csv

You may get a warning with the following statement, but it will work:

 texinp <- readLines("E:\\Data\\junk.txt")

If file E:\Data\junk.txt contains the following text (with quotes): "C:\Users\mhermans\somefile.csv"

The above readLines statement might also give you a warning, but texinp will now contain:

"\"C:\\Users\\mhermans\\somefile.csv\""

So, to get what you want, make sure there aren't quotes in the incoming file, and use:

 texinp <- suppressWarnings(readLines("E:\\Data\\junk.txt"))
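
A self-contained sketch of this approach follows, using a temporary file in place of E:\Data\junk.txt. It assumes the warning in question is the "incomplete final line" warning, which `readLines()` can silence with its `warn = FALSE` argument. The quote stripping and slash conversion are additions for illustration, not part of the answer above.

```r
# Hypothetical stand-in for the incoming file: quoted path, no trailing newline.
tmp <- tempfile(fileext = ".txt")
cat('"C:\\Users\\mhermans\\somefile.csv"', file = tmp)

# warn = FALSE silences only the incomplete-final-line warning,
# a narrower tool than suppressWarnings().
texinp <- readLines(tmp, warn = FALSE)

# Strip a leading and trailing double quote if the file contained them.
texinp <- gsub('^"|"$', "", texinp)

# Convert to forward slashes for use in later R code.
texinp <- gsub("\\", "/", texinp, fixed = TRUE)
cat(texinp, "\n")

unlink(tmp)
```

Stripping the quotes in R means the incoming file does not have to be edited by hand first.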