Ignoring escape characters (backslashes) in R strings
While running an R plugin in SPSS, I receive a Windows path string as input, for example:
'C:\Users\mhermans\somefile.csv'
I would like to use that path in subsequent R code, but the backslashes first need to be replaced with forward slashes; otherwise R interprets them as escape sequences (e.g. "\U used without hex digits" errors).
However, I have not been able to find a function that can replace the backslashes with forward slashes or double-escape them. All such functions assume the characters are already escaped.
So, is there something along the lines of:
>gsub('\\', '/', 'C:\Users\mhermans')
C:/Users/mhermans
Comments (4)
You can try the 'allowEscapes' argument of scan().
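A minimal sketch of that suggestion, assuming the path can be read from a file or connection rather than typed as a literal in R code. The temporary file is a stand-in for the SPSS input, which the question does not describe; allowEscapes = FALSE is already the default for scan() and is spelled out only for clarity.

```r
# Hypothetical stand-in for the incoming data: a file holding the raw path.
tmp <- tempfile(fileext = ".txt")
writeLines("C:\\Users\\mhermans\\somefile.csv", tmp)  # file contains single backslashes

# Read the line verbatim; with allowEscapes = FALSE, "\U" and "\m" are not interpreted.
path <- scan(tmp, what = character(), sep = "\n", quiet = TRUE,
             allowEscapes = FALSE)

# fixed = TRUE treats the pattern as a plain string, so "\\" is one backslash.
fixed_path <- gsub("\\", "/", path, fixed = TRUE)
cat(fixed_path, "\n")
```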
As of version 4.0, introduced in April 2020, R provides a syntax for specifying raw strings. The string in the example can be written as r"(C:\Users\mhermans\somefile.csv)". See ?Quotes for the details of the raw string syntax.
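As a short illustration of the raw string syntax, here is a sketch that requires R 4.0.0 or later. The variable names are illustrative only. Note that a raw string helps when the path is written as a literal in R code; it does not change how a string received at run time is handled.

```r
# Raw string literal: everything between r"( and )" is taken verbatim,
# so the backslashes need no doubling.
path <- r"(C:\Users\mhermans\somefile.csv)"

# The stored value is the same as the conventionally escaped literal.
same <- identical(path, "C:\\Users\\mhermans\\somefile.csv")

# Convert to forward slashes; fixed = TRUE avoids regex escaping.
fixed_path <- gsub("\\", "/", path, fixed = TRUE)
cat(fixed_path, "\n")
```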
First you need to get the path assigned to a name.
Notice that to enter it as a string literal you have to double every backslash, which gives a hint about how you could use a regex. If you read the path in from a text file instead, R handles this for you. Mind you, the backslashes are not really doubled: each one is stored as a single backslash, but it is displayed doubled and has to be typed doubled at the console. Otherwise the R interpreter tries (and often fails) to turn it into a special character. To compound the problem, regular expressions use the backslash as an escape character as well. So to match a backslash with grep, sub or gsub you need to quadruple it in the pattern.
That is, you need to doubly "double" the backslashes. Once R has parsed the literal, the pattern holds two backslashes; the first of the pair signals to the regex engine that what comes next is a literal character.
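The code from this answer did not survive on the page; the following is a minimal sketch of the steps it describes, with `pathname` as an assumed variable name.

```r
# In a string literal every backslash is typed twice but stored once.
pathname <- "C:\\Users\\mhermans\\somefile.csv"
one <- nchar("\\")      # 1: a single stored backslash
two <- nchar("\\\\")    # 2: two stored backslashes

# cat() shows the stored value, with single backslashes.
cat(pathname, "\n")

# As a regex, the two stored backslashes "\\\\" match one literal backslash.
fixed_path <- gsub("\\\\", "/", pathname)
cat(fixed_path, "\n")
```

Passing fixed = TRUE to gsub() is an alternative that avoids the regex layer, so the pattern can then be written as "\\".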
If the file E:\Data\junk.txt contains the following text (without quotes): C:\Users\mhermans\somefile.csv
then reading it with readLines() may give a warning, but it will work.
If the file instead contains the text with quotes: "C:\Users\mhermans\somefile.csv"
the same readLines() call might also give a warning, and the result, as R prints it, will now be:
"\"C:\\Users\\mhermans\\somefile.csv\""
So, to get what you want, make sure there are no quotes in the incoming file, then read the line with readLines() and replace the backslashes with gsub().
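The statements from this answer were lost from the page, so the sketch below is a reconstruction under assumptions: a temporary file stands in for E:\Data\junk.txt, and `fname` is an illustrative name. The warning mentioned above is most likely the "incomplete final line" warning that readLines() issues when the file has no trailing newline; warn = FALSE silences it.

```r
# Stand-in for E:\Data\junk.txt: the unquoted path, with no trailing newline.
tmp <- tempfile(fileext = ".txt")
cat("C:\\Users\\mhermans\\somefile.csv", file = tmp)

# readLines() keeps the backslashes as they are in the file.
raw_line <- readLines(tmp, warn = FALSE)

# Replace each backslash with a forward slash.
fname <- gsub("\\\\", "/", raw_line)
cat(fname, "\n")
```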