Trouble replacing the literal string \r\n with a line break in PHP
I have a text file that has the literal string \r\n in it. I want to replace this with an actual line break (\n).
I know that the regex /\\r\\n/ should match it (I have tested it in Reggy), but I cannot get it to work in PHP.
I have tried the following variations:
preg_replace("/\\\\r\\\\n/", "\n", $line);
preg_replace("/\\\\[r]\\\\[n]/", "\n", $line);
preg_replace("/[\\\\][r][\\\\][n]/", "\n", $line);
preg_replace("/[\\\\]r[\\\\]n/", "\n", $line);
If I just try to replace the backslash, it works properly. As soon as I add an r, it finds no matches.
The file I am reading is encoded as UTF-16.
Edit:
I have also already tried using str_replace().
I now believe that the problem here is the character encoding of the file. I tried the following, and it did work:
$testString = "\\r\\n";
echo preg_replace("/\\\\r\\\\n/", "\n", $testString);
but it does not work on lines I am reading in from my file.
Comments (5)
Save yourself the effort of figuring out the regex and try str_replace() instead.
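The code for this answer did not survive in this copy of the thread. A minimal sketch of what the str_replace() approach could look like, assuming the text is held as plain single-byte characters (the sample string is made up for illustration):

```php
<?php
// Made-up sample: single quotes keep \r\n as four literal characters.
$line = 'first\r\nsecond';

// Search string in single quotes (literal backslashes),
// replacement in double quotes (a real line feed).
$fixed = str_replace('\r\n', "\n", $line);

var_dump($fixed === "first\nsecond"); // bool(true)
```

Because str_replace() does no pattern matching, the only escaping to think about is PHP's own string quoting.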
Save yourself the effort of figuring out the regex and the escaping within double quotes.
For what it is worth, preg_replace("/\\\\r\\\\n/", "\n", $line); should be fine. As a demonstration, it gives:
string(17) "Cake is yummyNLNL"
Also fine are '/\\\r\\\n/' and '/\\\\r\\\\n/'.
Important: if the above doesn't work, are you even sure the literal \r\n is what you're trying to match?
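The demonstration code did not survive in this copy of the thread. The following is a guess at something equivalent, not the answerer's original. It assumes a made-up input string and uses "NL" as the replacement so the substitutions are visible:

```php
<?php
// Made-up input: single quotes keep each \r\n as four literal characters.
$input = 'Cake is yummy\r\n\r\n';

// All three spellings end up as the same PCRE pattern: /\\r\\n/
$patterns = [
    "/\\\\r\\\\n/", // double-quoted, four backslashes
    '/\\\r\\\n/',   // single-quoted, three backslashes
    '/\\\\r\\\\n/', // single-quoted, four backslashes
];

$results = [];
foreach ($patterns as $pattern) {
    $results[] = preg_replace($pattern, 'NL', $input);
}

var_dump($results[0]); // string(17) "Cake is yummyNLNL"
```

If a pattern like this matches a hand-written test string but not a line from the file, the file most likely does not contain these exact bytes.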
UTF-16 is the problem. If you're just working with the raw bytes, then you can replace the full byte sequences.
In big-endian UTF-16 each ASCII character is preceded by a zero byte; for little-endian, swap the zero bytes to come after the non-zero ones.
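The byte sequences from this answer are missing in this copy of the thread. A sketch of the raw-byte idea, assuming the line has no byte order mark inside it and that the searched text is pure ASCII; asciiToUtf16() and replaceLiteralCrLf() are hypothetical helper names, not functions from PHP or from the original answer:

```php
<?php
// Hypothetical helper: encode a pure-ASCII string as UTF-16 bytes
// (no BOM) using only core string functions.
function asciiToUtf16(string $ascii, bool $bigEndian): string
{
    $out = '';
    for ($i = 0, $n = strlen($ascii); $i < $n; $i++) {
        $out .= $bigEndian ? "\x00" . $ascii[$i] : $ascii[$i] . "\x00";
    }
    return $out;
}

// Hypothetical helper: replace the literal text \r\n with a real line feed
// in a UTF-16 byte string. str_replace() knows nothing about 2-byte
// alignment, so this is a byte-level approximation.
function replaceLiteralCrLf(string $bytes, bool $bigEndian): string
{
    $search  = asciiToUtf16('\r\n', $bigEndian); // four literal characters
    $replace = asciiToUtf16("\n", $bigEndian);   // one real line feed
    return str_replace($search, $replace, $bytes);
}

// Made-up sample line, big-endian.
$line  = asciiToUtf16('one\r\ntwo', true);
$fixed = replaceLiteralCrLf($line, true);

var_dump($fixed === asciiToUtf16("one\ntwo", true)); // bool(true)
```

Passing false for the second argument covers the little-endian case, where the zero byte follows each ASCII byte.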
If that doesn't work, please post a byte-dump of your input file so we can see what it actually contains.
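One way to produce such a byte dump is bin2hex(). This is only a sketch: hexDump() is a hypothetical helper, and the sample bytes stand in for a line that would really come from the file:

```php
<?php
// Hypothetical helper: show the bytes of a string as space-separated hex pairs.
function hexDump(string $bytes): string
{
    return implode(' ', str_split(bin2hex($bytes), 2));
}

// Made-up sample standing in for a line read from the file:
// the literal characters \ r \ n encoded as little-endian UTF-16.
$line = "\\\x00r\x00\\\x00n\x00";

echo hexDump($line), PHP_EOL; // 5c 00 72 00 5c 00 6e 00
```

A zero byte next to every ASCII byte indicates UTF-16, and its position (before or after) shows the byte order.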
The regex above replaces the type of line break normally used on Windows (\r\n) with Linux line breaks (\n).
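The regex this answer refers to is missing in this copy of the thread. A sketch of one pattern that fits the description, assumed rather than recovered from the answer. Note that it handles real carriage-return and line-feed characters, not the literal four-character text \r\n from the question:

```php
<?php
// Made-up sample containing real CR LF pairs (double quotes interpret \r and \n).
$text = "line one\r\nline two\r\n";

// In single quotes PHP leaves \r and \n untouched; PCRE then reads them as CR and LF.
$unix = preg_replace('/\r\n/', "\n", $text);

var_dump($unix === "line one\nline two\n"); // bool(true)
```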
I always keep searching for this topic, and I always come back to a personal line I wrote.
It looks neat and it's based on RegEx:
PHP
or