MySQL 中 SELECT ... INTO OUTFILE ... 时的额外反斜杠 \

发布于 2024-10-21 12:35:19 字数 727 浏览 11 评论 0原文

所以我正在尝试将 MySQL 表导出到 CSV 中。我正在使用这个查询:

SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY "\n";

输出如下:

http://postimage.org/image/2ghyenh5w /full/

这样做的问题是,在有换行符的地方(例如地址字段中)总是有一个额外的反斜杠 \ 。

但是,从 phpMyAdmin 导出的 CSV 没有它:

http://postimage.org/image/2gi026tno/ full/

有什么办法可以让 SELECT ... OUTFILE ... 做同样的事情吗?

我正在处理的表有 2000 万条记录,phpMyAdmin 对于每个导出操作只能处理大约 500,000 条记录 - 否则它将变为空白或 mysql 服务器消失等。

So I'm trying to export a MySQL table into CSV. I'm using this query:

SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY "\n";

That output something like this:

http://postimage.org/image/2ghyenh5w/full/

The problem with this is that there's always an extra backslash \ where there's newline such as in the address field.

However, CSV exported from phpMyAdmin doesn't have it:

http://postimage.org/image/2gi026tno/full/

Any way to make the SELECT ... OUTFILE ... do the same?

The table I'm dealing with has 20 million records, phpMyAdmin can only handle about 500,000 records for every export action - or it will go blank or mysql server gone away, etc.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

不美如何 2024-10-28 12:35:19

看起来 MySQL 导出不可能正确导出换行符和引号。

导出时,MySQL 会自动对

  • 字段分隔符和
  • 行分隔符

进行转义。默认情况下,转义字符是反斜杠。您可以通过在查询中添加 ESCAPED BY '' 来覆盖此设置。

不幸的是,在“正常”(Excel 兼容)CSV 文件中,您可能需要对换行符和引号使用不同的编码。具体来说,您希望换行符不转义,并且引号加倍。

例如,如果一个值包含换行符,如下所示:

这是第 1 行
这是“第 2 行”,其中包含引号

应该变成的

引号

“这是第 1 行
这是“第 2 行”,其中包含引号”

我找到的解决方案是预先转义引号,并将 ESCAPED BY '' (空字符串)添加到我的查询中。

SELECT REPLACE(field1, '"', '""'),  
       REPLACE(field2, '"', '""'),  
       ...  
FROM ...  
WHERE ...  
INTO OUTFILE '/someFile.csv'  
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY ''  
LINES TERMINATED BY '\n'

It looks like it's impossible for a MySQL export to correctly export both Newlines and Quotes.

When exporting, MySQL will automatically escape both

  • Field delimiters, and
  • Line delimiters

By default, the escape character is a backslash. You can override this by adding ESCAPED BY '' to your query.

Unfortunately, in a "normal" (Excel-compatible) CSV file, you probably want different encodings for newlines and quotes. Specifically, you want newlines to be unescaped, and quotes to be doubled.

E.g. If a value contains a newline like so:

This is line 1
And this is "Line 2", which contains quotes

it should become

"This is line 1
And this is ""Line 2"", which contains quotes"

The solution I found was to pre-escape the quotes, and add ESCAPED BY '' (an empty string) to my query.

SELECT REPLACE(field1, '"', '""'),  
       REPLACE(field2, '"', '""'),  
       ...  
FROM ...  
WHERE ...  
INTO OUTFILE '/someFile.csv'  
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY ''  
LINES TERMINATED BY '\n'
寄风 2024-10-28 12:35:19

试试这个:

SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
fields terminated by ',' OPTIONALLY ENCLOSED BY '"' escaped by '"' 
LINES TERMINATED BY '\n';

我认为问题是 MySQL 试图在文本字段中转义换行符('\n'),因为它是行终止符。

FIELDS ESCAPED BY 控制如何写入特殊字符。如果 FIELDS ESCAPED BY 字符不为空,则在输出中将其用作前缀,位于以下字符之前:

按字符转义的字段

[可选]由字符封闭的字段

FIELDS TERMINATED BY 和 LINES TERMINATED BY 值的第一个字符

ASCII NUL(零值字节;转义字符后面实际写入的是 ASCII“0”,而不是零值字节)

MySQL

我真的不明白为什么它会在你的情况下做它正在做的事情,但我能够在我的Mac和查询上得到类似的东西上面似乎修复了我的情况下的输出。

希望有帮助!

Try this:

SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
fields terminated by ',' OPTIONALLY ENCLOSED BY '"' escaped by '"' 
LINES TERMINATED BY '\n';

I think the issue is that MySQL is trying to escape newline ('\n') in your text fields because it's your line terminator.

FIELDS ESCAPED BY controls how to write special characters. If the FIELDS ESCAPED BY character is not empty, it is used as a prefix that precedes following characters on output:

The FIELDS ESCAPED BY character

The FIELDS [OPTIONALLY] ENCLOSED BY character

The first character of the FIELDS TERMINATED BY and LINES TERMINATED BY values

ASCII NUL (the zero-valued byte; what is actually written following the escape character is ASCII “0”, not a zero-valued byte)

(MySQL)

I don't really understand why it's doing what it's doing in your case, but I was able to get something like that on my Mac and the query above seemed to fix the output in my case.

Hope that helps!

自由如风 2024-10-28 12:35:19

我遇到了同样的问题,我发现(将csv文件导入电子表格后)MySQL表中的一些varchar字段存在换行符。删除换行符后,导出工作正常。

I had the same problem, and I found out (after importing the csv file into a spreadsheet) that there were line breaks in some varchar fields in the MySQL table. After deleting the line breaks, the export worked correctly.

不甘平庸 2024-10-28 12:35:19

试试这个:

SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
FIELDS TERMINATED BY  ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\' 
LINES TERMINATED BY '\n';

我意识到使用 escaped by '\' 可以删除导出结果中的反斜杠。

Try this:

SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
FIELDS TERMINATED BY  ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\' 
LINES TERMINATED BY '\n';

I realised using escaped by '\' makes removes the backslash on exported results.

愁杀 2024-10-28 12:35:19

在这里得到答案 https://bugs.mysql.com/bug.php?id= 46434

要点是:
1. INTO OUTFILE 旨在生成可供 LOAD DATA 加载的结果
2. 默认情况下 ESCAPED BY 为 '\'
3. 要禁用转义,请使用 ESCAPED BY ''

Got an answer here https://bugs.mysql.com/bug.php?id=46434

Main points are:
1. INTO OUTFILE is intended to produce results ready to load by LOAD DATA
2. By default ESCAPED BY is '\'
3. To disable escaping, use ESCAPED BY ''

原野 2024-10-28 12:35:19

我通过指定 \r\n 作为行终止符而不是 \n 解决了这个问题:

SELECT * FROM business WHERE id > 0 AND id <= 20000
INTO OUTFILE "business.csv"
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\'
LINES TERMINATED BY '\r\n';

现在每行由 \r\n 分隔,但数据中的任何换行符都不会转义 - 假设其中存在的行分隔符都是 \n,而不是 \r\n

令人惊讶的是,这在 Linux 上运行良好,符合我的目的 - 使用 League\Csv (PHP) 导入。我猜想导入生成的 CSV 的任何软件都必须足够智能,能够区分 \n\r\n 换行符。

I solved this by specifying \r\n as the line terminator, rather than \n:

SELECT * FROM business WHERE id > 0 AND id <= 20000
INTO OUTFILE "business.csv"
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\'
LINES TERMINATED BY '\r\n';

Each row is now separated by \r\n, but any newlines inside your data will be left unescaped – assuming that the line separators present therein are all \n, not \r\n.

Surprisingly, this worked fine on Linux for my purposes – importing using League\Csv (PHP). I'm guessing that whatever software will be importing your generated CSVs has to be smart enough to differentiate between \n and \r\n for line breaks.

白云悠悠 2024-10-28 12:35:19
SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"fields terminated by ',' 
OPTIONALLY ENCLOSED BY '"' ESCAPED BY '' LINES TERMINATED BY '\n';

首先,不要将 '"" 作为转义符,这会改变您的内容。

其次,如果您在 cli 上使用此查询,如下所示,您还需要远程添加多行的额外 '\n' 行。

mysql -e "SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
fields terminated by ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '' 
LINES TERMINATED BY '\n';"
SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"fields terminated by ',' 
OPTIONALLY ENCLOSED BY '"' ESCAPED BY '' LINES TERMINATED BY '\n';

First, do not put '"" as escape, this will change your content.

Second, if you are using this query on a cli as bellow you need to also remote an extra '\n' add row with multiple lines.

mysql -e "SELECT * FROM business WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
fields terminated by ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '' 
LINES TERMINATED BY '\n';"
夏了南城 2024-10-28 12:35:19

我遇到过类似的问题。

  1. 为了防止字段值中存在换行符时出现额外的反冲 \,我添加了 ESCAPED BY '\'
  2. 为了防止 MySQL 终止字段值中存在换行符的字段,我将换行符 \n 替换为空格 ' '
SELECT
    REPLACE(field1, '\n', ' '),
    REPLACE(field2, '\n', ' '),
    REPLACE(fieldN, '\n', ' ')
FROM business
WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\' LINES TERMINATED BY '\n';

I experienced similar problem.

  1. To prevent extra backlash \ where there are newlines in field values, I added ESCAPED BY '\'.
  2. To prevent MySQL terminating fields where newlines exist within a field value, I replaced newline \n with space ' '.
SELECT
    REPLACE(field1, '\n', ' '),
    REPLACE(field2, '\n', ' '),
    REPLACE(fieldN, '\n', ' ')
FROM business
WHERE id > 0 AND id <= 20000 INTO OUTFILE "business.csv"
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\' LINES TERMINATED BY '\n';
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文