MySQL export into outfile: CSV escaping characters

Posted on 2024-07-27 00:31:15 · 918 words · 7 views · 0 comments

I have a database table of timesheets with some common fields.

id, client_id, project_id, task_id, description, time, date 

There are more, but that's the gist of it.

I have an export running on that table to a CSV file overnight to give the user a backup of their data. It is also used as the data import for a macro-enabled Excel file with some custom reports.

This all worked by looping through the timesheets with PHP and printing the lines to a file.

The problem is that with a big database it can take hours to run, which isn't acceptable. So I rewrote it with the MySQL INTO OUTFILE command, which cut the run time down to a few seconds. That was great.

The problem now is that I can't seem to escape all the newline characters, etc., in the description field. A user can type potentially any combination of characters in here, including carriage returns and newlines.

This is a snippet of the MySQL code I have:

SELECT id, 
       client,
       project,
       task,
       REPLACE(REPLACE(ifnull(ts.description,''),'\n',' '),'\r',' ') AS description, 
       time,
       date  
      INTO OUTFILE '/path/to/file.csv'
      FIELDS ESCAPED BY '""'
      TERMINATED BY ',' ENCLOSED BY '"'
      LINES TERMINATED BY '\n'
      FROM ....

But...

When I look at the source of the output file, newlines still exist in it, so the CSV import breaks all the fancy macros and pivot tables our Excel wizard has created.

Any thoughts on a best course of action?

Comments (6)

美羊羊 2024-08-03 00:31:15

I think your statement should look like:

SELECT id, 
   client,
   project,
   task,
   description, 
   time,
   date  
  INTO OUTFILE '/path/to/file.csv'
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n'
  FROM ts

The main change is dropping the FIELDS ESCAPED BY '""' option. OPTIONALLY ENCLOSED BY '"' will do the trick for description fields etc., and your numbers will be treated as numbers in Excel (not strings made up of digits).
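
To make the shape of an "optionally enclosed" file concrete, here is a small Python sketch. It uses Python's csv module as a stand-in for MySQL, and the sample row is invented for illustration. With QUOTE_NONNUMERIC the writer quotes strings and leaves numbers bare, which is roughly what OPTIONALLY ENCLOSED BY '"' produces, since MySQL then encloses only string-type columns.

```python
import csv
import io

def write_row(row):
    """Write one row the way an 'optionally enclosed' export looks:
    strings quoted, numbers left bare."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(row)
    return buf.getvalue()

# Hypothetical timesheet row: id, client, description, time
line = write_row([42, "Acme, Inc.", 'Fixed "login" bug', 1.5])
print(line, end="")
```

The comma inside the client name is harmless because the field is quoted, and the numeric columns carry no quotes, so a spreadsheet can read them as numbers.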

Also try calling:

SET NAMES utf8;

before your outfile select; that might help get the character encodings in line (all UTF-8).

Let us know how you get on.

栩栩如生 2024-08-03 00:31:15

Here is what worked for me. It simulates Excel 2003 (Save as CSV format):

SELECT 
REPLACE( IFNULL(notes, ''), '\r\n' , '\n' )   AS notes
FROM sometables
INTO OUTFILE '/tmp/test.csv' 
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '"'
LINES TERMINATED BY '\r\n';
  1. Excel saves \r\n for line separators.
  2. Excel saves \n for newline characters within column data
  3. You have to replace \r\n inside your data first; otherwise Excel will think it's the start of the next row.
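
One way to check a file in this format without opening Excel is to parse it with a CSV reader. The following is a minimal Python sketch, assuming file content (made up here) that mimics what the statement above writes: \r\n between records and a bare \n inside a quoted field.

```python
import csv
import io

# Hypothetical file content: two records separated by \r\n,
# the first with a bare \n inside its quoted notes field.
data = '"1","line one\nline two"\r\n"2","single line"\r\n'

# newline="" keeps the original line endings so the csv module
# can tell record terminators from newlines inside quotes.
rows = list(csv.reader(io.StringIO(data, newline="")))

for row in rows:
    print(row)
```

For a real export, the same check would read the file with open(path, newline="") instead of the in-memory string. If the number of parsed rows matches the number of rows selected, the embedded newlines have stayed inside their fields.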

半仙 2024-08-03 00:31:15

What happens if you try the following?

Instead of your double REPLACE statement, try:

REPLACE(IFNULL(ts.description, ''),'\r\n', '\n')

Also, I think it should be LINES TERMINATED BY '\r\n' instead of just '\n'

菊凝晚露 2024-08-03 00:31:15

Without actually seeing your output file for confirmation, my guess is that you've got to get rid of the FIELDS ESCAPED BY value.

MySQL's FIELDS ESCAPED BY is probably behaving in two ways that you were not counting on: (1) it is only meant to be one character, so in your case it is probably equal to just one quotation mark; (2) it is used to precede each character that MySQL thinks needs escaping, including the FIELDS TERMINATED BY and LINES TERMINATED BY values. This makes sense to most of the computing world, but it isn't the way Excel does escaping.

I think your double REPLACE is working, and that you are successfully replacing literal newlines with spaces (two spaces in the case of Windows-style newlines). But if you have any commas in your data (literals, not field separators), these are being preceded by quotation marks, which Excel treats much differently than MySQL. If that's the case, then the erroneous newlines that are tripping up Excel are actually newlines that MySQL had intended as line terminators.
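
The two escaping conventions can be compared in a small Python sketch. It uses Python's csv module rather than MySQL, and the backslash escape character and the sample value are assumptions chosen for illustration. One writer prefixes an embedded quote with an escape character (the style ESCAPED BY describes); the other doubles it (the style Excel reads).

```python
import csv
import io

def render(value, **dialect):
    """Write a single quoted field with the given csv dialect options."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n", **dialect)
    writer.writerow([value])
    return buf.getvalue().rstrip("\n")

value = 'He said "hi", then left'

# Escape-prefix style: the embedded quote is preceded by an escape character.
prefixed = render(value, doublequote=False, escapechar="\\")

# Excel style: the embedded quote is doubled, nothing else is touched.
doubled = render(value)

print(prefixed)
print(doubled)
```

A reader that expects doubled quotes will not interpret the prefixed form the same way, which is the kind of mismatch described above.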

千秋岁 2024-08-03 00:31:15

This probably won't help, but you could try creating a CSV table with that content:

DROP TABLE IF EXISTS foo_export;
CREATE TABLE foo_export LIKE foo;
ALTER TABLE foo_export ENGINE=CSV;
INSERT INTO foo_export SELECT id, 
   client,
   project,
   task,
   REPLACE(REPLACE(ifnull(ts.description,''),'\n',' '),'\r',' ') AS description, 
   time,
   date
  FROM ....

新一帅帅 2024-08-03 00:31:15

The procedure below resolved all the escaping issues for me, and it also turns the export into a more generic utility.

CREATE PROCEDURE `export_table`(
IN tab_name varchar(50), 
IN select_columns varchar(1000),
IN filename varchar(100),
IN where_clause varchar(1000),
IN header_row varchar(2000))

BEGIN
INSERT INTO impl_log_activities(TABLE_NAME, LOG_MESSAGE,CREATED_TS) values(tab_name, where_clause,sysdate());
COMMIT;
SELECT CONCAT( "SELECT ", header_row,
    " UNION ALL ",
    "SELECT ", select_columns, 
    " INTO OUTFILE ", "'",filename,"'"
    " FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' ESCAPED BY '""' ",
    " LINES TERMINATED BY '\n'"
    " FROM ", tab_name, " ",
    (case when where_clause is null then "" else where_clause end)
) INTO @SQL_QUERY;

INSERT INTO impl_log_activities(TABLE_NAME, LOG_MESSAGE,CREATED_TS) values(tab_name, @SQL_QUERY, sysdate());
COMMIT;

PREPARE stmt FROM @SQL_QUERY;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

END
