Bulk insert, SQL Server 2000, unix linebreaks

Posted 2024-07-12 22:09:23

I am trying to bulk insert a .csv file that has Unix line breaks into a database. The command I am running is:

BULK INSERT table_name
FROM 'C:\file.csv' 
WITH 
( 
    FIELDTERMINATOR = ',', 
    ROWTERMINATOR = '\n' 
)

If I convert the file into Windows format the load works, but I don't want to do this extra step if it can be avoided. Any ideas?

离线来电— 2024-07-19 22:09:36

It comes down to this: Unix uses LF (Ctrl-J), while MS-DOS/Windows uses CR/LF (Ctrl-M/Ctrl-J).

When you use '\n' on Unix, it gets translated to an LF character. On MS-DOS/Windows it gets translated to CR/LF. When your import runs on the Unix-formatted file, it sees only an LF, so it never finds the CR/LF terminator it expects. Hence, it's often easier to run the file through unix2dos first. But as you said in your original question, you don't want to do this (I'll assume there is a good reason why you can't).

Why can't you do:

(ROWTERMINATOR = CHAR(10))

Probably because when the SQL code is being parsed, it is not replacing the char(10) with the LF character (because it's already encased in single quotes). Or perhaps it's being interpreted as:

(ROWTERMINATOR =
     )

What happens when you echo out the contents of @bulk_cmd?

深海夜未眠 2024-07-19 22:09:35

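I would think "ROWTERMINATOR = '\n'" should work. I suggest opening the file in a tool that shows "hidden characters" to make sure each line is terminated the way you think it is. I use Notepad++ for this sort of thing.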
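If no such editor is at hand, the same check can be scripted. The following is a minimal sketch and not part of the original answer; `count_line_endings` is a hypothetical helper name, and it assumes the file is small enough to read into memory as raw bytes.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LF and bare CR terminators in raw bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair contributes one CR and one LF, so subtract the pairs
    # to get the terminators that stand alone.
    bare_lf = data.count(b"\n") - crlf
    bare_cr = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": bare_lf, "CR": bare_cr}


# One Windows-style line followed by two Unix-style lines.
sample = b"a,b\r\nc,d\ne,f\n"
print(count_line_endings(sample))  # {'CRLF': 1, 'LF': 2, 'CR': 0}
```

For a real file, read it in binary mode (for example `open(path, "rb").read()`) so that Python does not translate the line endings before they are counted.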

紧拥背影 2024-07-19 22:09:34

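As I see it, there are two general avenues: find some alternative way to read the CSV in the SQL script, or convert the CSV beforehand using any of the many ways to do so (bcp, unix2dos, or, if it is a one-time thing, even your code editor can fix the file).

But you will have to take an extra step!

If this SQL is launched from a program, you may want to convert the line endings in that program. In that case, if you decide to code the conversion yourself, here is what you need to watch out for:
1. The line ending might be \n
2. or \r\n
3. or even \r (Mac!)
4. good grief, some lines may have \r\n while others have \n; any combination is possible unless you control where the CSV comes from

OK, OK. Possibility 4 is far-fetched. It happens in email, but that is another story.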
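The four cases above can be handled in one pass. This is a minimal sketch, not the answerer's code: `to_crlf` is a hypothetical helper, and it assumes the data contains no quoted fields with embedded line breaks, since those would be rewritten as well.

```python
import re

# Match CRLF first so a Windows terminator is treated as one unit,
# then fall back to a lone CR (old Mac) or a lone LF (Unix).
_LINE_ENDING = re.compile(rb"\r\n|\r|\n")


def to_crlf(data: bytes) -> bytes:
    """Rewrite every line terminator, including mixed ones, as CRLF."""
    return _LINE_ENDING.sub(b"\r\n", data)


mixed = b"a,b\nc,d\r\ne,f\rg,h"
print(to_crlf(mixed))  # b'a,b\r\nc,d\r\ne,f\r\ng,h'
```

Because CRLF is matched as a single unit, running the function on an already converted file leaves it unchanged.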

锦爱 2024-07-19 22:09:33

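One option would be to use bcp, and set up a control file (format file) with '\n' as the line break character.

Although you've indicated that you would prefer not to, another option would be to use unix2dos to pre-process the file into one with '\r\n' line breaks.

Finally, you can use the FORMATFILE option on BULK INSERT. This uses a bcp control file to specify the import format.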

羞稚 2024-07-19 22:09:32

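It's a bit more complicated than that! When you tell SQL Server ROWTERMINATOR='\n', it interprets this as meaning the default row terminator under Windows, which is actually "\r\n" (using C/C++ notation). If your row terminator is really just "\n", you will have to use the dynamic SQL shown elsewhere in this thread. I have just spent the best part of an hour figuring out why \n doesn't really mean \n when used with BULK INSERT!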

一向肩并 2024-07-19 22:09:31

I confirm that the syntax

ROWTERMINATOR = '''+CHAR(10)+'''

works when used with an EXEC command.

If you have multiple ROWTERMINATOR characters (e.g. a pipe and a unix linefeed) then the syntax for this is:

ROWTERMINATOR = '''+CHAR(124)+''+CHAR(10)+'''

凯凯我们等你回来 2024-07-19 22:09:28

Thanks to all who have answered but I found my preferred solution.

When you tell SQL Server ROWTERMINATOR='\n' it interprets this as meaning the default row terminator under Windows which is actually "\r\n" (using C/C++ notation). If your row terminator is really just "\n" you will have to use the dynamic SQL shown below.

DECLARE @bulk_cmd varchar(1000)
SET @bulk_cmd = 'BULK INSERT table_name
FROM ''C:\file.csv''
WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = '''+CHAR(10)+''')'
EXEC (@bulk_cmd)

Why you can't say BULK INSERT ...(ROWTERMINATOR = CHAR(10)) is beyond me. It doesn't look like you can evaluate any expressions in the WITH section of the command.

What the above does is build the command as a string and execute it, neatly sidestepping the need to create an additional file or go through extra steps.
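The same idea applies when the statement is assembled in a client program instead of in T-SQL: the text sent to the server just needs a literal LF between the quotes. The sketch below only builds the statement text and does not execute it. `build_bulk_insert` is a hypothetical helper, the table name is assumed to be trusted because it is interpolated as-is, and whether a given database driver passes the embedded LF through unchanged is an assumption to verify.

```python
def build_bulk_insert(table: str, path: str,
                      field_term: str = ",", row_term: str = "\n") -> str:
    """Build BULK INSERT text whose row terminator is embedded literally."""

    def quote(value: str) -> str:
        # T-SQL string literal: wrap in single quotes, double any embedded ones.
        return "'" + value.replace("'", "''") + "'"

    # In Python, "\n" is the single LF character (0x0A), so the statement
    # ends up with a real linefeed inside the quotes, which is what
    # '''+CHAR(10)+''' produces in the dynamic SQL version.
    return (
        f"BULK INSERT {table}\n"
        f"FROM {quote(path)}\n"
        f"WITH (FIELDTERMINATOR = {quote(field_term)}, "
        f"ROWTERMINATOR = {quote(row_term)})"
    )


statement = build_bulk_insert("table_name", r"C:\file.csv")
print(repr(statement))
```

Printing with `repr` makes the embedded linefeed visible as `\n`, which is a quick way to confirm what the server will receive. A two-character terminator such as a pipe followed by LF would be passed as `row_term="|\n"`.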
