SQL Server BULK INSERT - 插入日期时间值

发布于 2024-11-30 18:11:58 字数 1531 浏览 4 评论 0原文

我有 600 万行数据想要插入到我的 SQL Server 数据库中。我可以用 600 万条 INSERT 语句以缓慢的方式完成(根据我的计算,运行需要 18 个小时),或者我可以尝试 BULK INSERT。

BULK INSERT 存在无法转义字符的问题,但本例中的数据非常简单,因此不应遇到此问题。

然而,SQL Server 似乎不喜欢将任何形式的日期/时间数据插入到字段中。

这是表(伪 SQL)

CREATE TABLE Tasks (
    TaskId bigint NOT NULL IDENTITY(1,1) PRIMARY KEY,
    TriggerId bigint NOT NULL FOREIGN KEY,
    Created datetime NOT NULL,
    Modified datetime NOT NULL,
    ScheduledFor datetime NULL,
    LastRan datetime NULL,
    -- and about 10 more fields after this
)

这是我的 BULK INSERT 语句:

SET DATEFORMAT dmy
BULK INSERT Tasks
FROM 'C:\TasksBulk.dat'
WITH (
    -- CHECK_CONSTRAINTS is not necessary as the only constraints are always enforced regardless of this option (UNIQUE, PRIMARY KEY, and NOT NULL)
    CODEPAGE = 'RAW',
    DATAFILETYPE = 'native',

    KEEPIDENTITY,
    MAXERRORS = 1,
    ORDER ( CallId ASC ),

    FIELDTERMINATOR = '\t',
    ROWTERMINATOR   = '\0'
)

这是 TasksBulk.dat 中的第一行数据:(

1000\t1092\t01/01/2010 04:00:17\t01/01/2010 04:00:17\t\t01/01/2010 04:00:14\0

为了可读性,用制表符重新格式化为 4 个空格:)

1000    1092    01/01/2010 04:00:17    01/01/2010 04:00:17        01/01/2010 04:00:14\0

但是,当我运行 BULK 时INSERT 语句,我收到此错误:

消息 4864,级别 16,状态 1,第 2 行批量加载数据转换错误 (为指定的代码页键入不匹配或无效字符) 第 1 行,第 3 列(已创建)。

我尝试使用不同的行和字段终止符以及每种不同的日期/时间格式(包括“01/01/2010”、“2010-01-01”,带或不带“04:00:17”时间组件)。我不知道我在这里做错了什么。

I have 6 million rows worth of data I want to insert into my SQL Server database. I can do it the slow way with 6 million INSERT statements (by my calculation it would take 18 hours to run) or I can try BULK INSERT.

BULK INSERT has issues with not being able to escape characters, but the data in this case is very simple and so shouldn't run into this problem.

However SQL Server doesn't seem to like any form of date/time data to be inserted into a field.

Here is the table (psuedo-SQL)

CREATE TABLE Tasks (
    TaskId bigint NOT NULL IDENTITY(1,1) PRIMARY KEY,
    TriggerId bigint NOT NULL FOREIGN KEY,
    Created datetime NOT NULL,
    Modified datetime NOT NULL,
    ScheduledFor datetime NULL,
    LastRan datetime NULL,
    -- and about 10 more fields after this
)

Here is my BULK INSERT statement:

SET DATEFORMAT dmy
BULK INSERT Tasks
FROM 'C:\TasksBulk.dat'
WITH (
    -- CHECK_CONSTRAINTS is not necessary as the only constraints are always enforced regardless of this option (UNIQUE, PRIMARY KEY, and NOT NULL)
    CODEPAGE = 'RAW',
    DATAFILETYPE = 'native',

    KEEPIDENTITY,
    MAXERRORS = 1,
    ORDER ( CallId ASC ),

    FIELDTERMINATOR = '\t',
    ROWTERMINATOR   = '\0'
)

And here is the first row of data in TasksBulk.dat:

1000\t1092\t01/01/2010 04:00:17\t01/01/2010 04:00:17\t\t01/01/2010 04:00:14\0

(For readability, reformatted with tab characters replaced with 4-spaces:)

1000    1092    01/01/2010 04:00:17    01/01/2010 04:00:17        01/01/2010 04:00:14\0

However when I run the BULK INSERT statement, I get this error:

Msg 4864, Level 16, State 1, Line 2 Bulk load data conversion error
(type mismatch or invalid character for the specified codepage) for
row 1, column 3 (Created).

I have tried using different row and field terminators and every different date/time format (including "01/01/2010", "2010-01-01", both with and without the "04:00:17" time component). I don't know what I'm doing wrong here.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

§普罗旺斯的薰衣草 2024-12-07 18:11:58

事实证明,将 DATAFILETYPE 从“native”更改为“char”解决了问题。 “native”类型意味着所有内容都有严格的数据格式,而“char”则适用于更纯文本的文件。

It turns out that changing the DATAFILETYPE from 'native' to 'char' solved the problem. The 'native' type implies a strict data format for everything, whereas 'char' is meant for more plaintext files.

迷你仙 2024-12-07 18:11:58

您将 CODDEPAGE 设置为 RAW(大概是为了速度)。

该错误消息意味着您的数据包含代码页之外的字符。

代码页 [ = 'ACP' | 'OEM'| '原始'| '代码页']

指定数据文件中数据的代码页。代码页是
仅当数据包含 char、varchar 或 text 列时相关
字符值大于 127 或小于 32。

但这可能会产生误导。您的示例数据行包含缺失的列。如果不使用格式文件,则必须使用表中的每个字段。

因此,您可以创建一个格式文件,或者为日期时间列创建一个使用 varchar(25) 的临时表,导入然后执行从临时表到目标表的更新。这样您就可以更好地控制转换和丢失数据。

You have your CODDEPAGE set to RAW (presumably for speed).

The error message implies your data contains characters outside the codepage.

CODEPAGE [ = 'ACP' | 'OEM' | 'RAW' | 'code_page' ]

Specifies the code page of the data in the data file. CODEPAGE is
relevant only if the data contains char, varchar, or text columns with
character values greater than 127 or less than 32.

But that could be misleading. Your example data line contains a missing column. If you don't use a format file, every field in the table must be used.

So you could either create a format file or create a staging table with varchar(25) for the datetime columns, importing and then perform an update from the staging table into the destination table. That way you have more control over conversions and missing data.

我要还你自由 2024-12-07 18:11:58

我熟悉的一种方法是以整数的形式插入日期。

我使用从某个日期开始的秒数(我使用过去 10 多年的秒数,因为我不会访问或生成比该日期更旧的任何数据)

日期 2012-01-02 12: 15:10.000 将存储为 378637886,使用 2000 年 1 月 1 日的参考点。

查询数据库时,可以使用 DateAdd(SS,column_name,'2000-01-01')。

如果需要这种精度,您也可以在毫秒内完成此操作。

我使用自己的自定义函数将时间(以秒为单位)转换为我想要的任何格式,并使用另一个自定义函数将日期转换回秒。

我意识到这可能不是一个好方法,因为它可能需要您更改数据库和代码,但也许它可能是其他人会发现有用的解决方案概念。

A method I'm familiar with is to insert your dates in the form of an integer.

I use the number of seconds starting from a certain date (I use one over 10 years in the past, since there isn't any data I would be accessing or generating that is older than that)

The date 2012-01-02 12:15:10.000 would be stored as 378637886, using the reference point of January 1st, 2000.

When querying the database, you could have the column returned using DateAdd(SS, column_name, '2000-01-01').

You could do this in millieseconds too if that kind of precision was needed.

I use my own custom function to convert my time in seconds into whatever format I would like, and I use another custom function to turn dates back into seconds.

I realize this may not be a good method because it could require database changes and code changes for you, but perhaps it could be a solution concept that others would find useful.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文