Uploading multi-line records into SQL Server
We receive fixed length datasets from a client that look something like this:
1 SOMEFILE 20110922
2 20110101ABC999
3 JOHN SMITH 19800201
4 5000000 1000
2 20060101DEF999
3 JANE KOTZE 19811001
4 200000 800
5 5200000 1800
where the number in the first position on each line indicates the type of information in the line. The types are:
1 Header record (only appears once, in the first line)
2 Contract record
3 Person record
4 Amounts record
5 Trailer record (only appears once, in the last line)
The information in lines 2, 3 and 4 all relates to one record, and I need to find a way at the upload stage to combine them into one. There are no identifiers that explicitly specify which combinations of 2, 3 and 4 belong with one another, but in all cases they have been ordered in the raw data to appear directly below one another.
What I need is a preprocessing step that will take the original data and then combine the correct 2, 3 and 4 lines into one record (and then output again as a txt file), like this:
20110101ABC999JOHN SMITH 198002015000000 1000
20060101DEF999JANE KOTZE 19811001200000 800
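To make the intended preprocessing step concrete, here is a minimal sketch in Python (not SQL or Access). It assumes the record type is the first character of each line and the payload starts at offset 2, which matches the sample above but not necessarily the real fixed-width layout. `combine_records` and `PAYLOAD_START` are hypothetical names used only for illustration.

```python
from typing import Iterable, Iterator

PAYLOAD_START = 2  # assumption: type digit, one separator, then the payload


def combine_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield one merged line per consecutive 2/3/4 group, in file order."""
    group = []  # payloads collected for the group currently being built
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        rec_type = line[0]
        if rec_type in "15":
            # Header and trailer are skipped, but they must not
            # interrupt a half-built 2/3/4 group.
            if group:
                raise ValueError(f"line {line_no}: incomplete group before type {rec_type}")
            continue
        expected = "234"[len(group)]
        if rec_type != expected:
            raise ValueError(f"line {line_no}: expected type {expected}, got {rec_type}")
        group.append(line[PAYLOAD_START:])
        if len(group) == 3:
            yield "".join(group)
            group = []
    if group:
        raise ValueError("input ended inside an incomplete group")


sample = [
    "1 SOMEFILE 20110922",
    "2 20110101ABC999",
    "3 JOHN SMITH 19800201",
    "4 5000000 1000",
    "2 20060101DEF999",
    "3 JANE KOTZE 19811001",
    "4 200000 800",
    "5 5200000 1800",
]
for merged in combine_records(sample):
    print(merged)
```

Because the file is read sequentially, the grouping depends only on the order of lines in the file and not on any key assigned later by the database. The sketch raises an error if a 2/3/4 group is out of sequence instead of silently pairing the wrong lines.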
I have thought of bcp'ing into SQL (or even just using Access) and assigning an auto-incremented integer as PK, i.e.:
PK Type Record
1 1 SOMEFILE 20110922
2 2 20110101ABC999
3 3 JOHN SMITH 19800201
4 4 5000000 1000
5 2 20060101DEF999
6 3 JANE KOTZE 19811001
7 4 200000 800
8 5 5200000 1800
and then doing something like:
select
type2.[record] + type3.[record] + type4.[record]
from
(select PK, [record] from uploaded where [type]=2) as type2
join
(select PK, [record] from uploaded where [type]=3) as type3
on type2.PK + 1 = type3.PK
join
(select PK, [record] from uploaded where [type]=4) as type4
on type2.PK + 2 = type4.PK
But what I am worried about is that this is entirely dependent on SQL Server assigning the PKs in the order that the data appears in the input file; I am not sure that this would necessarily be the case.
Does anyone know? Or know of a better way to do this?
Thanks
Karl
Edit: added second solution
Solution 1:
You cannot be sure of the order in which SQL Server inserts rows.
You have to do some text-file processing before importing your data into SQL Server. For example, you can use PowerShell to add a PK to each line of the file, turning the content of MyData.txt (before) into the content of MyDataResults.txt (after PowerShell processing).
In both PS scripts I assume you insert at most 99999 rows.
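As an illustration of the idea in Solution 1, here is a minimal sketch in Python rather than PowerShell; it is not the script from this answer. It prefixes every line with a zero-padded sequence number so that the file order travels with the data. The names `add_line_keys`, `MAX_ROWS` and `KEY_WIDTH`, and the choice of a space between key and line, are assumptions made for the example.

```python
from typing import Iterable, Iterator

MAX_ROWS = 99999                # limit stated in the answer
KEY_WIDTH = len(str(MAX_ROWS))  # 5 digits, so every key has the same width


def add_line_keys(lines: Iterable[str]) -> Iterator[str]:
    """Prefix each line with its 1-based position in the file, zero-padded."""
    for n, raw in enumerate(lines, start=1):
        if n > MAX_ROWS:
            raise ValueError(f"more than {MAX_ROWS} rows; widen the key")
        line = raw.rstrip("\r\n")
        # Assumed layout: fixed-width key, one space, then the original line.
        yield f"{n:0{KEY_WIDTH}d} {line}"


before = ["1 SOMEFILE 20110922", "2 20110101ABC999", "3 JOHN SMITH 19800201"]
for keyed in add_line_keys(before):
    print(keyed)
```

Once the key is part of each line, it can be loaded into its own column and used as the PK in joins such as `type2.PK + 1 = type3.PK`, without relying on the order in which the server happens to insert rows.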
Solution 2:
Results: