使用 SSIS 读取具有多种记录类型的平面文件
我们正在评估 SSIS,看看它是否适合即将推出的新项目。其中一个进程必须处理带有分隔记录的平面文件。该文件将包含订单。有标题行、(可选)送货地址行和一个或多个详细信息行。每行的字段都被分隔,但格式不同。
我读到了这个答案:
我可以使用条件拆分任务拆分数据以生成多个输出,但不确定如何从那里继续。我有两个需要解决的问题:
应首先在地址和详细信息之前插入订单标题,因为地址和详细信息将引用订单记录,所以我认为我需要先处理该输出,但我我不确定在 SSIS 中如何使条件拆分任务的该分支在其他分支之前处理。理想情况下,我想处理订单标头,然后将订单 ID 存储在用户变量中,以便在处理详细信息时,我可以引用该变量。
文件中会有多个订单,因此拆分会更复杂。
我总是可以用 C# 编写一个应用程序来预处理文件或将文件读入暂存表,但我不确定我是否喜欢这种方法。
有经历过这个过程的人可以分享一些他们如何处理这个过程的见解吗?
谢谢,
克里斯
We are evaluating SSIS to see if it will be appropriate for a new project that is coming up. One of the processes will have to process a flat file with delimited records. The file will contain orders. There is a header line, an (optional) shipping address line, and one or more detail lines. Each line's fields are delimited but are not the same format.
I read this answer:
SSIS transactional data (different record types, one file)
And I can split the data using the Conditional Split task to produce several outputs, but am not sure how to proceed from there. I have two issues that I need to resolve:
The order header should be inserted first, before the address and details since the address and details will reference the order record, so I think I need to process that output first, but I'm not sure in SSIS how to make that branch of the Conditional Split task be processed before the other branches. Ideally, I would like to process the order header and then store the order id in a user variable so that when processing the details, I can reference that variable.
There will be multiple orders in the file, so splitting it is more complex.
I could always write an application in C# that will preprocess the file or read the file into a staging table, but I'm not sure I like that approach.
Can anyone who has been through this process share some insights into how they dealt with it?
Thanks,
Chris
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
分割后,将每种类型的记录存入其自己的暂存表中 - 或存入 ssis 原始数据目标中,这对于像这样的中间步骤来说更快且更好。然后将所有标头加载到最终表中并继续进行,不会出现引用错误。
我假设详细记录中有 headerID?这应该会让你的第二个问题的处理变得容易。如果没有,请告诉我们。
After the split, deposit each type of record into it's own staging table - or into a ssis raw data destination, which is faster and good for intermediate steps like this. Then load all the headers into their final table and proceed without referential errors.
I'm assuming the detail records have a headerID in them? That should make dealing with your 2nd question easy. If not, let us know.