SSIS 将平面文件传输到具有固定位置的表时出现问题
我有几个关于我所遇到的任务的问题,任何答案将不胜感激。
我必须从平面文件 (CSV) 中提取数据作为输入,并根据位置以特定格式将数据加载到目标表中。
例如,如果我有 order_id,Total_sales,Date_Ordered
,其中包含一些数据,我必须提取数据并将其加载到表中,如下所示:
第一个字段的固定长度为2 以数字作为数据类型。
total_sales
插入到表中的total_sales
列中,数据类型为数字,长度为 10。日期为日期时间,其格式与平面文件,例如
ccyy-mm-dd.hh.mm.ss.xxxxxxxx
(此处x
必须用零填充)。
也许我没有正确的想法来解决这个问题 - 任何解决方案将不胜感激。
我尝试使用以下方法:
使用平面文件源获取 CSV 文件,然后将其作为 OLE DB 目标的输入,并创建固定数据类型的表。这里的问题是列已加载,但我必须用零填充它们,以防加载日期或在大多数列中,如果我没有利用总长度,那么它必须在前面加零它。
例如,如果我有一个长度为 4 的 Orderid,并且在平面文件中我有一个类似于
201
的订单 ID,那么当它发生时,它必须更改为0201
已加载到表中。我还尝试了另一种使用平面文件源的方法,并创建了一个变量,该变量将整行作为输入,并尝试将其与派生列分开。我在一定程度上成功地获得了它,但最终派生列中的数据类型明确固定为布尔类型,我无法将其更改为我想要的数据类型。
请给我一些关于如何处理这个问题的建议......
I have a couple of questions about the task on which I am stuck and any answer would be greatly appreciated.
I have to extract data from a flat file (CSV) as an input and load the data into the destination table with a specific format based on position.
For example, if I have order_id,Total_sales,Date_Ordered
with some data in it, I have to extract the data and load it in a table like so:
The first field has a fixed length of 2 with numeric as a datatype.
total_sales
is inserted into the column oftotal_sales
in the table with a numeric datatype and length 10.date as datetime in a format which would be different than that of the flat file, like
ccyy-mm-dd.hh.mm.ss.xxxxxxxx
(herex
has to be filled up with zeros).
Maybe I don't have the right idea to solve this - any solution would be appreciated.
I have tried using the following ways:
Used a flat file source to get the CSV file and then gave it as an input to OLE DB destination with a table of fixed data types created. The problem here is that the columns are loaded, but I have to fill them up with zeros in case the date when it is been loaded or in most of the columns if I am not utilizing the total length then it has to preceded with zeros in it.
For example, if I have an Orderid of length 4 and in the flat file I have an order id like
201
then it has to be changed to0201
when it is loaded in the table.I also tried another way of using a flat file source and created a variable which takes the entire row as an input and tried to separate it with derived columns. I was to an extent successful in getting it, but at last the data type in the derived column got fixed to Boolean type explicitly, which I am not able to change to the data type I want.
Please give me some suggestions on how to handle this issue...
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
假设您有一个以下格式的 csv 文件,
我将首先创建一个平面文件源(在数据流任务内),但不是将其固定宽度,而是将格式设置为“分隔”。勾选第一个数据行中的列名称。在列选项卡上,确保行分隔符设置为“{CR}{LF}”,列分隔符设置为“逗号(,)”。最后,在“高级”选项卡上,将每列的数据类型设置为整数、小数和日期。
您提到在将数字数据类型存储在数据库中时,您希望用前导零填充数字数据类型。数据库中的数字数据类型往往不包含前导零。所以你有两个选择;将数据保留为目标系统中的类型(int、decimal 和 dateTime),或者使用“派生列”控件将其转换为字符串。如果您决定将它们存储为字符串,则添加类似于派生列控件的表达式
将为订单 ID 添加最多 5 个前导零(不要忘记将数据类型长度设置为 5),并会产生订单 ID “00001”
在数据流目标中创建目标并相应地进行表/字段映射(或让 SSIS 为您创建新表/映射)。
Assuming you have a csv file in the following format
I would start by creating a Flat File Source (inside a Data Flow Task), but rather than having it fixed width, set the format to Delimited. Tick the Column names in the first data row. On the column tab, make sure row delimiter is set to "{CR}{LF}" and column delimiter is set to "Comma(,)". Finally, on the Advanced tab, set the data types of each column to integer, decimal and date.
You mention that you want to pad the numeric data types with leading zero's when storing them in the database. Numeric data types in databases tend not to hold leading zero's. So you have two options; either hold the data as the type they are in the target system (int, decimal and dateTime) or use the Derived Column control to convert them to strings. If you decide to store them as strings, adding an expression like
to the Derived Column control will add up to 5 leading zeros to order id (don't forget to set the data type length to 5) and would result in an order id of "00001"
Create your target within a Data Flow Destination and make the table/field mappings accordingly (or let SSIS create a new table / mappings for you).