SSIS: Removing characters from flat files
What's the best way to strip out characters from flat files in SSIS? In my case, I need to remove all quotes from the file before processing.
EDIT:
How can I run an executable against some files from SSIS? Can I somehow use the source connection as an input or would I have to pass in the file names as parameters?
The easiest way to do this would be to create a "Transformation" script component and use code to strip the quotes.
I did this using a Derived Column Transformation, e.g. to replace ', ", and # in SSIS with a derived column expression.
But I feel this approach is only good if 1 or 2 columns need to be filtered. If more are involved, go ahead with a Script Task.
Since it's something that you have to do for all the fields on your files, I'd recommend doing it as a first step of the process and not as an operation in the transformation workflow.
You can code your own .NET script and embed it on a Script task.
You can also call a third party tool or component via an Execute Process Task.
For instance if you have access to cygwin unix command-line, something like this should do the work:
You can call an executable via the mentioned "Execute process task" component, and you can parametrize its inputs by setting expressions on the component's attributes. Those expressions can be based on input variables that might be configured via configuration files. (this is just one of the many ways that SSIS provides to achieve something like this)
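As a rough illustration of the "external tool" route, here is a minimal Python sketch of a standalone script that an Execute Process Task could invoke, with the file names passed as arguments. The script name `strip_quotes.py`, the `.clean` output suffix, and the UTF-8 encoding are all assumptions made for this example; none of them come from SSIS or from the answer above.

```python
import sys
from pathlib import Path


def strip_quotes(text, chars='"'):
    """Return text with every character listed in `chars` removed."""
    return text.translate({ord(c): None for c in chars})


def clean_file(path, encoding="utf-8"):
    """Write a quote-free copy of `path` next to it and return the new path.

    data.txt becomes data.clean.txt (hypothetical naming convention).
    newline="" keeps the original line endings (e.g. CRLF) untouched.
    """
    path = Path(path)
    cleaned = path.with_name(path.stem + ".clean" + path.suffix)
    with path.open("r", encoding=encoding, newline="") as src, \
         cleaned.open("w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(strip_quotes(line))
    return cleaned


# Only act when file names were actually passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for name in sys.argv[1:]:
        print(clean_file(name))
```

In the Execute Process Task, the Arguments property could then be driven by an expression built from a package variable holding the file path, so the same script serves every file. Note that this removes every double quote, including any that are legitimately part of the data.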
If I understand your question correctly, you would like to remove any quotes from any of the column values that are in your text file? If this is the case, you would use a derived column transformation. You would select Replace "column_name" in the Derived Column Name drop down. You would then populate the expression property with the following code:
REPLACE([column_name], "\"", "")
Hope this helps.
Both unclepaul84 and Ryan Fonnetts' solutions would work, but personally I lean toward unclepaul's because I now have multiple files that need quotes stripped out, and I can use the same transformation code for every one (which is nice).
These are solutions, but I believe there is easier functionality built right into the Flat File Connection Manager.
On the very first tab, General, you will find the first text box under the Format section, called Text qualifier. Enter one double quotation mark (or a single quote if all columns are wrapped in single quotes) here and click Preview.
This should solve your issue!
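To show why a text qualifier is not quite the same thing as deleting every quote, here is a small Python sketch. Python's `csv` module only stands in for the Flat File Connection Manager here, and the sample row is made up for illustration.

```python
import csv
import io

# Hypothetical row: the first field contains the column delimiter.
sample = '"Smith, John","42","New York"\r\n'

# Blind removal: once the quotes are gone, the comma inside the
# first field is indistinguishable from a delimiter.
blind = sample.replace('"', "").strip().split(",")

# Qualifier-aware parsing: quotechar plays the role of the text qualifier,
# so delimiters inside a qualified field are kept as data.
reader = csv.reader(io.StringIO(sample, newline=""), delimiter=",", quotechar='"')
qualified = next(reader)

print(blind)      # ['Smith', ' John', '42', 'New York']
print(qualified)  # ['Smith, John', '42', 'New York']
```

If the quotes in the file wrap whole fields, treating them as a qualifier keeps embedded delimiters intact, whereas stripping them beforehand can shift columns.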