基于数组变量的条件分割
我需要类似 T-SQL IN 语句的东西来根据数组变量(或类似的东西)过滤条件拆分中的记录,
我需要有一个可以过滤列的项目列表。
I need something like a T-SQL IN statement to filter records in a conditional split based on an array variable (or something similar)
I need to have a list of items that a column can be filtered on.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
正如 Filip 所指出的,表达式语言中没有 IN 运算符。我确实提出了一些选择,因为我认为这听起来是一个有趣的问题。
我的长篇分析在我的博客上:Filter list in SSIS
条件分割
如果您可以将值列表转换为分隔字符串,那么您可以使用
FINDSTRING
和当前值来确定它是否在列表中。这为我的测试场景提供了最佳吞吐量。(FINDSTRING(@[User::MyListStr], [MyColumn],1)) > 0
脚本任务
我曾假设在脚本任务中使用列表来确定成员身份将提供最佳性能,但我错了。 Row.IsInList = MyListObj.Contains(Row.MyColumn);
查找/缓存连接管理器
我想出的第三种方法是将列表转储到缓存连接管理器中,然后在查找任务中使用它。我认为这是最容易概念化和维护的,但缺乏性能。
结论
对于这个问题域,FINDSTRING 方法是最有效的,效率显着提高。其他两种方法的平均吞吐量始终在每毫秒 7 行以内。我确实发现有趣的是 FINDSTRING 方法的标准差波动如此之大。虽然这个盒子比较旧并且速度较慢,但在包执行期间没有进行大量的活动。
As Filip has indicated, there is no IN operator in the expression language. I did come up with some options though as I thought this sounded like an interesting problem.
My long analysis is on my blog: Filter list in SSIS
Conditional split
If you can transform your list of values into a delimited string, then you can use
FINDSTRING
and the current value to determine whether it's in the list. This provided the best throughput for my testing scenario.(FINDSTRING(@[User::MyListStr], [MyColumn],1)) > 0
Script task
I had assumed using a List in a script task to determine membership would provide the best performance but I was wrong.
Row.IsInList = MyListObj.Contains(Row.MyColumn);
Lookup/Cache Connection Manager
The third approach I had come up with was dumping the list into a Cached Connection Manager and then using that in a lookup task. I thought this was the easiest to conceptualize and maintain but the performance was lacking.
Conclusion
For this problem domain, the FINDSTRING approach was the most efficient, by a considerable margin. The other two approaches consistently averaged a throughput of within 7 rows per millisecond of each other. I did find it interesting that the standard deviation of the FINDSTRING approach fluctuated so much. While this box is older and slower, there was not a considerable amount of activity going on during the package executions.
SSIS 表达式运算符中没有 IN 运算符。并且没有类似的运算符。由于没有这样的运算符,因此您无法使用内置表达式和内置条件拆分来做到这一点。但是您可以执行以下操作之一:
There is no IN operator in SSIS expression operators. And there is no similar operator. Since there is no such operator, You can't do that with built-in expressions and built-in Conditional Split. But You can do one of the following: