基于数组变量的条件分割

发布于 2024-12-18 05:45:45 字数 79 浏览 2 评论 0原文

我需要类似 T-SQL IN 语句的东西来根据数组变量(或类似的东西)过滤条件拆分中的记录,

我需要有一个可以过滤列的项目列表。

I need something like a T-SQL IN statement to filter records in a conditional split based on an array variable (or something similar)

I need to have a list of items that a column can be filtered on.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

夏见 2024-12-25 05:45:45

正如 Filip 所指出的,表达式语言中没有 IN 运算符。我确实提出了一些选择,因为我认为这听起来是一个有趣的问题。

我的长篇分析在我的博客上:Filter list in SSIS

条件分割

如果您可以将值列表转换为分隔字符串,那么您可以使用 FINDSTRING 和当前值来确定它是否在列表中。这为我的测试场景提供了最佳吞吐量。 (FINDSTRING(@[User::MyListStr], [MyColumn],1)) > 0

脚本任务

我曾假设在脚本任务中使用列表来确定成员身份将提供最佳性能,但我错了。 Row.IsInList = MyListObj.Contains(Row.MyColumn);

查找/缓存连接管理器

我想出的第三种方法是将列表转储到缓存连接管理器中,然后在查找任务中使用它。我认为这是最容易概念化和维护的,但缺乏性能。

结论

对于这个问题域,FINDSTRING 方法是最有效的,效率显着提高。其他两种方法的平均吞吐量始终在每毫秒 7 行以内。我确实发现有趣的是 FINDSTRING 方法的标准差波动如此之大。虽然这个盒子比较旧并且速度较慢,但​​在包执行期间没有进行大量的活动。

As Filip has indicated, there is no IN operator in the expression language. I did come up with some options though as I thought this sounded like an interesting problem.

My long analysis is on my blog: Filter list in SSIS

Conditional split

If you can transform your list of values into a delimited string, then you can use FINDSTRING and the current value to determine whether it's in the list. This provided the best throughput for my testing scenario. (FINDSTRING(@[User::MyListStr], [MyColumn],1)) > 0

Script task

I had assumed using a List in a script task to determine membership would provide the best performance but I was wrong. Row.IsInList = MyListObj.Contains(Row.MyColumn);

Lookup/Cache Connection Manager

The third approach I had come up with was dumping the list into a Cached Connection Manager and then using that in a lookup task. I thought this was the easiest to conceptualize and maintain but the performance was lacking.

Conclusion

For this problem domain, the FINDSTRING approach was the most efficient, by a considerable margin. The other two approaches consistently averaged a throughput of within 7 rows per millisecond of each other. I did find it interesting that the standard deviation of the FINDSTRING approach fluctuated so much. While this box is older and slower, there was not a considerable amount of activity going on during the package executions.

同展鸳鸯锦 2024-12-25 05:45:45

SSIS 表达式运算符中没有 IN 运算符。并且没有类似的运算符。由于没有这样的运算符,因此您无法使用内置表达式和内置条件拆分来做到这一点。但是您可以执行以下操作之一:

  • 使用脚本转换来检查特定列是否在变量数组中,如果包含则添加附加列(标志),其值为 1,如果不包含则为 0;然后在脚本转换中添加的此标志上使用条件拆分,或者
  • 最好将变量放入数据库表中,然后使用查找或合并连接来检查行是否存在

There is no IN operator in SSIS expression operators. And there is no similar operator. Since there is no such operator, You can't do that with built-in expressions and built-in Conditional Split. But You can do one of the following:

  • use Script Transformation to check if particular column has that is in variable array, and add additional column (flag) with value 1 if it contains, 0 if not; then use Conditional Split on this flag added in Script Transformation, or
  • it's better to put variables in database table and then use Lookup or Merge Join to check if row exists
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文