Character-level profiling in SSIS or SQL Server

Posted 2024-11-25 18:42:03

I need to profile reference fields in a database to understand the patterns they are composed of. This needs to be done at a character level as there will be no spaces or punctuation in the reference fields.

As an example I'm looking for a solution that will take input like:

ABA1235DV6778
ABA1235DV6788
ABA2335DV6778

And suggest patterns like:

ABA\d\d35DV67\d\d

This will be used later to validate those reference fields, once I understand the permissible values in those columns.

I have looked at the profiling functionality in SSIS but it seems to lack granularity. Does anybody know how I can tune the profiling in SSIS 2008 or have an efficient function for SQL Server 2008 that can be used to achieve this?

Any help would be greatly appreciated,

Niall

Comments (1)

囚你心 2024-12-02 18:42:03

It's not really clear from your post exactly what logic you want to apply to the strings. I'm guessing you want to use some form of edit distance calculation to identify similar strings, then generate a regular expression that matches them all (http://www.regular-expressions.info/regexmagic.html). Those are typically tasks that would be implemented in an external program written in an appropriate language, not in SSIS or SQL Server. It is certainly not something you can do with pre-existing SSIS functionality.

So I would forget SSIS for now and work out the best way to implement your algorithm in .NET (or whatever other language you're comfortable with). Once you've done that you can decide whether to:

  • Write a self-contained executable and call it from an Execute Process task
  • Write a .NET DLL and use it in a Script Task, Script Component or CLR stored procedure
  • Write your own custom SSIS component
  • Write a complete program instead of using SSIS
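
As a rough illustration of the "external program" route, here is a minimal Python sketch of one possible approach. It is an assumption about what the question is after, not a known SSIS or SQL Server feature: it compares strings position by position, keeps a character as a literal when every sample agrees on it, and otherwise generalises to a character class. The function names (`char_class`, `infer_pattern`, `infer_patterns`) are made up for this example, and grouping by string length stands in for the edit distance clustering mentioned above.

```python
import re
import string
from collections import defaultdict


def char_class(chars):
    """Return the narrowest regex token that covers every character in `chars`."""
    if len(chars) == 1:
        # All samples agree at this position: keep the literal character.
        return re.escape(next(iter(chars)))
    if all(c in string.digits for c in chars):
        return r"\d"
    if all(c in string.ascii_uppercase for c in chars):
        return "[A-Z]"
    if all(c in string.ascii_letters for c in chars):
        return "[A-Za-z]"
    # Mixed content: fall back to "any character".
    return "."


def infer_pattern(values):
    """Infer one positional pattern from a list of equal-length strings."""
    if len({len(v) for v in values}) != 1:
        raise ValueError("all values must have the same length")
    # zip(*values) yields one tuple of characters per position.
    return "".join(char_class(set(column)) for column in zip(*values))


def infer_patterns(values):
    """Group values by length (a simplifying assumption) and infer a pattern per group."""
    groups = defaultdict(list)
    for v in values:
        groups[len(v)].append(v)
    return {length: infer_pattern(vs) for length, vs in sorted(groups.items())}


samples = ["ABA1235DV6778", "ABA1235DV6788", "ABA2335DV6778"]
pattern = infer_pattern(samples)
print(pattern)

# The inferred pattern can later be used to validate new values.
print(bool(re.fullmatch(pattern, "ABA9935DV6748")))
print(bool(re.fullmatch(pattern, "ABX1235DV6778")))
```

For the three sample values in the question this produces `ABA\d\d35DV67\d8`, which is slightly stricter than the pattern suggested there, because the final character is `8` in every sample. With more varied data the literal positions would relax into classes on their own. Logic like this could then be wrapped in any of the options listed above.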