关系数据到平面文件

发布于 2024-08-14 04:22:54 字数 1037 浏览 8 评论 0原文

我希望您能帮助找到一个问题的答案,该问题将成为工作中反复出现的主题。这涉及将 RDBMS 表中的数据非规范化为具有跨列重复组(共享域和含义)的平面文件格式。不幸的是,这是不可避免的。

这是我需要的转换的一个非常简单的示例:

      TABLE A                               TABLE B
-------------------    1 -> MANY     ----------------------------
  A_KEY    FIELD_A                     B_KEY    A_KEY    FIELD_B
A_KEY_01 A_VALUE_01                  B_KEY_01 A_KEY_01 B_VALUE_01
A_KEY_02 A_VALUE_02                  B_KEY_02 A_KEY_01 B_VALUE_02
                                     B_KEY_03 A_KEY_02 B_VALUE_03

这将变成:

A_KEY       FIELD_A      B_KEY1     FIELD_B1     B_KEY2     FIELD_B2
A_KEY_01   A_VALUE_01   B_KEY_01   B_VALUE_01   B_KEY_02   B_VALUE_02
A_KEY_02   A_VALUE_02   B_KEY_03   B_VALUE_03

TABLE A 中的每个条目将在输出平面文件中包含一行,其中 TABLE B 中的每个相关字段有一列。输出文件中的列可以具有从TABLE B 获取的字段的空值。

我意识到这将创建一个非常的文件,但这是一个要求。我已经看过MapForce和Apatar,但我认为这个问题太奇怪了或者我无法正确使用它们。

我的问题:是否已经有一种工具可以完成此任务,或者我应该从头开始开发一个工具(我不想重新发明轮子)?

I hope you can help find an answer to a problem that will become a recurring theme at work. This involves denormalising data from RDBMS tables to flat file formats with repeating groups (sharing domain and meaning) across columns. Unfortunately this is unavoidable.

Here's a very simplified example of the transformation I'd require:

      TABLE A                               TABLE B
-------------------    1 -> MANY     ----------------------------
  A_KEY    FIELD_A                     B_KEY    A_KEY    FIELD_B
A_KEY_01 A_VALUE_01                  B_KEY_01 A_KEY_01 B_VALUE_01
A_KEY_02 A_VALUE_02                  B_KEY_02 A_KEY_01 B_VALUE_02
                                     B_KEY_03 A_KEY_02 B_VALUE_03

This will become:

A_KEY       FIELD_A      B_KEY1     FIELD_B1     B_KEY2     FIELD_B2
A_KEY_01   A_VALUE_01   B_KEY_01   B_VALUE_01   B_KEY_02   B_VALUE_02
A_KEY_02   A_VALUE_02   B_KEY_03   B_VALUE_03

Each entry from TABLE A will have one row in the output flat file with one column per related field from TABLE B. Columns in the output file can have empty values for fields obtained from TABLE B.

I realise this will create an extremely wide file, but this is a requirement. I've had a look at MapForce and Apatar, but I think this problem is too bizarre or I can't use them correctly.

My question: is there already a tool that will accomplish this or should I develop one from scratch (I don't want to reinvent the wheel)?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

痕至 2024-08-21 04:22:54

我很确定你无法用简单的 SQL 解决这个问题,但根据你的 RDBMS,可能可以创建一个存储过程或类似的东西。否则,用脚本语言来做这件事相当容易。您使用哪种技术?

I'm pretty sure you can't solve this in plain SQL, but depending on your RDBMS, it may be possible to create a stored procedure or some such thing. Otherwise it's a fairly easy thing to do in a scripting language. Which technology are you using?

菊凝晚露 2024-08-21 04:22:54

感谢您的帮助。事实证明,关系是ONE -> 。 MAX 为 3 并且此约束不会更改,因为数据现在是静态的,因此以下常规 SQL 可以工作:

select A.A_KEY, A.FIELD_A, B.B_KEY, B.FIELD_B, B2.B_KEY, B2.FIELD_B, B3.B_KEY,
B3.FIELD_B 

from 

 A left join B on (A.A_KEY = B.A_KEY)
left join B B2 on (A.A_KEY = B2.A_KEY and B2.B_KEY != B.B_KEY)
left join B B3 on (A.A_KEY = B3.A_KEY and B3.B_KEY != B.B_KEY 
                   and B3.B_KEY != B2.B_KEY)

group by A.A_KEY
order by A.A_KEY

Thanks for all your help. As it turns out the relationship is ONE -> MAX of 3 and this constraint will not change as the data is now static so the following run-of-the-mill SQL works:

select A.A_KEY, A.FIELD_A, B.B_KEY, B.FIELD_B, B2.B_KEY, B2.FIELD_B, B3.B_KEY,
B3.FIELD_B 

from 

 A left join B on (A.A_KEY = B.A_KEY)
left join B B2 on (A.A_KEY = B2.A_KEY and B2.B_KEY != B.B_KEY)
left join B B3 on (A.A_KEY = B3.A_KEY and B3.B_KEY != B.B_KEY 
                   and B3.B_KEY != B2.B_KEY)

group by A.A_KEY
order by A.A_KEY
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文