在 Pentaho Kettle 中保留自动递增的列
在 Pentaho Kettle 中,假设想要在两个相同的表 A 和 B 之间复制数据。如果有一个自动递增的列,是否有办法在将数据从 A 复制到 B 时保留该列的值?自动递增列数据库的行为是特定的还是 Kettle 已经找到了通用的解决方案?
In Pentaho Kettle, say one wants to copy data between two identical tables, A and B. If there is a column that is auto incremented is there a way to preserve the value of that column when copying the data from A to B? Is the behaviour of the auto incremented columns database specific or has Kettle found a general solution for this?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果您有两个表,其列定义相同,并且两个表中都有一个自动递增列,然后您用数据填充其中一个表,则将增加表 1 中的数字。现在,按顺序要将这些行插入到镜像表中,您必须按照插入顺序从第一个表中提取它们,以便可以按照相同的顺序将它们插入到镜像表中。然后,也只有到那时,自动递增的数字才会发生变化。也就是说,这将被视为脆弱的设计。
另一种方法是使表在列数据类型方面相同,但镜像表中没有自动递增列。只需将镜像表中的该列设置为整数即可,没有任何自动增量功能。
If you have two tables, identically defined with regard to their columns, and there's an auto-incrementing column in both tables, and then you populate one of the tables with data, you will be incrementing the number in table 1. Now, in order to insert those rows into the mirror table, you would have to extract them from the first table in the same order that they were inserted, so that they can be inserted into the mirror table in the identical order. Then, and only then, would the auto-incrementing numbers jibe. That said, this would be considered fragile design.
The alternative would be to make the tables identical with regard to column datatypes, but not have an auto-incrementing column in the mirror table. Just make that column in the mirror table an integer, without any auto-increment capability.
蒂姆的回答是有道理的。我通常会做同样的事情:使表 B 与表 A 相同,除了使表 B 的 PK 不是自动递增列之外。
(尝试总是以相同的顺序插入数据是一个坏主意。正如蒂姆提到的那样,它肯定很脆弱。但实际上,它会比这更糟糕。首先,你不能进行批量插入。你需要另外,您通常无法确定下一个使用的值是否是下一个整数,而选择下一个值是 DBMS 的责任。在很多情况下,插入的下一个值不会是下一个更大的整数。 。)
但一个重要的细节是 DBMS 处理自动递增字段的方式不同。在许多情况下,自动递增字段的行为是在插入的值为 NULL 时提供一个值,但在提供时接受显式值。在其他情况下,数据库将拒绝将值插入自动递增字段的尝试。
因此,如果您面临前一种情况,那么让表 A 和表 B 完全相同确实不是问题。您的 ETL 作业会将值插入表 B,并且 PK 列的自动递增性质将被忽略。
Tim's answer is reasonable. I would normally do the same: Make table B identical to table A except for making table B's PK not an auto-incrementing column.
(Attempting to always insert the data in the same order would be a bad idea. It would certainly be fragile, as Tim mentioned. But really, it would be worse than that. First, you could not do bulk inserts. You would need to commit each row individually. Also, you normally cannot be certain that the next value used will be the next integer. It's the DBMSs responsibility to pick the next value. There are lots of situations where the next value inserted will not be the next higher integer.)
But an important detail is that DBMSs handle auto-incrementing fields differently. In many cases the behavior of the auto-incrementing field is to provide a value if the inserted value is NULL but to accept an explicit value if one is provided. In other cases the database will reject attempts to insert a value into an auto-incrementing field.
So if you are facing the former case then it's really not a problem to leave table A and table B as completely identical. Your ETL job will insert values into table B, and the auto-incrementing nature of the PK column will just be ignored.