How to check for duplicate records between a source file and a target table in DataStage

Posted on 2025-02-11 18:57:33

I want to do two types of duplicate checking:

  1. Whether a file with the same name has already been loaded.

For instance, file A is loaded into the target table. If we receive file A again in a subsequent run, the sequence should abort because the file has already been loaded.

  2. Whether records that have already been loaded arrive again in a different file.

For instance, file A is already in the target table, and next time we receive file B. Records in file B that were already loaded into the target from file A should not be loaded, and the job should abort.

Can anyone help me with this scenario?

Thanks,
Venkat.

Comments (1)

ゃ人海孤独症 2025-02-18 18:57:33

You need to keep a record of which file names have been loaded, typically by moving each file to an archive (or "processed") directory once it has been loaded. You can then run a simple ls command with the incoming file name against that directory to determine whether it already exists there, which solves your first requirement.
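
As a rough illustration of the file-name check, here is a minimal Python sketch of the same test that ls would perform. It assumes loaded files are kept in the archive directory under their original names; the function names `already_loaded` and `abort_code` are hypothetical helpers, not part of DataStage.

```python
from pathlib import Path


def already_loaded(incoming_file: str, archive_dir: str) -> bool:
    """Return True if a file with the same name exists in the archive directory."""
    # Only the file name is compared, not the content or the source path.
    return (Path(archive_dir) / Path(incoming_file).name).is_file()


def abort_code(incoming_file: str, archive_dir: str) -> int:
    """Return 1 (abort) if the file was loaded before, otherwise 0 (proceed)."""
    return 1 if already_loaded(incoming_file, archive_dir) else 0
```

A wrapper script could pass the result of `abort_code` to `sys.exit`, so that whatever step runs the script sees a non-zero exit status and can stop the sequence. How that status is wired into the sequence depends on your job design.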
Determining whether file B has identical records to file A is a more complex question. Can you use a diff command? Otherwise you may need to do something cleverer. Even before that, how do you establish that file A is the one against which you have to compare? If there are key values, you may be able to check against the target table.
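
If the records do carry key values, the check against the target table could look something like the following Python sketch. It assumes the incoming file is CSV with a header row, that the key lives in a single column (named `id` in the usage example purely for illustration), and that the keys already in the target table have been fetched into a set by some other means. The function name `duplicate_keys` is hypothetical.

```python
import csv
import io
from typing import List, Set


def duplicate_keys(csv_text: str, key_column: str, loaded_keys: Set[str]) -> List[str]:
    """Return the sorted key values in the incoming file that are already in the target."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Collect each incoming key that also appears among the already-loaded keys.
    found = {row[key_column] for row in reader if row[key_column] in loaded_keys}
    return sorted(found)


# Usage: file B shares key "1" with what was loaded earlier from file A.
file_b = "id,name\n1,alpha\n3,gamma\n"
keys_in_target = {"1", "2"}
overlap = duplicate_keys(file_b, "id", keys_in_target)
```

A non-empty `overlap` would be the signal to abort the job. Comparing keys rather than whole files sidesteps the question of which earlier file to diff against, at the cost of needing a reliable key.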
