Accessing the data file name from within a SQL*Loader control file

Posted on 2024-08-09 23:52:48

How do I access the input data file name from within a SQL*Loader control file, so that I can insert it into the table along with the data from the input file?

For example, say I have the following control file:

LOAD DATA

APPEND
INTO TABLE STG_AM02_BA_RAW
WHEN (1:2) = 'DT'
(
        SUBSCRIBER_NO                   POSITION(11:18)CHAR, 
        ACCOUNT_NO                      POSITION(19:32)CHAR, 
        SUBSCRIBER_NAME                 POSITION(33:92)CHAR
)

I want to do something like:

LOAD DATA

APPEND
INTO TABLE STG_AM02_BA_RAW
WHEN (1:2) = 'DT'
(
        SUBSCRIBER_NO                   POSITION(11:18)CHAR, 
        ACCOUNT_NO                      POSITION(19:32)CHAR, 
        SUBSCRIBER_NAME                 POSITION(33:92)CHAR, 
        INPUTFILE                       INPUTFILENAME()CHAR
)

Assume that I have neither access nor permission to edit the shell script that will invoke SQL*Loader with this control file.


Comments (4)

夜访吸血鬼 2024-08-16 23:52:48

As of 11g, it isn't possible to access the filename directly from the SQL*Loader control file.

You basically have to handle it from your scripting environment.

If you're not able to modify the loading script, perhaps you could add a header record to the datafile?

It looks like you have a record type field in position 1:2 - can you modify the datafile creation to include a filename record type?

For example, an "FN" record type:

FN                ...        inputfile.txt
DT     12345678XXX...XXXXXYYYYYYYYYYYYYYYY
DT     12345678XXX...XXXXXYYYYYYYYYYYYYYYY
DT     12345678XXX...XXXXXYYYYYYYYYYYYYYYY
DT     12345678XXX...XXXXXYYYYYYYYYYYYYYYY
DT     12345678XXX...XXXXXYYYYYYYYYYYYYYYY

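Your control file could then change to:

LOAD DATA

APPEND
-- Each WHEN clause needs its own INTO TABLE clause.
INTO TABLE STG_AM02_BA_RAW
WHEN (1:2) = 'FN'
(
        -- Start at position 3 to skip the 2-character record type.
        INPUTFILE                       POSITION(3:92)CHAR
)
INTO TABLE STG_AM02_BA_RAW
WHEN (1:2) = 'DT'
(
        SUBSCRIBER_NO                   POSITION(11:18)CHAR, 
        ACCOUNT_NO                      POSITION(19:32)CHAR, 
        SUBSCRIBER_NAME                 POSITION(33:92)CHAR
)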

It all depends on whether you can update the data file.

For example, you could prepend the header record like this:

echo "FNinputfile.txt" > header.txt
cat header.txt inputfile.txt > newinputfile.txt

If you need to reference the filename against each data row, you can load the data into multiple staging tables:

LOAD DATA
-- A load method given here applies to every INTO TABLE clause below.
TRUNCATE
INTO TABLE STAGE_FILENAME
WHEN (1:2) = 'FN'
(
        -- Start at position 3 to skip the 2-character record type.
        INPUTFILE                       POSITION(3:92)CHAR
)
INTO TABLE STAGE_DATA
WHEN (1:2) = 'DT'
(
        SUBSCRIBER_NO                   POSITION(11:18)CHAR, 
        ACCOUNT_NO                      POSITION(19:32)CHAR, 
        SUBSCRIBER_NAME                 POSITION(33:92)CHAR
)

... and join them together using SQL:

insert into STG_AM02_BA_RAW
    (
    subscriber_no,
    account_no,
    subscriber_name,
    inputfile
    )
select
    d.subscriber_no,
    d.account_no,
    d.subscriber_name,
    f.inputfile
from
    stage_data d
    cross join stage_filename f;  -- stage_filename holds the single 'FN' row

This process falls over if you have concurrent loads, because every load truncates and refills the same two staging tables.

You said in the comments that you can change the data file. Could you get the file changed so that the filename is appended to each record? If so, the issue goes away. You'd just have to add one more field after SUBSCRIBER_NAME:

        -- The end position 192 is an assumption; use the padded width of the filename in your file.
        INPUTFILE                       POSITION(93:192)CHAR

爱,才寂寞 2024-08-16 23:52:48

I don't think there is a way to do this in the circumstances you specified. AFAIK there is no way to properly reference the filename in the "data" part.

A couple of ideas for a workaround:

  • Update the newly inserted records with a separate SQL statement. You might be able to build the statement from the batch file that invokes SQL*Loader.
  • Modify the data file to include the filename (again, might be done from the batch file).
  • Have the batch file build the control file to include the filename as a constant, so you could have something like

    INPUTFILE CONSTANT "my_data.dat"
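
The last idea could be sketched in shell roughly as follows. This is only an illustration and assumes you can run a step before SQL*Loader is invoked; the function name `make_ctl` and the generated file name are hypothetical, while the table and field definitions are copied from the question.

```shell
#!/bin/sh
# make_ctl (hypothetical helper): write a control file that stamps every
# loaded row with the base name of the data file via a CONSTANT field.
# Usage: make_ctl <data file> <control file to generate>
make_ctl() {
    datafile=$1
    ctlfile=$2
    # Keep only the file name, without the directory part.
    name=$(basename "$datafile")
    # Unquoted heredoc delimiter, so $datafile and $name are expanded.
    cat > "$ctlfile" <<EOF
LOAD DATA
INFILE '$datafile'
APPEND
INTO TABLE STG_AM02_BA_RAW
WHEN (1:2) = 'DT'
(
        SUBSCRIBER_NO                   POSITION(11:18) CHAR,
        ACCOUNT_NO                      POSITION(19:32) CHAR,
        SUBSCRIBER_NAME                 POSITION(33:92) CHAR,
        INPUTFILE                       CONSTANT "$name"
)
EOF
}

# Example call (paths are placeholders):
#   make_ctl /data/in/my_data.dat load_generated.ctl
```

The sketch does not escape quote characters, so it assumes file names contain neither single nor double quotes.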

Hope this helps.

别在捏我脸啦 2024-08-16 23:52:48

The easy way to tackle this issue is to add an extra column containing the filename to the end of each record, and to map that column position to the field in the control file.
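
As a rough sketch of that preprocessing step, assuming the fixed-width layout from the question (data ends at position 92) and that you control how the data file is produced, something like the following could append the name. The function name `append_filename` is hypothetical.

```shell
#!/bin/sh
# append_filename (hypothetical helper): copy a fixed-width data file and
# append the file's base name to every record, starting at column 93.
# Usage: append_filename <input data file> <output data file>
append_filename() {
    in=$1
    out=$2
    name=$(basename "$in")
    # Cut each record to at most 92 characters, pad it with spaces to
    # exactly 92, then append the name so it always starts at column 93.
    awk -v name="$name" '{ printf "%-92s%s\n", substr($0, 1, 92), name }' "$in" > "$out"
}

# Example call (file names are placeholders):
#   append_filename inputfile.txt newinputfile.txt
```

The control file would then need one extra field whose POSITION starts at 93. Note that this sketch appends the name to every line, including any records that are not of type 'DT'.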

[旋木] 2024-08-16 23:52:48

I know this is super old, but I still got here in 2023, and there's still no answer, so here's how I did it.

I modified my .ctl file to use a generic file name for input:

INFILE '[path to data files]\myfile.dat'

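Then I created a batch file with a FOR loop to step through all the data file names:

FOR %%X in ("[path to data files]\*.[your extension]") DO (
REM %%~fX expands to the full path of the current file, including its extension.
copy /Y "%%~fX" [path to batch files]\myfile.dat
REM CALL returns control to this loop once the child batch file finishes.
call [path to your batch file that runs sqlldr]\load_my_data.bat
)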

I loaded 1218 data files and a little over 1.25 million rows in about 10-15 minutes.
