Oracle由数据定义的解析函数窗口
我有一个表,它表示从特定文本文件格式读取的数据的逐行转储。每行可以代表“主”行或“详细”行,通过 rec_type
代码指示。我想编写一个查询来获取“主”行以及相关的详细信息行。我想出了一些可以完成这项工作的方法,但它看起来有点老套,并且对更好的方法(如果有的话)感兴趣。
CREATE TABLE mdtest
(rec_seq NUMBER PRIMARY KEY
,rec_type VARCHAR2(3) NOT NULL
,rec_data VARCHAR2(100) NOT NULL);
INSERT INTO mdtest VALUES (1, '100', 'Bill Jones');
INSERT INTO mdtest VALUES (2, '200', '20080115,100.25');
INSERT INTO mdtest VALUES (3, '100', 'John Smith');
INSERT INTO mdtest VALUES (4, '200', '20090701,80.95');
INSERT INTO mdtest VALUES (5, '200', '20091231,110.35');
期望的结果:
SEQ_EMP EMP_NAME SEQ_DATA EMP_DATA
======= ========== ======== ===============
1 Bill Jones 2 20080115,100.25
3 John Smith 4 20090701,80.95
3 John Smith 5 20091231,110.35
假设:
- 按照rec_seq的顺序处理记录,
- 第一个记录类型是“
100
”, - 每个“
100
”记录有1个或多个“200
” code>" 记录如下
注意:这是针对 Oracle 9i 的,但今年我们应该升级到 11g R1。
I have a table which represents a line-by-line dump of the data read from a particular text file format. Each line may represent a "master" or a "detail" line, indicated via rec_type
code. I'd like to write a query that gets the "master" lines alongside the associated detail lines. I've come up with something that does the job, but it seems a bit hackish and am interested in better ways if any.
CREATE TABLE mdtest
(rec_seq NUMBER PRIMARY KEY
,rec_type VARCHAR2(3) NOT NULL
,rec_data VARCHAR2(100) NOT NULL);
INSERT INTO mdtest VALUES (1, '100', 'Bill Jones');
INSERT INTO mdtest VALUES (2, '200', '20080115,100.25');
INSERT INTO mdtest VALUES (3, '100', 'John Smith');
INSERT INTO mdtest VALUES (4, '200', '20090701,80.95');
INSERT INTO mdtest VALUES (5, '200', '20091231,110.35');
Desired Result:
SEQ_EMP EMP_NAME SEQ_DATA EMP_DATA
======= ========== ======== ===============
1 Bill Jones 2 20080115,100.25
3 John Smith 4 20090701,80.95
3 John Smith 5 20091231,110.35
Assumptions:
- records are processed in sequence of rec_seq
- first record type is a "
100
" - each "
100
" record has 1 or more "200
" records following
Note: this is for Oracle 9i, however we should be upgrading to 11g R1 this year.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
这是我到目前为止所做的:
如您所见,我使用 MAX 报告分析函数,窗口从集合的顶部开始一直到当前行,以获取当前“200”的相关“100”记录“ 记录;然后在外部查询中我将丢弃不需要的“100”记录。
为了获得 emp_name,我必须将数据附加到 rec_seq 中,以便 MAX 函数仍然选择正确的头记录;然后在外部查询中我砍掉了rec_seq。
我尝试过其他分析函数和语法,包括 FIRST_VALUE 和 KEEP 语法,但这些似乎都没有使这项工作变得更简单;困难在于窗口是由 rec_type 的值定义的,而不是某个常量偏移量。
Here's what I have so far:
As you can see, I'm using the MAX reporting analytic function with a window starting from the top of the set down to the current row, to get the relevant "100" record for the current "200" record; then in the outer query I'm discarding the unneeded "100" records.
To get emp_name, I've had to append the rec_seq with the data so that the MAX function still picks the correct header record; then in the outer query I chop the rec_seq off.
I've played with other analytic functions and syntax including FIRST_VALUE and the KEEP syntax, but neither of these seem to make this job simpler; the difficulty is that the window is defined by the value of rec_type instead of being some constant offset.
为了简单起见,您认为值得在处理之前将每种记录类型加载到单独的导入表中吗?
这可能有助于提供更易于维护的解决方案,并让您可以单独解析和验证通信分隔的 EMP_DATA 字段。
只是一个想法 - 抱歉,如果您只是在寻找分析解决方案。
In a desire for simplicity, do you think it's worth loading each record type into a separate import table before processing?
This might lend itself to a more maintainable solution and will let you parse out and validate the comm-separated EMP_DATA field separately.
Just a thought - sorry if you're only looking for an Analytics solution.