如何找出哪些行已被删除?
带有 SE 引擎的 Informix-SQL 7.32:
我有一位客户从 SE 表中删除了几行。 (我没有使用事务日志记录或审计)。该表有一个序列列。我想创建一个 Ace 报告来打印缺少的序列列。我尝试了以下快速而肮脏的报告,但它不起作用!..你能建议更好的方法吗?
define
variable next_id integer
end
select tbl_id
from tbl
order by tbl_id {I'm ordering tbl_id because all the rows are periodically re-clustered}
end {by an fk_id in order to group all rows belonging to the same customer}
format
on every row
let next_id = tbl_id + 1
after group of tbl_id
if tbl_id + 1 <> next_id then
print column 1, tbl_id + 1 using "######"
end
或者创建一个临时表,其中包含 1 到 5000 之间的连续数字的 INT 列,并执行如下所示的选择语句:
SELECT tbl_id
FROM tbl
WHERE tbl_id NOT IN
(SELECT tmp_int
FROM tmp);
或使用 HAVING、OUTER 等的选择语句。
Informix-SQL 7.32 with SE engine:
I have a customer who deleted several rows from an SE table. (I'm not using transaction logging or audit). The table has a serial column. I would like to create an Ace report to print the missing serial columns. I tried the following quick and dirty report, but it didn't work!.. can you suggest a better way?
define
variable next_id integer
end
select tbl_id
from tbl
order by tbl_id {I'm ordering tbl_id because all the rows are periodically re-clustered}
end {by an fk_id in order to group all rows belonging to the same customer}
format
on every row
let next_id = tbl_id + 1
after group of tbl_id
if tbl_id + 1 <> next_id then
print column 1, tbl_id + 1 using "######"
end
or maybe create a temporary table with an INT column containing sequential numbers from 1 to 5000 and do a select statement like:
SELECT tbl_id
FROM tbl
WHERE tbl_id NOT IN
(SELECT tmp_int
FROM tmp);
or a select statement with HAVING, OUTER, etc.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
由于这是 SE,我们必须使用老式表示法,而不是 SQL-92 JOIN 表示法。
下面的四个查询是两个可能答案的共同基础:
表 x3 和 x4 现在(分别)包含 tbl_id 的值,这些值没有直接后继也没有直接前驱。每个值都是 tbl_id 值的连续范围的开始或结束。在 IDS(而不是 SE)中,您可以使用标准 SQL OUTER JOIN 表示法,并在两个查询(而不是四个)中过滤联接结果;在东南部你没有那么奢侈。
具有二次(或更糟)行为的非解决方案
现在您只需弄清楚如何组合两个表:
此查询的主要部分出现两次并生成范围开始小于或等于的所有行对范围的末尾(等于允许“范围”由一个值组成,两侧都有已删除的行)。 NOT EXISTS 子句确保不存在具有相同起始值和较小最终值的其他对。
如果数据中有很多间隙,临时表上的查询可能不会很快;如果间隙很少,那么应该没问题。
最后一个查询在范围数量方面表现出二次行为。当我只有十几个范围时,这很好(亚秒响应时间);当我有 1,200 个范围时,情况不太好 - 没有在合理的时间内完成。
避免二次行为
既然二次行为不好,我们如何重新表述查询...
对于范围的每个低端,找到大于或等于低端的范围的最小高端,或者在 SQL 中:
请注意,这可以轻松地合并到 ACE 报告中。它为您提供了存在的数字范围,而不是不存在的数字范围。您可以弄清楚如何生成另一个。
计时
在具有 22100 行且数据中包含 1200 个间隙的表上,该方法表现得相当好。在基准模式 (-B) 下使用(我的)SQLCMD 程序,并将 SELECT 输出发送到 /dev/null,并使用在 MacOS X 10.6.7(MacBook Pro、Intel Core 2 Duo,3 GHz 和4 GB RAM),结果是:
可以;不到几秒钟的处理时间。
Since this is SE, we have to use the old-fashioned notation, not the SQL-92 JOIN notations.
The four queries that follow are a common foundation for the two possible answers:
The tables x3 and x4 now contain (respectively) the values for tbl_id that have no immediate successor and no immediate predecessor. Each value is the start or end of a contiguous ranges of tbl_id values. In IDS instead of SE, you could use the standard SQL OUTER JOIN notation and filter the results of the join in two queries instead of four; you do not have that luxury in SE.
Non-solution with quadratic (or worse) behaviour
Now you just have to work out how to combine the two tables:
The main part of this query occurs twice and generates all pairs of rows where the start of the range is less than or equal to the end of the range (equal allows for 'ranges' consisting of one value on its own, with deleted rows on either side). The NOT EXISTS clause ensures that there is no other pair with the same start value and a smaller end value.
The queries on the temp tables may not be very fast if there are many gaps in the data; if there are very few gaps, then they should be OK.
The last query exhibits quadratic behaviour in terms of the number of ranges. When I had just a dozen ranges, it was fine (sub-second response time); when I had 1,200 ranges, it was not OK - did not complete in a reasonable time.
Avoiding quadratic behaviour
Since quadratic behaviour is not good, how can we rephrase the query...
For each low end of the range, find the minimum high end of a range that is greater than or equal to the low end, or in SQL:
Note that this can easily be incorporated into an ACE report. It gives you the ranges of number present - not those which are absent. You can work out how to generate the other.
Timing
That performed pretty well on a table with 22100 rows containing 1200 gaps in the data. Using (my) SQLCMD program in its benchmark mode (-B), and sending SELECT output to /dev/null, and using IDS 11.70.FC1 run on MacOS X 10.6.7 (MacBook Pro, Intel Core 2 Duo at 3 GHz and 4 GB RAM), the results were:
It will do; less than a couple of seconds processing time.
请参阅: 是否有 SQL生成给定范围的序列号的函数?以获得更简单且更高效的解决方案。
see: Is there an SQL function which generates a given range of sequential numbers? for a simpler and more engine-efficient solution.