通过检查上一行中的值来获取最新行

发布于 2025-01-12 14:59:53 字数 900 浏览 0 评论 0原文

我想获取处于失败状态的最新行作业，但同一作业的上一行应该是成功。我不想获得持续失败的工作记录。我正在使用 MySQL 版本 5.6.10

源表 job_status：

Job_name start_date     end_date            status
A        8/3/2022 12 PM 8/3/2022 1.30 PM    failed
B        8/3/2022 12 PM 8/3/2022 1.00 PM    failed
C        8/3/2022  3 PM 8/3/2022 3.10 PM    success
B        8/3/2022  3 PM 8/3/2022 3.30 PM    failed
C        8/3/2022  3 PM 8/3/2022 3.20 PM    success
A        8/3/2022  2 PM 8/3/2022 2.10 PM    success
A        8/3/2022  3 PM 8/3/2022 3.20 PM    failed

所需输出：

Job_name start_date     end_date            status
A        8/3/2022  3 PM 8/3/2022 3.20 PM    failed

我正在使用下面的查询，不确定如何检查以前的运行。

select distinct(*) 
from job_status 
where status = 'failed' 
order by start_date desc

原文

I would like to get latest row of job which is in failed state, but previous row for the same job should be success. I don't want to get records of job which is continuously failing. I am using MySQL version 5.6.10

Source Table job_status:

Job_name start_date     end_date            status
A        8/3/2022 12 PM 8/3/2022 1.30 PM    failed
B        8/3/2022 12 PM 8/3/2022 1.00 PM    failed
C        8/3/2022  3 PM 8/3/2022 3.10 PM    success
B        8/3/2022  3 PM 8/3/2022 3.30 PM    failed
C        8/3/2022  3 PM 8/3/2022 3.20 PM    success
A        8/3/2022  2 PM 8/3/2022 2.10 PM    success
A        8/3/2022  3 PM 8/3/2022 3.20 PM    failed

Desired Output:

Job_name start_date     end_date            status
A        8/3/2022  3 PM 8/3/2022 3.20 PM    failed

I am using below query, not sure how to check for previous run.

select distinct(*) 
from job_status 
where status = 'failed' 
order by start_date desc

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

最后的乘客 2025-01-19 14:59:53

该问题的任何解决方案都将取决于一些非常重要的假设，应该在问题中解决这些假设：

状态只能
通过“最新”从失败进展到成功，您的意思是最大 end_date 的行
end_date 属性是 DATETIME 类型
end_date、job_status 和 job_name 的组合是唯一的

如果其中任何一个不是这种情况，那么在 SQL 中解决方案是不可能的（并且在没有其他信息的情况下使用过程语言仍然非常困难）。

，这不是一个有效的函数，尽管它会受益于上面 4 中描述的组合上按 job_status、job_name、end_date 的顺序存在的索引（否则这对于索引来说是一个糟糕的选择）......

SELECT c.*
FROM (
    SELECT a.job_name, MAX(b.end_date) AS last_fail
    FROM job_status a
    JOIN job_status b
    ON a.job_name=b.job_name
    WHERE a.status='success'
    AND b.status='failed'
    GROUP BY a.job_name
) ilv 
JOIN job_status c
ON ilv.job_name=c.job_name
AND ilv.last_fail=c.end_date

同样这只是分组最大问题的一个变体，这里仍然每周被问一次，并且有手册中自己的章节。

根据数据的分布和行数，更有效的解决方案可能是读取表中的每一行，按 job_name、job_status（降序）和 end_date 排序，然后在每次“成功”后立即返回条目，其中job_name 与前一行匹配。这应该可以通过类似......

SELECT job_name, job_status, start_date, end_date
FROM (
  SELECT @prev_status AS prev_status, @prev_job_name AS prev_job_name,
  a.*,
  @prev_status:=a.status AS currstatus,
  @prev_job_name:=a.job_name AS currjobname
  FROM (
   SELECT b.*
   FROM job_status b
   ORDER BY b.job_name, b.status DESC, d.end_date DESC
  ) a
) c
WHERE c.prev_status-'success' 
AND c.job_status='failed'
AND c.prev_job_name=job_name

Any solution to the problem is going to depend on some very important assumptions which should have been addressed in the question:

the state can only progress from failed to success
by "latest" you mean the row the greatest end_date
the end_date attribute is a DATETIME type
the combination of end_date, job_status and job_name is unique

If any of these are not the case then the solution is impossible in SQL (and still very difficult with a procedural language in the absence of other information).

Again this is not an efficient function, although it would benefit from the existence of an index on the combination described in 4 above in the order job_status, job_name, end_date (which would otherwise be a poor choice for an index)....

SELECT c.*
FROM (
    SELECT a.job_name, MAX(b.end_date) AS last_fail
    FROM job_status a
    JOIN job_status b
    ON a.job_name=b.job_name
    WHERE a.status='success'
    AND b.status='failed'
    GROUP BY a.job_name
) ilv 
JOIN job_status c
ON ilv.job_name=c.job_name
AND ilv.last_fail=c.end_date

Really it's just a variant on the groupwise maximum question which still gets asked about once a week here and has its own chapter in the manual.

Depending on the distribution of the data, and the number of rows, a more efficient solution would probably be to read each row in the table, sorted by job_name, job_status (descending) and end_date then return the entries immediately after each "success" where the job_name matches the preceeding row. That should be possible with something like.....

SELECT job_name, job_status, start_date, end_date
FROM (
  SELECT @prev_status AS prev_status, @prev_job_name AS prev_job_name,
  a.*,
  @prev_status:=a.status AS currstatus,
  @prev_job_name:=a.job_name AS currjobname
  FROM (
   SELECT b.*
   FROM job_status b
   ORDER BY b.job_name, b.status DESC, d.end_date DESC
  ) a
) c
WHERE c.prev_status-'success' 
AND c.job_status='failed'
AND c.prev_job_name=job_name

回复收藏 0 原文

萌︼了一个春 2025-01-19 14:59:53

就性能而言，这不会赢得任何奖牌，但是......

select *
from t
where status = 'failed'
and coalesce((
    -- previous status
    -- coalesce is needed when there is no previous row
    select status
    from t as x
    where x.job_name = t.job_name and x.start_date < t.start_date
    order by x.start_date desc
    limit 1
), 'success') = 'success'
and not exists (
    -- no next record exists
    select 1
    from t as x
    where x.job_name = t.job_name and x.start_date > t.start_date
)

This won't win any medals as far as performance is concerned but...

select *
from t
where status = 'failed'
and coalesce((
    -- previous status
    -- coalesce is needed when there is no previous row
    select status
    from t as x
    where x.job_name = t.job_name and x.start_date < t.start_date
    order by x.start_date desc
    limit 1
), 'success') = 'success'
and not exists (
    -- no next record exists
    select 1
    from t as x
    where x.job_name = t.job_name and x.start_date > t.start_date
)

回复收藏 0 原文

羞稚 2025-01-19 14:59:53

未经测试，但您应该能够使用单个相关的 exists 子句来表达这一点：

select * 
from job_status s
where s.status='failed' 
and exists (
  select * from job_status s2 
    where s2.job_name = s.job_name 
      and s2.status = 'success' 
      and s2.end_date < s.end_date 
      order by s2.end_date desc
      limit 1)
order by start_date desc;

Untested but you should be able to express this with a single correlated exists clause:

select * 
from job_status s
where s.status='failed' 
and exists (
  select * from job_status s2 
    where s2.job_name = s.job_name 
      and s2.status = 'success' 
      and s2.end_date < s.end_date 
      order by s2.end_date desc
      limit 1)
order by start_date desc;

回复收藏 0 原文

~没有更多了~