How can I keep track of how many times a column's value has changed?
I have a table named crewWork, defined as follows:
CREATE TABLE crewWork(
FloorNumber int, AptNumber int, WorkType int, simTime int )
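After populating the table, I need to know how many times the apartment changed and how many times the floor changed. I typically expect about 10 rows per apartment and 40-50 rows per floor. I could write a scalar function for this, but I would like to know whether there is a way to do it in T-SQL without writing a scalar function.
Thanks.
The data looks like this: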
FloorNumber AptNumber WorkType simTime
1 1 12 10
1 1 12 25
1 1 13 35
1 1 13 47
1 2 12 52
1 2 12 59
1 2 13 68
1 1 14 75
1 4 12 79
1 4 12 89
1 4 13 92
1 4 14 105
1 3 12 115
1 3 13 129
1 3 14 138
2 1 12 142
2 1 12 150
2 1 14 168
2 1 14 171
2 3 12 180
2 3 13 190
2 3 13 200
2 3 14 205
3 3 14 216
3 4 12 228
3 4 12 231
3 4 14 249
3 4 13 260
3 1 12 280
3 1 13 295
2 1 14 315
2 2 12 328
2 2 14 346
I only need this information for a report; I don't need to store it anywhere.
If you use the accepted answer as written now (1/6/2023), you get correct results with the OP dataset, but I think you can get wrong results with other data.
CONFIRMED: ACCEPTED ANSWER HAS A MISTAKE (as of 1/6/2023)
I explain the potential for wrong results in my comments on the accepted answer.
In this db<>fiddle, I demonstrate the wrong results. I use a slightly modified form of the accepted answer (my syntax works in both SQL Server and PostgreSQL) and a slightly modified form of the OP's data (I changed two rows). I also demonstrate how the accepted answer can be changed slightly to produce correct results.
The accepted answer is clever but needs a small change to produce correct results, as demonstrated in the db<>fiddle above. Instead of:
COUNT(DISTINCT AptGroup)...
use:
COUNT(DISTINCT CONCAT(AptGroup, '_', AptNumber))...
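To see why counting AptGroup alone can undercount, here is a minimal sketch. It assumes AptGroup is computed as the difference of two ROW_NUMBER() values (the technique described in the other answer on this page), and it uses a hypothetical three-row sample in a table variable named @sample rather than the OP's data. CONCAT needs SQL Server 2012 or later.

```sql
-- Hypothetical sample: apartment 1, then 2, then 1 again (two changes).
DECLARE @sample TABLE (AptNumber int, simTime int);
INSERT INTO @sample (AptNumber, simTime)
VALUES (1, 10), (2, 20), (1, 30);

WITH grouped AS (
    SELECT AptNumber,
           -- Overall position minus position within the same AptNumber.
           ROW_NUMBER() OVER (ORDER BY simTime)
         - ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime) AS AptGroup
    FROM @sample
)
SELECT COUNT(DISTINCT AptGroup) - 1                         AS ChangesByGroupOnly,
       COUNT(DISTINCT CONCAT(AptGroup, '_', AptNumber)) - 1 AS ChangesByGroupAndValue
FROM grouped;
-- AptGroup is 0, 1, 1 for the three rows, so:
-- ChangesByGroupOnly = 1 (wrong), ChangesByGroupAndValue = 2 (correct)
```

The second and third rows belong to different apartments but get the same AptGroup value, so the group number only identifies a run when it is combined with the AptNumber it belongs to.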
DDL:
DML:
ALTERNATIVE ANSWER
The accepted answer may eventually be updated to remove the mistake. If that happens I can remove my warning, but I still want to leave you with this alternative way to produce the answer.
My approach goes like this: "check the previous row; if the value in the previous row differs from the value in the current row, then there is a change". SQL doesn't have the idea of row order, or row-order functions, per se (at least not like in Excel, for example).
Instead, SQL has window functions. You can use the window function RANK plus a self-JOIN technique, as seen here, to combine current-row values and previous-row values so you can compare them. Here is a db<>fiddle showing my approach, which I pasted below. The intermediate table, showing the columns that have a value of 1 if there is a change and 0 otherwise (i.e. FloorChange, AptChange), is shown at the bottom of the post.
DDL:
...same as above...
DML:
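A minimal sketch of the RANK plus self-JOIN idea follows. This is not necessarily the query from the db<>fiddle. It assumes simTime is unique and defines the row order, reuses the column names mentioned in this answer (FloorShift, AptShift, FloorChange, AptChange, joinedOnItself), and loads six rows from the question into a table variable @crewWork so that it runs on its own.

```sql
-- Six consecutive rows from the question (simTime 105 to 150).
DECLARE @crewWork TABLE (FloorNumber int, AptNumber int, WorkType int, simTime int);
INSERT INTO @crewWork (FloorNumber, AptNumber, WorkType, simTime)
VALUES (1, 4, 14, 105), (1, 3, 12, 115), (1, 3, 13, 129),
       (1, 3, 14, 138), (2, 1, 12, 142), (2, 1, 12, 150);

WITH ranked AS (
    -- With unique simTime values, RANK yields 1, 2, 3, ...
    SELECT FloorNumber, AptNumber, simTime,
           RANK() OVER (ORDER BY simTime) AS rnk
    FROM @crewWork
),
joinedOnItself AS (
    -- Pair every row with the row ranked immediately before it.
    SELECT cur.simTime,
           cur.FloorNumber, prev.FloorNumber AS FloorShift,
           cur.AptNumber,   prev.AptNumber   AS AptShift,
           CASE WHEN cur.FloorNumber <> prev.FloorNumber THEN 1 ELSE 0 END AS FloorChange,
           CASE WHEN cur.AptNumber   <> prev.AptNumber   THEN 1 ELSE 0 END AS AptChange
    FROM ranked AS cur
    LEFT JOIN ranked AS prev ON prev.rnk = cur.rnk - 1
)
SELECT SUM(FloorChange) AS FloorChanges,
       SUM(AptChange)   AS AptChanges
FROM joinedOnItself;
-- For these six rows: FloorChanges = 1, AptChanges = 2
```

The first row has no previous row, so its FloorShift and AptShift are NULL and both CASE expressions fall through to 0.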
Below see the first few rows of the intermediate table (joinedOnItself). This shows how my approach works. Note the last two columns, which have a value of 1 when there is a change in FloorNumber compared to FloorShift (noted in FloorChange), or a change in AptNumber compared to AptShift (noted in AptChange).
Note that instead of using the window function RANK and a JOIN, you could use the window function LAG to compare values in the current row to the previous row directly (no need to JOIN). I don't have that solution here, but it is described in the Wikipedia article example.
If I am not missing anything, you could use the following method to find the number of changes:
determine groups of sequential rows with identical values;
count those groups;
subtract 1.
Apply the method individually for AptNumber and for FloorNumber.
The groups could be determined like in this answer, only there isn't a Seq column in your case. Instead, another ROW_NUMBER() expression could be used. Here's an approximate solution:
(I'm assuming here that the simTime column defines the timeline of changes.)
UPDATE
Below is a table that shows how the distinct groups are obtained for AptNumber.
Here RN is a pseudo-column that stands for ROW_NUMBER() OVER (ORDER BY simTime). You can see that this is just a sequence of rankings starting from 1.
Another pseudo-column, RN_Apt, contains values produced by the other ROW_NUMBER, namely ROW_NUMBER() OVER (PARTITION BY AptNumber ORDER BY simTime). It contains rankings within individual groups of identical AptNumber values. You can see that, for a newly encountered value, the sequence starts over, and for a recurring one, it continues where it stopped last time.
You can also see from the table that if we subtract RN from RN_Apt (it could be the other way round; it doesn't matter in this situation), we get a value that uniquely identifies every distinct group of identical AptNumber values. You might as well call that value a group ID.
So, now that we've got these IDs, it only remains for us to count them (count distinct values, of course). That will be the number of groups, and the number of changes is one less (assuming the first group is not counted as a change).
Add an extra column, changecount, and increment its value on each update. If you want to know the count for each field, add a corresponding changecount column for each of them.
Assuming that each record represents a different change, you can find changes per floor by:
And changes per apartment (assuming AptNumber uniquely identifies the apartment) by:
Or (assuming AptNumber and FloorNumber together uniquely identify the apartment) by:
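Under the stated assumption that every row is a separate change, these three cases presumably come down to plain GROUP BY counts. A sketch, in the same order as the three cases, using five rows from the question in a table variable @crewWork; the alias Changes is my own:

```sql
-- Five rows from the question spanning three floors.
DECLARE @crewWork TABLE (FloorNumber int, AptNumber int, WorkType int, simTime int);
INSERT INTO @crewWork (FloorNumber, AptNumber, WorkType, simTime)
VALUES (1, 3, 14, 138), (2, 1, 12, 142), (2, 1, 12, 150),
       (2, 3, 12, 180), (3, 3, 14, 216);

-- Rows per floor.
SELECT FloorNumber, COUNT(*) AS Changes
FROM @crewWork
GROUP BY FloorNumber;

-- Rows per apartment, if AptNumber alone identifies the apartment.
SELECT AptNumber, COUNT(*) AS Changes
FROM @crewWork
GROUP BY AptNumber;

-- Rows per apartment, if FloorNumber and AptNumber together identify it.
SELECT FloorNumber, AptNumber, COUNT(*) AS Changes
FROM @crewWork
GROUP BY FloorNumber, AptNumber;
```

These count rows, not transitions between consecutive rows, so they answer a different question from the window-function approaches unless each row really is one change.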