Filtering out orphaned table entries
Suppose there is a table with only two columns (an example is shown below). Every '1' entry should be followed (in the sorted order given below) by a '0'. However, as you can see, in the table, there are some 'orphans' where there are two consecutive '1's.
How can I create a query that returns all the rows, except for the first of any consecutive '1's? (This would reduce the example below from 16 rows to 14)
1 E
0 A
1 T
0 S
1 R
0 E
1 F
0 T
1 G
1 T
0 R
1 X
1 R
0 R
1 E
0 T
I'm going to try to clarify my problem; I think I simplified it too much above. Imagine one table called logs, with four columns:
user (a string containing a username)
machine (a string uniquely identifying various PCs)
type (the event's type: 1 for login and 0 for logout)
time (the time of the event being logged)
[The machine/time pair provides a unique key, as no machine can be logged in or out of twice at the same instant. Presumably an 'ID' column could be artificially created based on machine/time sort if needed.]
The idea is that every login event should be accompanied by a logout event. In an ideal world it would be fairly easy to match logins to logouts, and hence analyse the time spent logged in.
However, in the case of a power cut, the logout will not be recorded. Therefore (considering only one machine's data, sorted by time) if there are two login events in a row, we want to ignore the first login, because we don't have any reliable data from it. This is the problem I am trying to solve.
Comments (4)
Provided, that
Your text says "except for the first of any consecutive", but I think this is what you want. Or, if there can only ever be two in a row, the result is the same.
If you want to preserve double 0's, use the commented clause additionally, but it is probably not needed.
Edit after question edit:
You may want to add an auto-increment column to your data to make this simpler:
Generate (i.e. write) a row number index column in MySQL
Other RDBMS (PostgreSQL, Oracle, SQL Server, ...) have window functions like row_number(), lag() and lead() that make such an operation much easier.
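As a rough illustration of the window-function route on the logs table from the question, here is a minimal sketch run through Python's sqlite3 module (SQLite 3.25 or later is assumed, since older versions lack window functions; MySQL 8.0 and later also provide them). The sample events are made up for the example, and the name next_type is an assumption, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Columns as described in the question; "user" is quoted because it is
# a reserved word in some databases.
conn.execute('CREATE TABLE logs ("user" TEXT, machine TEXT, type INTEGER, time TEXT)')

# Made-up sample events: pc1 never records a logout for the 09:00 login.
conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?)", [
    ("ann", "pc1", 1, "2024-01-01 09:00"),
    ("bob", "pc1", 1, "2024-01-01 10:00"),
    ("bob", "pc1", 0, "2024-01-01 11:00"),
    ("cat", "pc2", 1, "2024-01-01 09:30"),
    ("cat", "pc2", 0, "2024-01-01 09:45"),
    ("dan", "pc2", 1, "2024-01-01 12:00"),
])

# LEAD() looks at the next event on the same machine, ordered by time.
# A login directly followed by another login is dropped. COALESCE keeps
# the last event of each machine, whose next_type is NULL.
QUERY = """
SELECT "user", machine, type, time
FROM (
    SELECT *, LEAD(type) OVER (PARTITION BY machine ORDER BY time) AS next_type
    FROM logs
) AS marked
WHERE NOT (type = 1 AND COALESCE(next_type, 0) = 1)
ORDER BY machine, time
"""

kept = conn.execute(QUERY).fetchall()
for row in kept:
    print(row)
```

Partitioning by machine means a login on one PC is never compared with an event on another, which matches the "considering only one machine's data" condition in the question.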
Assuming you get an id (add column, set column id = record number in database) use:
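One way such an id-based query might look is sketched below, run through Python's sqlite3 module so it can be tried directly. This is not the answerer's original query; the table name tbl and the columns type and val are hypothetical, and the sketch assumes the ids are consecutive with no gaps, so that id + 1 is always the next row.

```python
import sqlite3

# The 16 sample rows from the question, in their given order.
ROWS = [(1, "E"), (0, "A"), (1, "T"), (0, "S"), (1, "R"), (0, "E"),
        (1, "F"), (0, "T"), (1, "G"), (1, "T"), (0, "R"), (1, "X"),
        (1, "R"), (0, "R"), (1, "E"), (0, "T")]

conn = sqlite3.connect(":memory:")
# Hypothetical names; id plays the role of the added record number (1..16).
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, type INTEGER, val TEXT)")
conn.executemany("INSERT INTO tbl (type, val) VALUES (?, ?)", ROWS)

# Drop a row only if it is a 1 and the row with the next id is also a 1.
# Assumes gapless ids; no window functions are needed.
QUERY = """
SELECT t.id, t.type, t.val
FROM tbl AS t
WHERE NOT (
    t.type = 1
    AND EXISTS (SELECT 1 FROM tbl AS n WHERE n.id = t.id + 1 AND n.type = 1)
)
ORDER BY t.id
"""

kept = conn.execute(QUERY).fetchall()
print(len(kept))  # 14 rows remain; ids 9 and 12 are filtered out
```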
Try:
Using a CTE to separate the lag-logic from the selection criteria.
EDIT:
For those poor souls that cannot use CTEs, it is easy to create a view instead:
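The two variants described here could be sketched as follows, again through Python's sqlite3 module (SQLite 3.25 or later assumed). The names tbl, marked, marked_v and next_type are hypothetical, and the data is rows 9 to 14 of the question's example. The sketch uses LEAD rather than LAG, because deciding whether a row is the first of two consecutive 1s requires looking at the following row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, type INTEGER, val TEXT)")
# Rows 9-14 of the question's sample: two orphaned 1s (G and X).
conn.executemany("INSERT INTO tbl (type, val) VALUES (?, ?)",
                 [(1, "G"), (1, "T"), (0, "R"), (1, "X"), (1, "R"), (0, "R")])

# Variant 1: the CTE computes next_type, the outer query only filters.
CTE_QUERY = """
WITH marked AS (
    SELECT id, type, val, LEAD(type) OVER (ORDER BY id) AS next_type
    FROM tbl
)
SELECT id, type, val
FROM marked
WHERE NOT (type = 1 AND COALESCE(next_type, 0) = 1)
ORDER BY id
"""

# Variant 2: the same look-ahead logic stored as a view.
conn.execute("""
CREATE VIEW marked_v AS
SELECT id, type, val, LEAD(type) OVER (ORDER BY id) AS next_type
FROM tbl
""")
VIEW_QUERY = """
SELECT id, type, val
FROM marked_v
WHERE NOT (type = 1 AND COALESCE(next_type, 0) = 1)
ORDER BY id
"""

cte_rows = conn.execute(CTE_QUERY).fetchall()
view_rows = conn.execute(VIEW_QUERY).fetchall()
print(cte_rows == view_rows)
```

Both forms keep the filtering condition identical; only where the look-ahead column is defined changes.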