SQL query: replace NULL values in rows with the previous known value
I have 2 columns:
date number
---- ------
1 3
2 NULL
3 5
4 NULL
5 NULL
6 2
.......
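I need to replace each NULL with the last known value from an earlier date. For example: date=2 should get number=3, and dates 4 and 5 should both get number=5. The NULL values appear at random.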
If you are using SQL Server, this should work:
Here's a MySQL solution:
This is concise, but won't necessarily work in other brands of RDBMS. For other brands, there might be a brand-specific solution that is more relevant. That's why it's important to tell us the brand you're using.
It's nice to be vendor-independent, as @Pax commented, but failing that, it's also nice to use your chosen brand of database to its fullest advantage.
Explanation of the above query:
@n is a MySQL user variable. It starts out NULL, and is assigned a value on each row as the UPDATE runs through the rows. Where number is non-NULL, @n is assigned the value of number. Where number is NULL, the COALESCE() defaults to the previous value of @n. In either case, this becomes the new value of the number column and the UPDATE proceeds to the next row. The @n variable retains its value from row to row, so subsequent rows get values that come from the prior row(s). The order of the UPDATE is predictable because of MySQL's special use of ORDER BY with UPDATE (this is not standard SQL).
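The MySQL statement itself did not survive on this page, so here is only a sketch of the logic the explanation describes, written in Python rather than MySQL. The function name fill_forward and the tuple layout are assumptions made for illustration; the variable n stands in for @n.

```python
from typing import Iterable, List, Optional, Tuple

Row = Tuple[int, Optional[int]]  # (date, number); None plays the role of NULL


def fill_forward(rows: Iterable[Row]) -> List[Row]:
    """Carry the last non-NULL number forward, visiting rows in date order."""
    n: Optional[int] = None  # stands in for @n, which starts out NULL
    filled: List[Row] = []
    for date, number in sorted(rows, key=lambda r: r[0]):  # like ORDER BY date
        # Same effect as COALESCE(number, @n): keep number if set, else the previous n.
        n = number if number is not None else n
        filled.append((date, n))  # n becomes the new value of the number column
    return filled


rows = [(1, 3), (2, None), (3, 5), (4, None), (5, None), (6, 2)]
print(fill_forward(rows))
```

A leading NULL (one with no earlier known value) stays NULL, which matches a variable that starts out NULL.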
The best solution is the one offered by Bill Karwin. I recently had to solve this in a relatively large result set (1000 rows with 12 columns, each needing this type of "show me the last non-null value if this value is null on the current row"), and using the update method with a TOP 1 select for the previous known value (or a subquery with a TOP 1) ran super slow.
I am using SQL 2005 and the syntax for variable replacement is slightly different from MySQL's:
The first SET statement updates the variable @n to the current row's value of 'number' if 'number' is not null (COALESCE returns the first non-null argument you pass into it).
The second SET statement updates the actual column value for 'number' to itself (if not null) or to the variable @n (which always contains the last non-NULL value encountered).
The beauty of this approach is that no additional resources are spent scanning the temporary table over and over again. The in-row update of @n takes care of tracking the last non-null value.
I don't have enough rep to vote his answer up, but someone should. It's the most elegant and the best performing.
Here is the Oracle solution (10g or higher). It uses the analytic function last_value() with the ignore nulls option, which substitutes the last non-null value for the column.
The following script solves this problem and uses only plain ANSI SQL. I tested this solution on SQL Server 2008, SQLite3 and Oracle 11g.
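The script from this answer is missing from the page. As a stand-in, here is one way to do it with nothing but correlated subqueries, run against SQLite through Python's sqlite3 module. The table name t is an assumption; the column names follow the question. This is a sketch, not necessarily the query the answer used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date INTEGER PRIMARY KEY, number INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 3), (2, None), (3, 5), (4, None), (5, None), (6, 2)],
)

# For every row, find the latest date at or before it that has a known number,
# then read the number stored on that date.
QUERY = """
SELECT cur.date,
       (SELECT prev.number
          FROM t AS prev
         WHERE prev.date = (SELECT MAX(p.date)
                              FROM t AS p
                             WHERE p.date <= cur.date
                               AND p.number IS NOT NULL)) AS number
  FROM t AS cur
 ORDER BY cur.date
"""
filled = conn.execute(QUERY).fetchall()
print(filled)
```

Rows that come before the first known value stay NULL, because the inner MAX() finds no qualifying date.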
If you're looking for a solution for Redshift, this will work with the frame clause:
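The Redshift query is missing from this page, and I can't run Redshift here. The sketch below shows a related window-function technique with an explicit frame clause, run on SQLite through Python's sqlite3 module (it assumes SQLite 3.25 or later, which added window functions). It is not the answer's query: it swaps in a running COUNT() to group each NULL with the last known value before it. The table name t and the alias grp are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date INTEGER PRIMARY KEY, number INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 3), (2, None), (3, 5), (4, None), (5, None), (6, 2)],
)

# COUNT(number) skips NULLs, so the running count only increases on rows that
# carry a value. Each known value and the NULL rows after it share one grp,
# and MAX() over that group returns the single non-NULL value in it.
QUERY = """
WITH grouped AS (
    SELECT date,
           number,
           COUNT(number) OVER (
               ORDER BY date
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS grp
      FROM t
)
SELECT date,
       MAX(number) OVER (PARTITION BY grp) AS number
  FROM grouped
 ORDER BY date
"""
filled = conn.execute(QUERY).fetchall()
print(filled)
```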
I know this is a very old thread, but I came across it while troubleshooting my own problem :) I just realised that the others have given somewhat complex solutions to the above problem. Please see my solution below:
Hope this helps someone :)
First of all, do you really need to store the values? You may just use a view that does the job:
If you really do have the ID ("date") column and it is a primary key (clustered), then this query should be pretty fast. But check the query plan: it might be better to have a covering index that includes the Val column as well.
Also, if you prefer to avoid procedures when you can, you can use a similar query for UPDATE:
NOTE: the code is meant to run on SQL Server.
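Neither the view nor the UPDATE from this answer survived on the page. The sketch below shows the same two ideas (a view that fills values on read, and a similar UPDATE that stores them), but in SQLite syntax through Python's sqlite3 module rather than SQL Server. The table name t and the view name t_filled are assumptions; id and val follow the column names the answer mentions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 3), (2, None), (3, 5), (4, None), (5, None), (6, 2)],
)

# Read-only variant: the view computes the filled value on the fly,
# so nothing extra has to be stored.
conn.execute("""
CREATE VIEW t_filled AS
SELECT cur.id,
       COALESCE(cur.val,
                (SELECT prev.val
                   FROM t AS prev
                  WHERE prev.id < cur.id
                    AND prev.val IS NOT NULL
                  ORDER BY prev.id DESC
                  LIMIT 1)) AS val
  FROM t AS cur
""")
from_view = conn.execute("SELECT id, val FROM t_filled ORDER BY id").fetchall()

# Stored variant: the same lookup, used to overwrite only the NULL rows.
conn.execute("""
UPDATE t
   SET val = (SELECT prev.val
                FROM t AS prev
               WHERE prev.id < t.id
                 AND prev.val IS NOT NULL
               ORDER BY prev.id DESC
               LIMIT 1)
 WHERE val IS NULL
""")
stored = conn.execute("SELECT id, val FROM t ORDER BY id").fetchall()
print(from_view, stored)
```

On SQL Server the LIMIT 1 lookups would be written with TOP 1 instead.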
This is the solution for MS Access.
The example table is called tab, with fields id and val.
This will work on Snowflake (credit to Darren Gardner):
In case you have one identity (Id) column and one common (Type) column:
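The query for this case is missing from the page. Reading it as "fill each NULL from the previous row of the same Type, ordered by Id", a sketch on SQLite through Python's sqlite3 module could look like the following. The table name readings, the Val column and the sample data are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (Id INTEGER PRIMARY KEY, Type TEXT, Val INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(1, "a", 10), (2, "b", 20), (3, "a", None),
     (4, "b", None), (5, "a", 30), (6, "b", None)],
)

# Same look-back as the single-column case, with one extra condition:
# only earlier rows of the same Type are candidates.
QUERY = """
SELECT cur.Id,
       cur.Type,
       COALESCE(cur.Val,
                (SELECT prev.Val
                   FROM readings AS prev
                  WHERE prev.Type = cur.Type
                    AND prev.Id < cur.Id
                    AND prev.Val IS NOT NULL
                  ORDER BY prev.Id DESC
                  LIMIT 1)) AS Val
  FROM readings AS cur
 ORDER BY cur.Id
"""
filled = conn.execute(QUERY).fetchall()
print(filled)
```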
Try this: