SQL (SQLite): count of NULL fields across all columns
I've got a table called datapoints with about 150 columns and 2600 rows. I know 150 columns is too many, but I got this db after importing a csv and it is not possible to reduce the number of columns.
I have to get some statistical stuff out of the data. E.g. one question would be:
Give me the total number of fields (of all columns), which are null. Does somebody have any idea how I can do this efficiently?
For one column it isn't a problem:
SELECT COUNT(*) FROM datapoints tb1 WHERE tb1.column1 IS NULL;
But how can I solve this for all columns together, without doing it by hand for every column?
Best,
Michael
Comments (4)
Building on Lamak's idea, how about this: multiply COUNT(*) by N and subtract the sum of COUNT(column) over all columns, where N is the number of columns. The trick will be in making the summation series of COUNT(column), but that shouldn't be too terrible with a good text editor and/or spreadsheet.
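A minimal sketch of that formula, written with Python's sqlite3 module so the COUNT(column) series is built from the table definition instead of by hand. The three-column in-memory table and its rows are made up for illustration, and quote_ident is a hypothetical helper. The real datapoints table would be opened from its own database file.

```python
import sqlite3

# Hypothetical stand-in for the real datapoints table (3 columns instead of ~150).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datapoints (column1, column2, column3)")
conn.executemany(
    "INSERT INTO datapoints VALUES (?, ?, ?)",
    [(1, None, "a"), (None, None, "b"), (3, 4, None)],
)

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(datapoints)")]


def quote_ident(name):
    # Double-quote identifiers so names with spaces or keywords still work.
    return '"' + name.replace('"', '""') + '"'


# N * COUNT(*) is the total number of fields. Each COUNT(column) counts only
# the non-NULL fields of that column, so the difference is the NULL total.
count_sum = " + ".join(f"COUNT({quote_ident(c)})" for c in columns)
query = f"SELECT {len(columns)} * COUNT(*) - ({count_sum}) FROM datapoints"

total_nulls = conn.execute(query).fetchone()[0]
print(total_nulls)  # prints 4 for the sample rows above
```

The generated query is a single statement, so the table is scanned once no matter how many columns it has.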
I don't think there is an easy way to do it. I'd get started on the 150 queries. You only have to replace one word (the column name) each time.
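If the 150 queries really are the route taken, the word replacement can at least be left to a loop. This is only a sketch under assumptions: the in-memory table, its three columns, and the sample rows are invented, and the column names are read with PRAGMA table_info rather than typed.

```python
import sqlite3

# Hypothetical sample data standing in for the imported csv.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datapoints (column1, column2, column3)")
conn.executemany(
    "INSERT INTO datapoints VALUES (?, ?, ?)",
    [(1, None, "a"), (None, None, "b"), (3, 4, None)],
)

columns = [row[1] for row in conn.execute("PRAGMA table_info(datapoints)")]

# Run the asker's single-column query once per column, swapping in the name.
nulls_per_column = {}
for name in columns:
    quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier
    sql = f"SELECT COUNT(*) FROM datapoints WHERE {quoted} IS NULL"
    nulls_per_column[name] = conn.execute(sql).fetchone()[0]

total_nulls = sum(nulls_per_column.values())
print(nulls_per_column, total_nulls)
```

This issues one query per column, which is slower than a single combined query but also gives the per-column breakdown for free.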
Well, COUNT (and most aggregate functions) ignore NULL values. In your case, since you are using COUNT(*), it counts every row in the table, but you can do that on any column: COUNT(column1) counts only the rows where column1 is not NULL, so the number of NULLs in a column is COUNT(*) minus COUNT(column1).
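To see the difference between COUNT(*) and COUNT(column) concretely, here is a small sketch using Python's sqlite3 module. The two-column table and its three rows are assumptions chosen only to make the counts easy to check.

```python
import sqlite3

# Made-up table: column1 has one NULL, column2 has two.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datapoints (column1, column2)")
conn.executemany(
    "INSERT INTO datapoints VALUES (?, ?)",
    [(1, None), (None, None), (3, 4)],
)

# COUNT(*) counts rows; COUNT(column) skips rows where that column is NULL.
total_rows, non_null_1, non_null_2 = conn.execute(
    "SELECT COUNT(*), COUNT(column1), COUNT(column2) FROM datapoints"
).fetchone()

# The NULL count per column is the gap between the two numbers.
null_1 = total_rows - non_null_1
null_2 = total_rows - non_null_2
print(total_rows, non_null_1, non_null_2)  # prints 3 2 1
print(null_1, null_2)  # prints 1 2
```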
To get started it's often helpful to use a visual query tool to generate a field list and then use cut/paste/search/replace or manipulation in a spreadsheet program to transform it into what is needed. To do it all in one step you can use a single query that adds up one SUM(CASE WHEN field IS NULL THEN 1 ELSE 0 END) term per field.
With a visual query builder you can quickly generate the comma-separated field list.
You can then replace the comma with all the text that needs to appear between two field names, followed by fixing up the first and last fields. So in the example search for "," and replace with " IS NULL THEN 1 ELSE 0 END) + SUM(CASE WHEN " and then fix up the first and last fields. The test has to be written as IS NULL: the form CASE field WHEN NULL never matches, because a comparison with NULL is never true.
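The search-and-replace recipe can be sketched in a few lines of Python to check that it produces a valid query. The field list, table contents, and variable names here are hypothetical; in practice the list would come from the query builder, or from PRAGMA table_info.

```python
import sqlite3

# Invented sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datapoints (column1, column2, column3)")
conn.executemany(
    "INSERT INTO datapoints VALUES (?, ?, ?)",
    [(1, None, "a"), (None, None, "b"), (3, 4, None)],
)

# The comma-separated field list a query builder might produce.
field_list = "column1,column2,column3"

# Step 1: replace every comma with the text that belongs between two fields.
middle = field_list.replace(",", " IS NULL THEN 1 ELSE 0 END) + SUM(CASE WHEN ")

# Step 2: fix up the first and last fields with the opening and closing text.
expression = "SUM(CASE WHEN " + middle + " IS NULL THEN 1 ELSE 0 END)"

query = "SELECT " + expression + " FROM datapoints"
total_nulls = conn.execute(query).fetchone()[0]
print(query)
print(total_nulls)  # prints 4 for the sample rows above
```

One caveat with this form: on an empty table SUM returns NULL rather than 0, so the total comes back as None.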