在Postgresql串扰中,我可以自动化元组部分吗?
我试图将一个高表(只有 3 列表示变量、时间戳和值)转换为宽格式,其中时间戳是索引,列是变量名称,值是新表的值。
在 python/pandas 中,这与 PostgreSQL 的情况类似,
import pandas as pd
df = pd.read_csv("./mydata.csv") # assume timestamp, varname & value columns
df.pivot(index="timestamp", columns="varname", values="value")
存在 crosstab
,到目前为止我有:
SELECT * FROM crosstab(
$$
SELECT
"timestamp",
"varname",
"value"
FROM mydata
ORDER BY "timestamp" ASC, "varname" ASC
$$
) AS ct(
"timestamp" timestamp,
"varname1" numeric,
...
"varnameN" numeric
);
问题是我可能有数十到数百个变量名。类型始终是数字,变量名称的数量不稳定(我们可能需要更多变量或意识到其他变量是不必要的)。
有没有一种方法可以自动执行“ct”部分,以便其他一些查询(例如从mydata中选择不同的“varname”
)生成它,而不是我必须输入存在的每个变量名称?
PS:PSQL 国内版本为 12.9,生产版本为 14.0。原始表中的行数约为 200 万行,但是我将按时间戳和变量名进行过滤,因此可能只有几十万行。过滤后,我得到了大约 50 个独特的变量名,但这将在几周内增加。
I'm trying to do get a tall table (with just 3 columns indicating variable, timestamp and value) into a wide format where timestamp is the index, the columns are the variable names, and the values are the values of the new table.
In python/pandas this would be something along the lines of
import pandas as pd
df = pd.read_csv("./mydata.csv") # assume timestamp, varname & value columns
df.pivot(index="timestamp", columns="varname", values="value")
for PostgreSQL there exists crosstab
, so far I have:
SELECT * FROM crosstab(
$
SELECT
"timestamp",
"varname",
"value"
FROM mydata
ORDER BY "timestamp" ASC, "varname" ASC
$
) AS ct(
"timestamp" timestamp,
"varname1" numeric,
...
"varnameN" numeric
);
The problem is that I can potentially have dozens to hundreds of variable names. The types are always numeric, number of variable names is not stable (we could need more variables or realize that others are not necessary).
Is there a way to automate the "ct" part so that some other query (e.g. select distinct "varname" from mydata
) produces it instead of me having to type in every single variable name present?
PS: PSQL version is 12.9 at home, 14.0 in production. Number of rows in the original table is around 2 million, however I'm going to filter by timestamp and varname, so potentially only a few hundreds of thousands rows. After filtering I got ~50 unique varnames, but that will increase in a few weeks.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论