Filling a dataframe with the Cartesian product of a variable number of input lists

Posted 2025-01-19 13:45:25

I want to create a script that fills a dataframe with the Cartesian product of the parameter values I want to vary across a series of experiments.
My first thought was to use the product function from itertools, but it seems to require a fixed set of input lists.
The output I'm looking for can be generated with this sample:

import itertools

import numpy as np
import pandas as pd

cols = ['temperature', 'pressure', 'power']

l1 = [1, 100, 50.0]
l2 = [1000, 10, np.nan]
l3 = [0, 100, np.nan]

# Use itertools to get the Cartesian product of the lists; each tuple is one variation
data = list(itertools.product(l1, l2, l3))

# Build a dataframe from the tuples, dropping rows that contain NaN
df = pd.DataFrame(data, columns=cols).dropna(axis=0)

However, I would instead like to extract the parameters from dataframes of arbitrary shape and then fill a dataframe with their product, like so (this code doesn't work):

data = [{'parameter':'temperature','value1':1,'value2':100,'value3':50},
        {'parameter':'pressure','value1':1000,'value2':10},
        {'parameter':'power','value1':0,'value2':100},
        ]

df = pd.DataFrame(data)
l = []
cols = []
for i in range(df.shape[0]):
    l.append(df.iloc[i][1:].to_list()) #store the values of each df row to a separate list
    cols.append(df.iloc[i][0]) #store the first value of the row as column header

data = []
for val in itertools.product(l): #ask itertools to parse a list of lists
    data.append(val)

df2 = pd.DataFrame(data, columns=cols).dropna(0)

Can you recommend a way to do this? My goal is to create the final dataframe, so using itertools is not a requirement.

魔法少女 2025-01-26 13:45:25

Another alternative without product (nothing wrong with product, though) could be to use .join() with how="cross" to produce successive cross-products:

df2 = df.T.rename(columns=df.iloc[:, 0]).drop(df.columns[0])
df2 = (
    df2.iloc[:, [0]]
    .join(df2.iloc[:, [1]], how="cross")
    .join(df2.iloc[:, [2]], how="cross")
    .dropna(axis=0)
)

Result:

   temperature pressure power
0            1     1000     0
1            1     1000   100
3            1       10     0
4            1       10   100
9          100     1000     0
10         100     1000   100
12         100       10     0
13         100       10   100
18        50.0     1000     0
19        50.0     1000   100
21        50.0       10     0
22        50.0       10   100
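
The chain of .join() calls above is written out for exactly three parameters. For a parameter table with any number of rows, the same idea can be folded with functools.reduce. This is only a sketch under some assumptions: pandas 1.2 or later (where how="cross" is available), the sample data from the question, and the names wide and frames, which are introduced here for illustration. It also drops the NaN padding from each column before joining, so the values come out as floats rather than the mixed values shown above.

```python
from functools import reduce

import pandas as pd

# Sample parameter table from the question; missing values become NaN.
data = [
    {"parameter": "temperature", "value1": 1, "value2": 100, "value3": 50},
    {"parameter": "pressure", "value1": 1000, "value2": 10},
    {"parameter": "power", "value1": 0, "value2": 100},
]
df = pd.DataFrame(data)

# Transpose so that each parameter becomes a column (hypothetical name: wide).
wide = df.set_index("parameter").T

# One single-column frame per parameter, with the NaN padding removed up front.
frames = [wide[[name]].dropna() for name in wide.columns]

# Fold the frames together with successive cross joins.
df2 = reduce(lambda left, right: left.join(right, how="cross"), frames)

print(df2)
```

Because the NaN values are removed before the joins, no .dropna() is needed afterwards and the result gets a contiguous 0..11 index.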

A more compact version with product:

from itertools import product

df2 = pd.DataFrame(
    product(*df.set_index("parameter", drop=True).itertuples(index=False)),
    columns=df["parameter"]
).dropna(axis=0)
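
The key difference from the attempt in the question is the * in front of the iterable: itertools.product(l) receives the outer list as a single argument and yields 1-tuples, while itertools.product(*l) passes each inner list as its own argument. A variation on the version above, offered as a sketch rather than the one right way, removes the NaN padding from each row before taking the product, so fewer combinations are generated in the first place. The names params and value_lists are introduced here for illustration only.

```python
from itertools import product

import pandas as pd

# Sample parameter table from the question; missing values become NaN.
data = [
    {"parameter": "temperature", "value1": 1, "value2": 100, "value3": 50},
    {"parameter": "pressure", "value1": 1000, "value2": 10},
    {"parameter": "power", "value1": 0, "value2": 100},
]
df = pd.DataFrame(data)

params = df.set_index("parameter")

# One list of candidate values per parameter, skipping the NaN padding.
value_lists = [row.dropna().tolist() for _, row in params.iterrows()]

# Unpack with * so that each inner list is a separate argument to product().
df2 = pd.DataFrame(product(*value_lists), columns=params.index.tolist())

print(df2)
```

Since iterrows() returns each row as a float Series here, the resulting columns are floats; cast them with .astype() if integer columns are needed.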
