How to get the Cartesian product of a dictionary in a specific order?

Posted on 2025-02-05 16:58:06

I want to create a DataFrame of the possible combinations from a full factorial design of experiments.
I'm using doepy, but it seems to slow down as the number of combinations in my DOE grows. I switched to taking the Cartesian product with product_dict(), which seems to be faster but gives the combinations in a different order.

I need the DataFrame of DOE combinations to be in the same order as the one given by doepy. I think that's possible, but I'm unsure how.

My question is: how do I compute the Cartesian product of gas_dict and get the results in the same order as doepy?

import pandas as pd
from doepy import build
from tqdm.contrib.itertools import product

gas_dict = {
 'Velocity (m/s)': [0.00000000E+00, 0.10000000E+00, 0.20000000E+00, 0.30000000E+00, 
         0.40000000E+00, 0.60000000E+00, 0.10000000E+01], 
 
 'Pressure (Pa)': [0.10000000E+06, 0.50000000E+06, 0.10000000E+07, 0.20000000E+07, 
                   0.40000000E+07], 
 
 'Temperature': [0.30000000E+03, 0.40000000E+03, 0.50000000E+03, 0.60000000E+03,],
 'Equivalence Ratio': [0.10000000E+00, 0.50000000E+00, 0.60000000E+00, 0.70000000E+00, 
                      0.80000000E+00, 0.90000000E+00, 0.10000000E+01, 0.11000000E+01, 
                      0.12000000E+01, 0.13000000E+01]    }


def product_dict(**kwargs):
    keys = kwargs.keys()
    vals = kwargs.values()
    for instance in product(*vals):
        yield dict(zip(keys, instance))
        
gas = build.full_fact(gas_dict)               # reference row order (from doepy)
gas_product = list(product_dict(**gas_dict))  # row order differs from doepy
gas_product = pd.DataFrame(gas_product)

gas.equals(gas_product)
compare = gas == gas_product

我恋#小黄人 2025-02-12 16:58:07

I think this comes down to the order of the arguments passed to product, which determines how the arrays are cycled through when the Cartesian product is built: product advances its rightmost argument fastest, whereas the doepy output here appears to vary the first factor fastest.
If you reverse the order of the arrays in the input dictionary, the outputs become the same. (I have only checked this empirically for the two implementations used here, so it would be nice if someone had more detailed insight.)

# reverse dictionary keys
reversed_cols = list(gas_dict.keys())[::-1]
# create reversed input dictionary
gas_dict_rev = {c: sorted(gas_dict[c]) for c in reversed_cols}
# create cartesian product with reversed column order
gas_product_rev = list(product_dict(**gas_dict_rev)) 
gas_product_rev = pd.DataFrame(gas_product_rev)
# restore the column order of the original dict
gas_product_rev = gas_product_rev[list(gas_dict.keys())]
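
To make the ordering difference easier to see, here is a minimal sketch on a tiny two-factor dictionary, using only the standard-library itertools.product. The helper name product_first_fastest and the demo dictionary are made up for illustration; they are not part of doepy or of the question's code, and the sketch only shows the reversal trick, not how doepy works internally.

```python
from itertools import product


def product_first_fastest(factors):
    """Yield dicts of the Cartesian product with the FIRST key varying fastest.

    itertools.product advances its rightmost argument fastest, so the keys
    are passed in reversed order and each row is re-keyed in the original order.
    """
    keys = list(factors)
    rev_keys = keys[::-1]
    for combo in product(*(factors[k] for k in rev_keys)):
        row = dict(zip(rev_keys, combo))
        yield {k: row[k] for k in keys}


# Hypothetical two-factor example, small enough to inspect by eye
demo = {"A": [1, 2], "B": [10, 20, 30]}

# Default ordering: last key ("B") varies fastest
default_order = [dict(zip(demo, combo)) for combo in product(*demo.values())]
# Reversed-key ordering: first key ("A") varies fastest
first_fastest = list(product_first_fastest(demo))

print(default_order[:3])
# [{'A': 1, 'B': 10}, {'A': 1, 'B': 20}, {'A': 1, 'B': 30}]
print(first_fastest[:3])
# [{'A': 1, 'B': 10}, {'A': 2, 'B': 10}, {'A': 1, 'B': 20}]
```

Both lists contain the same six rows; only the row order changes, which is why reordering the input keys is enough and no sorting of the resulting DataFrame is needed.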

The output then looks the same visually, but gas.equals(gas_product_rev) still reports False for me. I'm not familiar with this function, but I'd guess it does not take float precision into account. Checking with a NumPy function that allows for a floating-point tolerance gives the expected result:

import numpy as np

for column in gas:
    print(f'{column} {np.allclose(gas[column], gas_product[column])}')

# Velocity (m/s) False
# Pressure (Pa) False
# Temperature False
# Equivalence Ratio False

for column in gas:
    print(f'{column} {np.allclose(gas[column], gas_product_rev[column])}')

# Velocity (m/s) True
# Pressure (Pa) True
# Temperature True
# Equivalence Ratio True
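
As a small illustration of why DataFrame.equals can return False for frames that look identical, the sketch below uses made-up toy frames (not the doepy output). It shows two common causes: tiny floating-point differences and differing column dtypes. Which of these, if either, applies to the frames in the question is an assumption that would need checking, for example by comparing gas.dtypes with gas_product_rev.dtypes.

```python
import numpy as np
import pandas as pd

# Toy frames: the values differ only by floating-point rounding
a = pd.DataFrame({"x": [0.1 + 0.2, 1.0]})
b = pd.DataFrame({"x": [0.3, 1.0]})

print(a.equals(b))                  # False: equals compares values exactly
print(np.allclose(a["x"], b["x"]))  # True: compared within a tolerance

# Raises AssertionError on mismatch; passes here because the comparison
# is done within a tolerance when check_exact=False
pd.testing.assert_frame_equal(a, b, check_exact=False)

# Identical values but different dtypes (int64 vs float64) also fail equals
ints = pd.DataFrame({"x": [1, 2]})
floats = pd.DataFrame({"x": [1.0, 2.0]})
print(ints.equals(floats))          # False: column dtypes must match
```

For a whole-frame check, pd.testing.assert_frame_equal with check_exact=False may be more convenient than looping over columns, since it also reports where the first mismatch occurs.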
