Is there a way to calculate a new dataframe column based on values in a(nother) lookup table?

Posted on 2025-01-18 03:34:39

Is it possible to calculate results and add them to an existing dataframe when the calculation is based on a lookup table with a different format?

More precisely:

I have a comma-separated file that I read in as a pandas dataframe. The resulting dataframe contains 38 data columns, plus an additional index column created during the read, over several thousand rows:

[Image: section of dataframe "data"]

My lookup table contains the values on which the calculation is based.
It is also a comma-separated file read in as a pandas dataframe. It contains 24 rows and 6 columns, plus an additional index column:

[Image: lookup table]

Here is the calculation I am trying to implement.
In a new column "M_A" I want to write the result of a calculation like this:

[Image: calculation formula]

where i stands for the corresponding values of C00, C01, C02, ..., C22, C23.

SP, FR, C00, C01, C02 [...] are columns of the "data" dataframe,
while PV, W and RC_A are columns of the lookup table dataframe.
The link between the "data" and "lookup" tables is that the data column names C00, C01, C02, ... appear as values in column "C" of the lookup table. The lookup values should be taken from the row whose "C" entry matches the data column C00, C01, C02, ...

Since iteration is not recommended for datasets of this size, I tried to do it without a loop, but I cannot find the right approach because my lookup table does not have the same length as my data table.

df_data['A_calc'] = ((df_data.T / (df_data.SF * df_data.SP)) * ((df_data.C00 * df_lookup.PV * df_lookup.W * df_lookup.RC_A) + (df_data.C01 * df_lookup.PV * df_lookup.W * df_lookup.RC_A) + (df_data.C02 * df_lookup.PV * df_lookup.W * df_lookup.RC_A)+ ...)

This leads to the error message:

AttributeError: 'DataFrame' object has no attribute 'PTU_Airdensity_recalc'
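
The error names a column that does not appear in the simplified line above, which suggests that the real script uses attribute access (df.name) for a column that the dataframe in question does not have. The following is only a sketch with made-up data (the frame and its values are assumptions, not the original files). It shows how such an AttributeError arises and why bracket access is the safer form, in particular for a column called "T", since df.T is always the transpose in pandas.

```python
import pandas as pd

# Made-up frame; column names and values are placeholders for illustration.
df = pd.DataFrame({"T": [1.0, 2.0, 3.0], "SP": [3.0, 4.0, 5.0]})

# Attribute access fails for any name that is not a column of this frame.
try:
    df.PTU_Airdensity_recalc
except AttributeError as err:
    message = str(err)

# "T" is a column here, but df.T is still the transpose property.
transposed = df.T      # DataFrame with rows and columns swapped
column_t = df["T"]     # the column itself, as a Series

# Listing the columns shows which names are valid for bracket access.
available = list(df.columns)
```

Checking `df.columns` on both dataframes is a quick way to see which of them actually holds the column named in the message.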

Is there a way to do this in Python with pandas? Maybe even a more elegant one than mine, which I chose mainly to illustrate my intention...

Any suggestions?

Thanks, Swawa

Comments (2)

橘味果▽酱 2025-01-25 03:34:39

As I understand the formula: each column Ci is multiplied by the values PV[i], W[i] and RC_A[i], and the per-column results are then summed.
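
Written out, this reading would correspond to something like the expression below. It is a reconstruction, not the original formula image: the prefactor is taken from the attempted code line in the question (columns T, SF and SP), the sum runs over the 24 lookup rows C00 to C23, and the result name A_calc follows the code rather than the "M_A" mentioned in the text.

```latex
\[
A_{\mathrm{calc}}
  = \frac{T}{\mathrm{SF} \cdot \mathrm{SP}}
    \sum_{i=0}^{23} C_i \,\mathrm{PV}_i \, W_i \,\mathrm{RC\_A}_i
\]
```

Here T, SF, SP and the C_i are taken per data row, while PV_i, W_i and RC_A_i are constants from lookup row i.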

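result = 0

# Add up C_i * PV_i * W_i * RC_A_i over all lookup rows.
# Assumes df_lookup has the default 0..n-1 index, so .loc[i] and .iloc[i] agree.
for i in range(len(df_lookup)):
    result = result + (df_data[df_lookup.loc[i, "C"]] * df_lookup.PV.iloc[i]
                       * df_lookup.W.iloc[i] * df_lookup.RC_A.iloc[i])

# result is now a column (a Series aligned with the rows of df_data)

# then we multiply element-wise
# df_data["T"] is used because df_data.T is the transpose in pandas
df_data['A_calc'] = (df_data["T"] / (df_data.SF * df_data.SP)).multiply(result, axis="index")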

太傻旳人生 2025-01-25 03:34:39

This version is working now. Many thanks to Ran A for the help!

result = 0

# Add up C_i * PV_i * W_i * RC_A_i over all lookup rows.
for i in range(len(df_lookup)):
    result = result + (df_data[df_lookup.loc[i, "C"]] * df_lookup.PV.iloc[i]
                       * df_lookup.W.iloc[i] * df_lookup.RC_A.iloc[i])
    # result is a column

# then we multiply element-wise
# df_data["T"] is used because df_data.T is the transpose in pandas
df_data['A_calc'] = (df_data["T"] / (df_data.SF * df_data.SP)).multiply(result, axis="index")

The last line works once multiply is called as a method, i.e. with "." in front of it instead of "*".
But I am still working on the loop...
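
One possible way to drop the loop, offered as a sketch rather than a tested solution for the real files: build one combined factor per lookup row, label it with the column name from "C", and let pandas align those labels with the data columns. The toy frames and all values below are made up, and only three of the 24 C-columns are shown. The column names T, SF, SP, PV, W and RC_A follow the code in this thread.

```python
import pandas as pd

# Toy stand-ins for the real tables; all values are invented for illustration.
df_data = pd.DataFrame({
    "T":   [10.0, 20.0],
    "SF":  [2.0, 4.0],
    "SP":  [5.0, 1.0],
    "C00": [1.0, 0.0],
    "C01": [2.0, 3.0],
    "C02": [0.0, 1.0],
})
df_lookup = pd.DataFrame({
    "C":    ["C00", "C01", "C02"],
    "PV":   [1.0, 2.0, 3.0],
    "W":    [1.0, 0.5, 2.0],
    "RC_A": [2.0, 1.0, 0.5],
})

# One combined factor PV * W * RC_A per lookup row, indexed by the name in "C".
factors = df_lookup.set_index("C")[["PV", "W", "RC_A"]].prod(axis="columns")

# Pick the C-columns named in the lookup table, scale each column by its
# factor (matched on column label), then sum across columns for every row.
weighted_sum = df_data[factors.index].mul(factors, axis="columns").sum(axis="columns")

# Bracket access for "T": df_data.T would return the transposed frame.
df_data["A_calc"] = df_data["T"] / (df_data["SF"] * df_data["SP"]) * weighted_sum
```

Because the factors are matched to the data columns by label, the different lengths of the two tables do not matter. The lookup table only needs one row per C-column.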
