Pandas MultiIndex: subtract based on matching only two index levels

Posted on 2025-01-17 18:31:33

Say I have a pandas DataFrame with a three-level MultiIndex:

import pandas as pd
import numpy as np
arrays = [['UK', 'UK', 'US', 'FR'], ['Firm1', 'Firm1', 'Firm2', 'Firm1'], ['Andy', 'Peter', 'Peter', 'Andy']]
idx = pd.MultiIndex.from_arrays(arrays, names = ('Country', 'Firm', 'Responsible'))
df_3idx = pd.DataFrame(np.random.randn(4,3), index = idx)
df_3idx
                                  0         1         2
Country Firm  Responsible                              
UK      Firm1 Andy         0.237655  2.049636  0.480805
              Peter        1.135344  0.745616 -0.577377
US      Firm2 Peter        0.034786 -0.278936  0.877142
FR      Firm1 Andy         0.048224  1.763329 -1.597279

I also have another pd.DataFrame whose index consists of the unique combinations of the first two index levels (Country and Firm) from the data above:

arrays = [['UK', 'US', 'FR'], ['Firm1', 'Firm2', 'Firm1']]
idx = pd.MultiIndex.from_arrays(arrays, names = ('Country', 'Firm'))
df_2idx = pd.DataFrame(np.random.randn(3,1), index = idx)
df_2idx
                      0
Country Firm           
UK      Firm1 -0.103828
US      Firm2  0.096192
FR      Firm1 -0.686631

I want to subtract from each value in df_3idx the corresponding value in df_2idx. For instance, I want to subtract -0.103828 from every value in the first two rows, because the Country and Firm levels match in both DataFrames.

Does anybody know how to do this? I figured I could simply unstack the first DataFrame and then subtract, but I get an error message:

df_3idx.unstack('Responsible').sub(df_2idx, axis=0)

ValueError: cannot join with no overlapping index names

Unstacking might not be a good solution anyway, as my data is very large and unstacking could take a lot of time.

I would appreciate any help. Many thanks in advance!

嗳卜坏 2025-01-24 18:31:33

Related question, but not focused on MultiIndex

However, that doesn't really matter for the answer. The sub method aligns on the matching index levels.

Use pd.DataFrame.sub with the parameter axis=0:

df_3idx.sub(df_2idx[0], axis=0)

                                  0         1         2
Country Firm  Responsible                              
FR      Firm1 Andy         0.027800  3.316148  0.804833
UK      Firm1 Andy        -2.009797 -1.830799 -0.417737
              Peter       -1.174544  0.644006 -1.150073
US      Firm2 Peter       -2.211121 -3.825443 -4.391965
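
To make the alignment easier to check by hand, here is a minimal sketch that rebuilds both frames with fixed values instead of np.random.randn (the numbers are made up for illustration and are not from the question). It also shows an alternative that looks up the (Country, Firm) key of each row explicitly; that variant is my own assumption about what may be convenient if you want to keep the original row order, since the aligned result above comes back sorted by index.

```python
import numpy as np
import pandas as pd

# Same index layout as in the question, but with deterministic values
# (illustrative data, chosen so the arithmetic is easy to verify).
idx3 = pd.MultiIndex.from_arrays(
    [["UK", "UK", "US", "FR"],
     ["Firm1", "Firm1", "Firm2", "Firm1"],
     ["Andy", "Peter", "Peter", "Andy"]],
    names=("Country", "Firm", "Responsible"),
)
df_3idx = pd.DataFrame(np.arange(12.0).reshape(4, 3), index=idx3)

idx2 = pd.MultiIndex.from_arrays(
    [["UK", "US", "FR"], ["Firm1", "Firm2", "Firm1"]],
    names=("Country", "Firm"),
)
df_2idx = pd.DataFrame([[10.0], [20.0], [30.0]], index=idx2)

# Approach from the answer: pass the single column as a Series so that
# sub aligns it on the shared levels (Country, Firm) along the rows.
result = df_3idx.sub(df_2idx[0], axis=0)

# Alternative sketch: build the (Country, Firm) key for every row, look up
# the matching value, and subtract positionally. Row order is unchanged.
keys = df_3idx.index.droplevel("Responsible")
per_row = df_2idx[0].reindex(keys).to_numpy()
explicit = df_3idx.sub(per_row, axis=0)
```

Both variants avoid unstack. Note that df_2idx[0] is a Series, which is what lets its single column be applied across every column of df_3idx.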
