Sumproduct across specific columns in a pandas DataFrame

Posted on 2025-02-03 09:44:56

From the data frame below, I am trying to calculate an Excel-style SUMPRODUCT of columns V1, V2 and V3 against columns S1, S2 and S3.

import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Qty': [100, 150, 200],
                   'Remarks': ['Bad', 'Avg', 'Good'],
                   'V1': [0,1,1],
                   'V2': [1,1,0],
                   'V3': [0,0,1],
                   'S1': [1,0,1],
                   'S2': [0,1,0],
                   'S3': [1,0,1]
            })

I am looking for a way to do this without having to spell out each column's name, like:

df['SP'] = df[['V1', 'S1']].prod(axis=1) + df[['V2', 'S2']].prod(axis=1) + df[['V3', 'S3']].prod(axis=1)

In my real data frame, I have more than 50 columns in each of the 'V' and 'S' categories, so the above approach is not practical.

Any suggestions?

Thanks!

卸妝后依然美 2025-02-10 09:44:56

Filter the S-like and V-like columns, then multiply the S columns by the corresponding V columns and sum the result along the columns axis:

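# raw strings avoid the invalid-escape warning for \d;
# ^...$ restricts the match to names that are exactly S<number> / V<number>
s = df.filter(regex=r'^S\d+$')
v = df.filter(regex=r'^V\d+$')

# .values drops the V labels, so columns are multiplied by position, not by name
df['SP'] = s.mul(v.values).sum(axis=1)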

  Name  Qty Remarks  V1  V2  V3  S1  S2  S3  SP
0    A  100     Bad   0   1   0   1   0   1   0
1    B  150     Avg   1   1   0   0   1   0   1
2    C  200    Good   1   0   1   1   0   1   2

PS: This solution assumes that the S and V columns appear in the same relative order in the original data frame.
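
If the column order cannot be relied on, one option is to pair the columns by their numeric suffix instead of by position. The following is only a sketch, assuming every column of interest is named exactly `V<number>` or `S<number>`; the shuffled `S` columns are made up for illustration and are not part of the question's data.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Qty': [100, 150, 200],
                   'V1': [0, 1, 1], 'V2': [1, 1, 0], 'V3': [0, 0, 1],
                   # S columns deliberately out of order (illustrative only)
                   'S3': [1, 0, 1], 'S1': [1, 0, 1], 'S2': [0, 1, 0]})

# Relabel each block by its numeric suffix: 'V1' -> 1, 'S3' -> 3, ...
v = df.filter(regex=r'^V\d+$').rename(columns=lambda c: int(c[1:]))
s = df.filter(regex=r'^S\d+$').rename(columns=lambda c: int(c[1:]))

# DataFrame.mul aligns on column labels, so V1 meets S1 regardless of position.
# A suffix present on only one side yields NaN, which sum() skips by default.
df['SP'] = v.mul(s).sum(axis=1)

print(df[['Name', 'SP']])
```

Because the multiplication is label-aligned here, `.values` is intentionally not used.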


缘字诀 2025-02-10 09:44:56

You could try something like this:

# need to edit these two lines to work with your larger DataFrame
v_cols = df.columns[3:6]  # ['V1', 'V2', 'V3']
s_cols = df.columns[6:]  # ['S1', 'S2', 'S3']

df['SP'] = (df[v_cols].to_numpy() * df[s_cols].to_numpy()).sum(axis=1)

Edit: an alternative, added after seeing the comment from @ALollz about a MultiIndex making alignment simpler:

df.set_index(['Name', 'Qty', 'Remarks'], inplace=True)
n_cols = df.shape[1] // 2
v_cols = df.columns[:n_cols]
s_cols = df.columns[n_cols:]
df['SP'] = (df[v_cols].to_numpy() * df[s_cols].to_numpy()).sum(axis=1)

You can then reset the index if you prefer:

df.reset_index(inplace=True)

Results:

  Name  Qty Remarks  V1  V2  V3  S1  S2  S3  SP
0    A  100     Bad   0   1   0   1   0   1   0
1    B  150     Avg   1   1   0   0   1   0   1
2    C  200    Good   1   0   1   1   0   1   2
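
For comparison, the MultiIndex idea can also be applied to the columns themselves rather than to the rows. This is a rough sketch and not necessarily what the comment had in mind; it assumes every non-identifier column is named with a single letter followed by a number, and the names `wide`, `kind` and `n` are made up here.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Qty': [100, 150, 200],
                   'Remarks': ['Bad', 'Avg', 'Good'],
                   'V1': [0, 1, 1], 'V2': [1, 1, 0], 'V3': [0, 0, 1],
                   'S1': [1, 0, 1], 'S2': [0, 1, 0], 'S3': [1, 0, 1]})

# Move the identifier columns out of the way (returns a new frame)
wide = df.set_index(['Name', 'Qty', 'Remarks'])

# Split 'V1' -> ('V', 1) so the columns become a two-level MultiIndex
wide.columns = pd.MultiIndex.from_tuples(
    [(c[0], int(c[1:])) for c in wide.columns], names=['kind', 'n'])

# wide['V'] and wide['S'] are both labelled by n, so they align by label
sp = (wide['V'] * wide['S']).sum(axis=1)

# Row order is unchanged by set_index, so assign back by position
df['SP'] = sp.to_numpy()

print(df)
```

With this layout there is no need to count columns or assume that the V block comes before the S block.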


dawn曙光 2025-02-10 09:44:56

If the Vn and Sn columns appear in the same order:

v_cols = df.filter(like='V').columns
s_cols = df.filter(like='S').columns

df['SP2'] = sum([df[[v, s]].prod(axis=1) for v, s in zip(v_cols, s_cols)])

print(df)

  Name  Qty Remarks  V1  V2  V3  S1  S2  S3  SP  SP2
0    A  100     Bad   0   1   0   1   0   1   0    0
1    B  150     Avg   1   1   0   0   1   0   1    1
2    C  200    Good   1   0   1   1   0   1   2    2
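
One thing to watch with `filter(like=...)` is that it does substring matching, so `like='S'` also picks up a previously created `SP` column. In the output above this is harmless because `zip` stops at the shorter list, but it may matter in other layouts. A small sketch of the difference, using a cut-down frame invented for illustration:

```python
import pandas as pd

df = pd.DataFrame({'V1': [0, 1, 1], 'V2': [1, 1, 0],
                   'S1': [1, 0, 1], 'S2': [0, 1, 0],
                   'SP': [0, 1, 1]})  # result column left over from an earlier run

# like= keeps every label that contains the substring 'S'
loose = list(df.filter(like='S').columns)

# an anchored regex keeps only labels of the form S<number>
strict = list(df.filter(regex=r'^S\d+$').columns)

print(loose)   # ['S1', 'S2', 'SP']
print(strict)  # ['S1', 'S2']
```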
