Specific columns in a pandas DataFrame

Posted 2025-02-03 09:44:56

From the data frame below, I am trying to calculate an Excel-style SUMPRODUCT of columns V1, V2 and V3 against columns S1, S2 and S3.

import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Qty': [100, 150, 200],
                   'Remarks': ['Bad', 'Avg', 'Good'],
                   'V1': [0, 1, 1],
                   'V2': [1, 1, 0],
                   'V3': [0, 0, 1],
                   'S1': [1, 0, 1],
                   'S2': [0, 1, 0],
                   'S3': [1, 0, 1]})

I am looking for a way to do this without having to spell out each column's name, like:

df['SP'] = df[['V1', 'S1']].prod(axis=1) + df[['V2', 'S2']].prod(axis=1) + df[['V3', 'S3']].prod(axis=1)

In my real data frame, I have more than 50 columns in each of the 'V' and 'S' categories, so the above approach is not practical.

Any suggestions?

Thanks!

Comments (3)

卸妝后依然美 2025-02-10 09:44:56

Filter the S-like and V-like columns, multiply the S columns by the corresponding V columns, and sum the result along the columns axis:

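# raw strings avoid the invalid-escape warning for \d
s = df.filter(regex=r'S\d+')
p = df.filter(regex=r'V\d+')

# p.values drops the V labels, so the multiplication is purely positional
df['SP'] = s.mul(p.values).sum(axis=1)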

  Name  Qty Remarks  V1  V2  V3  S1  S2  S3  SP
0    A  100     Bad   0   1   0   1   0   1   0
1    B  150     Avg   1   1   0   0   1   0   1
2    C  200    Good   1   0   1   1   0   1   2

PS: This solution assumes that the S and V columns appear in the same relative order in the original dataframe.
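If the column order cannot be relied on, one option is to pair the columns by their numeric suffix instead of by position. The following is only a sketch under the assumption that the columns are named exactly `V<n>` and `S<n>`; the shuffled S columns and the `suffixes` helper list are made up for illustration and are not part of the answer above.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'C'],
    'Qty': [100, 150, 200],
    'Remarks': ['Bad', 'Avg', 'Good'],
    'V1': [0, 1, 1], 'V2': [1, 1, 0], 'V3': [0, 0, 1],
    # S columns deliberately shuffled to show that position no longer matters
    'S3': [1, 0, 1], 'S1': [1, 0, 1], 'S2': [0, 1, 0],
})

# Numeric suffix of every column named exactly V<n> (assumed naming scheme)
suffixes = [m.group(1) for c in df.columns if (m := re.fullmatch(r'V(\d+)', c))]

v = df[[f'V{n}' for n in suffixes]]
s = df[[f'S{n}' for n in suffixes]]  # raises KeyError if a matching S<n> is missing

# Both blocks are now in the same suffix order, so a positional product is safe
df['SP'] = (v.to_numpy() * s.to_numpy()).sum(axis=1)
print(df)
```

The `KeyError` on a missing `S<n>` is intentional here: it surfaces an unpaired column instead of silently producing a wrong sum.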

缘字诀 2025-02-10 09:44:56

You could try something like this:

# need to edit these two lines to work with your larger DataFrame
v_cols = df.columns[3:6]  # ['V1', 'V2', 'V3']
s_cols = df.columns[6:]  # ['S1', 'S2', 'S3']

df['SP'] = (df[v_cols].to_numpy() * df[s_cols].to_numpy()).sum(axis=1)

Edited to add an alternative after seeing a comment from @ALollz about a MultiIndex making alignment simpler:

df.set_index(['Name', 'Qty', 'Remarks'], inplace=True)
n_cols = df.shape[1] // 2
v_cols = df.columns[:n_cols]
s_cols = df.columns[n_cols:]
df['SP'] = (df[v_cols].to_numpy() * df[s_cols].to_numpy()).sum(axis=1)

You can then reset the index if you prefer:

df.reset_index(inplace=True)

Results:

  Name  Qty Remarks  V1  V2  V3  S1  S2  S3  SP
0    A  100     Bad   0   1   0   1   0   1   0
1    B  150     Avg   1   1   0   0   1   0   1
2    C  200    Good   1   0   1   1   0   1   2
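
The comment about a MultiIndex is not quoted in this thread, so the following is only one possible reading of it: turn the column labels into a two-level index (`'V1'` becomes `('V', '1')`) so that pandas aligns each V column with its S counterpart by label. It assumes every value column is a single letter followed by a number; the name `wide` is made up for this sketch.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Qty': [100, 150, 200],
                   'Remarks': ['Bad', 'Avg', 'Good'],
                   'V1': [0, 1, 1], 'V2': [1, 1, 0], 'V3': [0, 0, 1],
                   'S1': [1, 0, 1], 'S2': [0, 1, 0], 'S3': [1, 0, 1]})

# Move the non-numeric columns out of the way, as in the answer above
wide = df.set_index(['Name', 'Qty', 'Remarks'])

# 'V1' -> ('V', '1'), 'S1' -> ('S', '1'): a two-level column index
wide.columns = pd.MultiIndex.from_tuples([(c[0], c[1:]) for c in wide.columns])

# wide['V'] and wide['S'] both have columns '1', '2', '3',
# so `*` lines them up by label rather than by position
sp = (wide['V'] * wide['S']).sum(axis=1)

# The row order is unchanged, so the values can be written back positionally
df['SP'] = sp.to_numpy()
print(df)
```

Unlike the positional versions, this does not need the column counts to be computed by hand, but it does rely on the naming scheme being consistent.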

dawn曙光 2025-02-10 09:44:56

If your Vn and Sn columns appear in matching order:

v_cols = df.filter(like='V').columns
s_cols = df.filter(like='S').columns

df['SP2'] = sum([df[[v, s]].prod(axis=1) for v, s in zip(v_cols, s_cols)])
print(df)

  Name  Qty Remarks  V1  V2  V3  S1  S2  S3  SP  SP2
0    A  100     Bad   0   1   0   1   0   1   0    0
1    B  150     Avg   1   1   0   0   1   0   1    1
2    C  200    Good   1   0   1   1   0   1   2    2
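
One thing to watch with `filter(like=...)`: it does a substring match, so once an `SP` column exists, `like='S'` selects it as well. In the example above that is harmless because `zip` stops at the shorter list, but on a wider frame it could pair the wrong columns. A hedged sketch of a stricter selection, assuming the columns are named exactly `V<n>` and `S<n>`:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['A', 'B', 'C'],
                   'Qty': [100, 150, 200],
                   'Remarks': ['Bad', 'Avg', 'Good'],
                   'V1': [0, 1, 1], 'V2': [1, 1, 0], 'V3': [0, 0, 1],
                   'S1': [1, 0, 1], 'S2': [0, 1, 0], 'S3': [1, 0, 1],
                   'SP': [0, 1, 2]})  # result column from an earlier answer

# Substring match: also picks up 'SP'
loose = df.filter(like='S').columns.tolist()

# Anchored regex: only names that are exactly S<n> / V<n>
s_cols = df.filter(regex=r'^S\d+$').columns
v_cols = df.filter(regex=r'^V\d+$').columns

df['SP2'] = sum(df[v] * df[s] for v, s in zip(v_cols, s_cols))
print(loose)
print(df)
```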