
Python pandas calculated-columns

How to perform an operation on multiple pandas columns without using column names?

Posted on 2025-02-03 18:18:25


I have a dataset with a large number of columns. I want to perform the same computation on every column, sum the results into a single value per row, and add that value as a new column.

For example, I have a data frame like the one below:

      A1       A2       A3      ...   A120
0    0.12     0.03     0.43     ...   0.56
1    0.24     0.53     0.01     ...   0.98
.     ...       ...     ...     ...    ...
200   0.11     0.22     0.31     ...   0.08

I want to construct a similar data frame with a new column calc, where

calc = (A1**2 - A1) + (A2**2 - A2) + ... + (A120**2 - A120)

The final data frame should look like this:

      A1       A2       A3      ...   A120   calc
0    0.12     0.03     0.43     ...   0.56    x
1    0.24     0.53     0.01     ...   0.98    y
.     ...       ...     ...     ...    ...   ...
200   0.11     0.22     0.31     ...   0.08    n

I tried to do this in Python as follows:

import pandas as pd

df = pd.read_csv('sample.csv')

def construct_matrix():
    temp_sumsqc = 0
    for i in range(len(df.columns)):
        column_name_construct = 'A'+f'{i}'
        temp_sumsqc += df[column_name_construct] ** 2 - (df[column_name_construct])
    df["sumsqc"] = temp_sumsqc


construct_matrix()
print(df.to_string())

But this throws KeyError: 'A1'.

It is difficult to do df["A1"]**2 - df["A1"] + df["A2"]**2 - df["A2"] + ... since there are 120 columns.

Since the way I attempted didn't work, I wonder whether there's a better way to do this?
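
A minimal sketch of a loop that sidesteps the failed lookup, assuming every column in the frame is numeric and should take part in the sum. Instead of building names such as 'A' + str(i) (which begins at 'A0' when i comes from range(len(df.columns))), it iterates over df.columns directly, so only labels that actually exist are used. The three-column frame and the helper name add_calc_column are illustrative assumptions, not part of the original post.

```python
import pandas as pd

# Illustrative stand-in for sample.csv (hypothetical data, three columns only).
df = pd.DataFrame({"A1": [0.12, 0.24], "A2": [0.03, 0.53], "A3": [0.43, 0.01]})


def add_calc_column(frame: pd.DataFrame, target: str = "calc") -> pd.DataFrame:
    """Return a copy of frame with an extra column: sum over columns of col**2 - col."""
    total = pd.Series(0.0, index=frame.index)
    for name in frame.columns:  # real labels, no name construction
        total += frame[name] ** 2 - frame[name]
    result = frame.copy()
    result[target] = total
    return result


out = add_calc_column(df)
print(out)
```

Returning a copy rather than mutating a global df keeps the helper easy to test; assigning to df in place would work just as well for a one-off script.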


要走就滚别墨迹 2025-02-10 18:18:25

There is no need for a for loop; a vectorized approach works here:

df['calc'] = df.pow(2).sub(df).sum(axis=1)
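
A runnable sketch of the same vectorized idea on a tiny frame. It assumes the frame may also contain a non-numeric column (the hypothetical label column below), on which squaring would typically raise an error, so the numeric columns are selected first with select_dtypes. The data are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "label": ["r0", "r1"],  # hypothetical non-numeric column
    "A1": [0.12, 0.24],
    "A2": [0.03, 0.53],
})

# Keep only the numeric columns, then compute sum(col**2 - col) row by row.
num = df.select_dtypes(include="number")
df["calc"] = num.pow(2).sub(num).sum(axis=1)
print(df)
```

If every column is known to be numeric, the selection step can be dropped and the one-liner applied to df directly.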


对你再特殊 2025-02-10 18:18:25

You can use df.apply to execute code for each column, and then use sum(axis=1) to sum the resulting values across columns:

df['sumsqc'] = df.apply(lambda col: (col ** 2) - col).sum(axis=1)

Output:

>>> df
       A1    A2    A3  A120  sumsqc
0    0.12  0.03  0.43  0.56 -0.6262
1    0.24  0.53  0.01  0.98 -0.4610
200  0.11  0.22  0.31  0.08 -0.5570

Note that A1**2 - A1 is equivalent to A1 * (A1 - 1), so you could do

df['sumsqc'] = df.apply(lambda col: col * (col - 1)).sum(axis=1)
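
The identity A**2 - A = A * (A - 1) can be checked numerically. The sketch below uses illustrative random data (an assumption, not the poster's file) to compare the apply-based version with the factored form written directly on the whole frame, which needs no apply at all.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
cols = [f"A{i}" for i in range(1, 6)]  # A1..A5, a small stand-in for A1..A120
df = pd.DataFrame(rng.random((4, 5)), columns=cols)

# Column-wise apply, as in the answer above.
via_apply = df.apply(lambda col: col ** 2 - col).sum(axis=1)

# Factored form applied to the whole frame at once.
factored = (df * (df - 1)).sum(axis=1)

print(np.allclose(via_apply, factored))
```

The comparison uses np.allclose rather than exact equality because the two forms round differently in floating point.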

About the author: 萝莉病