Changing pandas DataFrame contents inside a function

Posted 2025-02-04 00:52:06

I'm writing a class that does one-hot encoding, but it doesn't work as I expected.

On my main code I have this:

for col in train_x_categorical.columns:
    dataCleaner.addFeatureToBeOneHotEncoded(col)

dataCleaner.applyOneHotEncoding(train_x_categorical)

train_x_categorical.head()

The class method is the following:

def addFeatureToBeOneHotEncoded(self, featureName):
    self._featuresToBeOneHotEncoded.append(featureName)

def applyOneHotEncoding(self, data):
    for feature in self._featuresToBeOneHotEncoded:
        dummies = pd.get_dummies(data[feature])
        dummies.drop(dummies.columns[-1],axis=1,inplace=True) 
        data.drop(feature, axis=1, inplace=True) 
        data = pd.concat([data, dummies], axis=1)
        print(data.columns)

Now, with print(data.columns) I can see that the method works correctly, but when train_x_categorical.head() runs I can't see the effect of the method applyOneHotEncoding.

I don't understand why this is happening or how to fix it.
I thought that since Python passes values by reference, the variable data points to the same object as the variable train_x_categorical, so in the method applyOneHotEncoding I would be working on the same object, but clearly I am wrong.
Can someone explain why my reasoning is wrong and how I can solve the problem?

Comments (1)

謌踐踏愛綪 2025-02-11 00:52:06

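It is because applyOneHotEncoding reassigns its local variable data. The line data = pd.concat([data, dummies], axis=1) makes data refer to a brand-new DataFrame; it does not change the object that train_x_categorical refers to. Python passes object references by value, so in-place changes to the object are visible to the caller, but assigning a new object to the parameter name is not. (On the first loop iteration data is still the caller's object, so the in-place drop does remove that one column from train_x_categorical; everything after the reassignment happens on a separate object.) This is well-known Python behavior. There are a couple of ways around this that I know of. One is to have your method return the value and assign the result in the calling code; the loop is not an obstacle as long as the return comes after the loop has finished. The other option is to put the variable to be updated in a wrapper class and pass that to the method. Then updating the variable that is part of the wrapper class will work.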
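To make the difference between reassigning a parameter and mutating the object it refers to concrete, here is a minimal sketch using plain lists instead of DataFrames. The function names rebind and mutate are made up for this illustration and are not part of the question's code.

```python
def rebind(items):
    # Assignment points the local name at a new list object;
    # the caller's list is left untouched.
    items = items + ["new"]
    return items


def mutate(items):
    # In-place mutation changes the single object that both the
    # caller's name and the local name refer to.
    items.append("new")


original = ["a", "b"]

rebound = rebind(original)
after_rebind = list(original)   # snapshot of the caller's list after rebind()

mutate(original)
after_mutate = list(original)   # snapshot of the caller's list after mutate()

print(after_rebind)
print(after_mutate)
```

The same distinction applies in the question: data.drop(..., inplace=True) is a mutation, while data = pd.concat(...) is a reassignment.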

See this for an exhaustive discussion: How do I pass a variable by reference?
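Applied to the method in the question, the return-based option might look roughly like the sketch below. The DataCleaner class body, its __init__, and the sample data are assumptions made up for the illustration, since the question only shows the two methods; the in-place drops are also swapped for non-in-place ones so the caller's DataFrame is never modified.

```python
import pandas as pd


class DataCleaner:
    """Hypothetical minimal stand-in for the class in the question."""

    def __init__(self):
        self._featuresToBeOneHotEncoded = []

    def addFeatureToBeOneHotEncoded(self, featureName):
        self._featuresToBeOneHotEncoded.append(featureName)

    def applyOneHotEncoding(self, data):
        for feature in self._featuresToBeOneHotEncoded:
            dummies = pd.get_dummies(data[feature])
            # Drop the last dummy column, as the question's code does.
            dummies = dummies.drop(columns=dummies.columns[-1])
            # Build a new frame each time instead of mutating the input.
            data = pd.concat([data.drop(columns=feature), dummies], axis=1)
        # Return only after the loop, so every feature has been encoded.
        return data


# Made-up sample data, for illustration only.
train_x_categorical = pd.DataFrame(
    {"color": ["red", "green", "blue"], "size": ["S", "M", "S"]}
)

dataCleaner = DataCleaner()
for col in train_x_categorical.columns:
    dataCleaner.addFeatureToBeOneHotEncoded(col)

# The caller keeps the result by assigning the return value.
encoded = dataCleaner.applyOneHotEncoding(train_x_categorical)
print(encoded.columns.tolist())
```

Assigning the result back to train_x_categorical instead of encoded would give the behavior the question expected from train_x_categorical.head().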
