Changing pandas DataFrame contents inside a function
I'm writing a class that does one-hot encoding, but it doesn't work as I expected.
In my main code I have this:
for col in train_x_categorical.columns:
    dataCleaner.addFeatureToBeOneHotEncoded(col)
dataCleaner.applyOneHotEncoding(train_x_categorical)
train_x_categorical.head()
The class methods are the following:
def addFeatureToBeOneHotEncoded(self, featureName):
    self._featuresToBeOneHotEncoded.append(featureName)

def applyOneHotEncoding(self, data):
    for feature in self._featuresToBeOneHotEncoded:
        dummies = pd.get_dummies(data[feature])
        dummies.drop(dummies.columns[-1], axis=1, inplace=True)
        data.drop(feature, axis=1, inplace=True)
        data = pd.concat([data, dummies], axis=1)
    print(data.columns)
With print(data.columns) I can see that the method works correctly, but when train_x_categorical.head() runs I can't see the effect of applyOneHotEncoding.
I don't understand why this is happening or how to fix it.
I thought that since Python passes values by reference, the variable data points to the same object as the variable train_x_categorical, so inside applyOneHotEncoding I would be working on the same object, but clearly I am wrong.
Can someone explain why my reasoning is wrong and how I can solve the problem?
Comments (1)
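This happens because the line data = pd.concat([data, dummies], axis=1) in applyOneHotEncoding rebinds the local name data to a new DataFrame. Python passes object references by value: in-place changes to the object are visible to the caller, but assigning a new object to the parameter is not, so train_x_categorical keeps pointing at the original DataFrame. This is well-known Python behavior. There are a couple of ways around it that I know of. One is to have your method return the value and assign the result in the calling code; because data is reassigned on every pass through the loop, the return has to come after the loop finishes. The other option is to put the variable to be updated in a wrapper class and pass that to the method; updating the attribute on the wrapper object is then visible to the caller. For an exhaustive discussion, see: How do I pass a variable by reference?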
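As a rough illustration, the sketch below first shows that reassigning a parameter leaves the caller's DataFrame untouched, then shows the return-value approach. The DataCleaner class is a hypothetical reconstruction (its constructor is not shown in the question), the sample data is invented, rebind is a helper made up for the demonstration, and prefix=feature is an added assumption to keep the dummy column names distinct.

```python
import pandas as pd


def rebind(data):
    # pd.concat builds a new DataFrame. Assigning it to `data` only changes
    # what the local name refers to; the caller's object is not modified.
    data = pd.concat([data, data.add_suffix("_copy")], axis=1)


class DataCleaner:
    """Hypothetical reconstruction of the asker's class."""

    def __init__(self):
        self._featuresToBeOneHotEncoded = []

    def addFeatureToBeOneHotEncoded(self, featureName):
        self._featuresToBeOneHotEncoded.append(featureName)

    def applyOneHotEncoding(self, data):
        # Build up a new DataFrame and return it once the loop is done,
        # instead of relying on reassignment of the parameter.
        result = data
        for feature in self._featuresToBeOneHotEncoded:
            # prefix=feature is an assumption added for readable column names.
            dummies = pd.get_dummies(result[feature], prefix=feature)
            dummies = dummies.drop(columns=dummies.columns[-1])
            result = pd.concat([result.drop(columns=feature), dummies], axis=1)
        return result


# Invented sample data, for illustration only.
train_x_categorical = pd.DataFrame(
    {"color": ["red", "blue", "red"], "size": ["S", "M", "S"]}
)

rebind(train_x_categorical)
columns_after_rebind = list(train_x_categorical.columns)  # still the original columns

dataCleaner = DataCleaner()
for col in train_x_categorical.columns:
    dataCleaner.addFeatureToBeOneHotEncoded(col)

# The caller decides what to do with the result, e.g. reassign its own name.
encoded = dataCleaner.applyOneHotEncoding(train_x_categorical)
print(list(encoded.columns))
```

In the calling code, writing train_x_categorical = dataCleaner.applyOneHotEncoding(train_x_categorical) would replace the original variable with the encoded frame. This version also avoids inplace=True, so the input DataFrame is left as it was.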