Reversing scikit-learn LabelEncoder, but with a 2D array dataset
I'm trying to create an automated data pre-processing library, and I want to transform the string data into numerical values so it can be run through ML algorithms. But I can't seem to reverse it back to its original state, which should be relatively simple given that scikit-learn has a built-in "inverse_transform()" method.
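from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()

def transformCatagorical(data):
    # Names of all string (object dtype) columns
    catagorical_data = data.select_dtypes(include=['object']).columns.tolist()
    for cat in catagorical_data:
        # The same `le` instance is re-fitted on every column
        transform = le.fit_transform(data[cat].astype(str))
        data[cat] = transform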
This is our transformation function, which yields good results, as shown here:
[Image: transformed data]
But when we try to reverse it using this function:
def reverse(orig, data):
    cols = get_categorical_columns(orig)
    for col in cols:
        data[col] = le.inverse_transform(data[col])
It transforms the data into a seemingly random, coordinate-like structure. I'm not sure how to explain it without a picture:
[Image: wrongly transformed data]
I've been trying to figure out how/why it's doing this but honestly I'm completely lost. Any help would be appreciated! Thank you!
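For reference, here is my best guess at what is going on, as a minimal sketch rather than a definitive fix. Each call to `fit_transform` re-fits the single shared `le`, replacing its `classes_`, so after the loop it only remembers the vocabulary of the last column it saw. `inverse_transform` then maps every column's codes through that last vocabulary. Keeping one encoder per column should avoid this. The names `transform_categorical`, `encoders`, and the toy DataFrame below are my own assumptions for illustration, not part of the original library.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def transform_categorical(data):
    """Encode every object column in place; return the fitted encoders by column name."""
    encoders = {}
    for col in data.select_dtypes(include=["object"]).columns:
        le = LabelEncoder()  # a fresh encoder per column, so no vocabulary is overwritten
        data[col] = le.fit_transform(data[col].astype(str))
        encoders[col] = le
    return encoders


def reverse(data, encoders):
    """Map the integer codes back to the original strings, column by column."""
    for col, le in encoders.items():
        data[col] = le.inverse_transform(data[col])


# Hypothetical toy data; dtype=object is set explicitly so the string
# columns are picked up by select_dtypes(include=["object"]).
df = pd.DataFrame({
    "color": pd.Series(["red", "blue", "red"], dtype=object),
    "size": pd.Series(["S", "M", "L"], dtype=object),
    "price": [1.0, 2.0, 3.0],
})

encoders = transform_categorical(df)
encoded = df.copy()          # snapshot of the encoded state
reverse(df, encoders)        # df now holds the original strings again
```

The dictionary of encoders has to be kept alongside the encoded data, since it is the only place the code-to-string mapping for each column is stored.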