Error when reading a CSV file: converting a column from string to float
I am trying to read a CSV file that contains a column, SpType, holding string values. The resulting array has dtype object, but I need it to be float.
Here's the snippet:
data = pd.read_csv("/content/Star3642_balanced.csv")
X_orig = data[["Vmag", "Plx", "e_Plx", "B-V", "SpType", "Amag"]].to_numpy()
Here's what's giving me the error:
X = torch.tensor(X_orig, dtype=torch.float32)
The error reads "can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool."
I tried doing this after reading the csv file, but it didn't help:
data["SpType"] = data.SpType.astype(float)
Can someone please tell me what can be done about this?
Comments (1)
Strings must be encoded as numeric values before the array can be converted to float. The easiest way is one-hot encoding with pandas (pd.get_dummies). This creates many extra columns in this case, but a neural network should handle them without much trouble.
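A minimal sketch of the one-hot approach, assuming the column names from the question. The small DataFrame is a hypothetical stand-in for Star3642_balanced.csv and its values are made up; with the real file you would keep the pd.read_csv call instead.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV from the question; values are made up.
data = pd.DataFrame({
    "Vmag": [5.99, 8.70, 8.26],
    "Plx": [13.73, 2.31, 3.21],
    "e_Plx": [0.58, 1.29, 0.64],
    "B-V": [1.318, -0.045, 0.855],
    "SpType": ["K5III", "B1II", "K5III"],
    "Amag": [16.68, 15.52, 15.79],
})

features = data[["Vmag", "Plx", "e_Plx", "B-V", "SpType", "Amag"]]

# Replace SpType with one indicator column per distinct value;
# the numeric columns pass through unchanged.
encoded = pd.get_dummies(features, columns=["SpType"])

# Cast every column to float32 so the array no longer has dtype object.
X_orig = encoded.astype("float32").to_numpy()

print(encoded.columns.tolist())
print(X_orig.dtype, X_orig.shape)
```

An array produced this way has a plain float32 dtype, so it should be accepted by torch.tensor(X_orig, dtype=torch.float32) from the question.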
Alternatively, you may use sklearn encoders or the category_encoders library. More complex encodings may need to be fitted on the training set only and then applied to the test set, to avoid target leakage.
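As a rough illustration of the fit-on-train, apply-to-test pattern, here is a sketch using sklearn's OrdinalEncoder. The train and test frames are hypothetical and their values are made up. OrdinalEncoder does not look at the target, so it is only a stand-in here; the same pattern is what matters for target-based encoders.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical train/test split; values are made up for illustration.
train = pd.DataFrame({"SpType": ["K5III", "B1II", "K5III", "G5V"]})
test = pd.DataFrame({"SpType": ["B1II", "M0V"]})  # "M0V" never appears in train

# Categories unseen during fit are mapped to -1 instead of raising an error.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)

# Learn the category-to-code mapping from the training set only.
train_codes = encoder.fit_transform(train[["SpType"]])

# Reuse the learned mapping on the test set without refitting.
test_codes = encoder.transform(test[["SpType"]])

print(encoder.categories_)
print(train_codes.ravel(), test_codes.ravel())
```

Note that ordinal codes impose an arbitrary order on the categories, which a neural network may misread as meaningful; one-hot encoding avoids that at the cost of more columns.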