Changing column format when reading a csv file
I have this csv file (called df.csv):
I read it in using this code:
import pandas as pd
df = pd.read_csv('df.csv')
and I print it out using this code:
print(df)
and the output of the print looks like this:
  employment_type    ltv
0
1
2        Salaried  77.13
3        Salaried   77.4
4        Salaried  76.42
5        Salaried  71.89
As you can see, the first two records are empty.
I check the dataframe info with this code:
print(df.info())
and the output looks like this:
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   employment_type  6 non-null      object
 1   ltv              6 non-null      object
Now, I would expect that:
employment_type would have been read in as object (and that meets my expectations)
ltv would have been read in as float
I guess that the reason why both fields have been read in as objects is because of the first empty record, correct?
Whilst I am happy for employment_type to be read in as an object, how can I read in the ltv field as numeric?
I don't want to modify the format after I have read the file in. I need to find a way to automatically assign the correct format whilst reading in the file: I will have to read in some similar files with hundreds of columns and I can't manually assign the correct format to each column.
I guess that the reason why both fields have been read in as objects is because of the first empty record, correct?
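Yes, pandas is pretty good at inferring data types, and a column that contains a non-numeric string (for example a cell holding only spaces) can't be parsed as int or float, so it falls back to object. Note that info() reports 6 non-null values for both columns, which suggests the "empty" cells are not true missing values; a genuinely empty cell would be read as NaN and ltv would still come out as float64.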
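To check how pandas treats truly empty cells versus cells that only look empty, a small experiment can help. The sketch below is not the asker's actual file: the two CSV strings are made-up stand-ins, and the assumption that the blank cells hold a single space is a guess based on the "6 non-null" line in the info() output.

```python
import io
import pandas as pd

# Hypothetical contents: the first two data rows are truly empty fields.
empty_csv = "employment_type,ltv\n,\n,\nSalaried,77.13\nSalaried,77.4\n"
# Same data, but each "empty" cell holds a single space (an assumption).
blank_csv = "employment_type,ltv\n , \n , \nSalaried,77.13\nSalaried,77.4\n"

empty_df = pd.read_csv(io.StringIO(empty_csv))
blank_df = pd.read_csv(io.StringIO(blank_csv))

# Truly empty cells become NaN, so the numeric column is still inferred as float.
print(empty_df["ltv"].dtype)
print(empty_df["ltv"].notna().sum())

# Whitespace cells are kept as strings, so every row counts as non-null
# and the column can no longer be inferred as numeric.
print(blank_df["ltv"].notna().sum())
print(pd.api.types.is_numeric_dtype(blank_df["ltv"]))
```

If the second case reproduces the "all non-null, not numeric" symptom, the blank rows most likely contain whitespace rather than nothing at all.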
To fix your issue, just remove these empty rows (with dropna) and you can then write
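The code that originally followed is not preserved here. As one possible approach (a sketch, not necessarily what the answer had in mind), the blank marker can be declared as missing at read time through the na_values parameter of read_csv, so the type inference applies to every column without listing them by hand. The sample data and the assumption that blank cells contain exactly one space are illustrative only.

```python
import io
import pandas as pd

# Hypothetical stand-in for df.csv; blank cells are assumed to be a single space.
csv_text = (
    "employment_type,ltv\n"
    " , \n"
    " , \n"
    "Salaried,77.13\n"
    "Salaried,77.4\n"
    "Salaried,76.42\n"
    "Salaried,71.89\n"
)

# Treat a lone space as a missing value while parsing, for all columns at once.
df = pd.read_csv(io.StringIO(csv_text), na_values=[" "])

# Drop the rows in which every column is missing, then renumber the index.
df = df.dropna(how="all").reset_index(drop=True)

print(df.dtypes)
print(df)
```

With the blanks parsed as NaN, ltv is inferred as a float column during the read itself. If the real files use a different blank marker (several spaces, a dash, and so on), that string would need to be added to the na_values list.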