Pandas DataFrame to_datetime() raises an error
Goal:
I read measurement data from a .csv file into a dataframe. Then I prepend the date information from the filename to the time string that is already in the dataframe. The last step is to convert this string, with date and time information, into a datetime object.
First steps that worked:
import pandas as pd
filename = '2022_02_14_data_0.csv'
path = 'C:/Users/ma1075116/switchdrive/100_Schaltag/100_Digitales_Model/Messungen/'
measData = pd.read_csv(path+filename, sep = '\t', header = [0,1], encoding = 'ISO-8859-1')
# add the date to the timestamp string
measData['Timestamp'] = filename[:11]+measData['Timestamp']
Each entry in measData['Timestamp'] is now a string with exactly the following pattern:
'2022_02_14_00:00:06'
Now I want to convert this string to datetime:
measData['Timestamp'] = pd.to_datetime(measData['Timestamp'], format= '%Y_%m_%d_%H:%M:%S')
This raises the error:
ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing
Why do I get this error and how can I avoid it? I am pretty sure that the format is correct.
Edit:
I wrote some sample code that should do exactly the same thing, and it works:
filename = '2022_02_14_data_0.csv'
timestamps = {'Timestamp': ['00:00:00', '00:00:01', '00:00:04']}
testFrame = pd.DataFrame(timestamps)
# add the date to the timestamp string
testFrame['Timestamp'] = filename[:11]+testFrame['Timestamp']
testFrame['Timestamp'] = pd.to_datetime(testFrame['Timestamp'], format= '%Y_%m_%d_%X')
My next step is to check whether all timestamp entries in the dataframe have the same format.
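One way such a format check could look is sketched here. The sample values are made up for illustration (they are not taken from the measurement file), and the regular expression simply mirrors the pattern '2022_02_14_00:00:06' quoted in the question.

```python
import pandas as pd

# Hypothetical sample: two well-formed entries and two malformed ones.
stamps = pd.Series([
    "2022_02_14_00:00:06",
    "2022_02_14_00:00:07",
    "2022_02_14_0:00:08",   # hour has only one digit
    "bad",
])

# Pattern for YYYY_MM_DD_HH:MM:SS; fullmatch requires the whole string to match.
pattern = r"\d{4}_\d{2}_\d{2}_\d{2}:\d{2}:\d{2}"
matches = stamps.str.fullmatch(pattern)

# Entries that deviate from the expected pattern, with their row labels.
offenders = stamps[~matches]
print(offenders)
```

Listing the offending rows rather than only counting them makes it easier to look the values up in the original file.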
Solution:
I do not understand the error, but I found a working solution. I now parse the time directly in read_csv and add the date information from the filename there. This works: the Timestamp column of measData now has the dtype datetime64.
filename = '2022_02_14_data_0.csv'
path = 'C:/Users/ma1075116/switchdrive/100_Schaltag/100_Digitales_Model/Messungen/'
measData = pd.read_csv(path+filename, sep = '\t', header = [0,1],
                       parse_dates=[0], # parse for the time in the first column
                       date_parser = lambda col: pd.to_datetime(filename[:11]+col, format= '%Y_%m_%d_%X'),
                       encoding = 'ISO-8859-1')
Comments (2)
Your format seems to be missing an underscore after day.
This works for me:
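The answer's original snippet did not survive on this page. A minimal sketch of the point it makes, using made-up sample strings in the pattern from the question, might look like this:

```python
import pandas as pd

# Sample strings in the pattern quoted in the question.
stamps = pd.Series(["2022_02_14_00:00:06", "2022_02_14_00:00:07"])

# Format WITHOUT the underscore between day and hour: the strings do not match,
# so with errors="coerce" every entry becomes NaT.
bad = pd.to_datetime(stamps, format="%Y_%m_%d%H:%M:%S", errors="coerce")

# Format WITH the underscore after %d: the strings parse as expected.
good = pd.to_datetime(stamps, format="%Y_%m_%d_%H:%M:%S")

print(bad.isna().all())
print(good)
```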
EDIT:
This works fine for me (measData["Timestamp"] is a pd.Series):
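The code for this case is also missing from the page. The following is a sketch under the assumption that the frame has a single, flat header row, so that selecting the column yields a Series; the time values are the ones from the question's own test code.

```python
import pandas as pd

filename = "2022_02_14_data_0.csv"

# Assumed stand-in for the measurement data, with a flat (single-level) header.
measData = pd.DataFrame({"Timestamp": ["00:00:00", "00:00:01", "00:00:04"]})
measData["Timestamp"] = filename[:11] + measData["Timestamp"]

# With flat columns, selecting one label returns a Series.
column = measData["Timestamp"]
is_series = isinstance(column, pd.Series)

# A Series of strings is parsed element-wise according to the format.
measData["Timestamp"] = pd.to_datetime(column, format="%Y_%m_%d_%H:%M:%S")
print(is_series, measData["Timestamp"].dtype)
```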
The only way I found to reproduce your error is this (measData is a pd.DataFrame):
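The reproducing snippet is not preserved here either. A small sketch of the behaviour described: when pd.to_datetime receives a whole DataFrame, it tries to assemble dates from columns named year, month and day, and it does not apply the format string to the cells. The one-column frame is a made-up example.

```python
import pandas as pd

# A DataFrame (not a Series) that happens to hold the timestamp strings.
frame = pd.DataFrame({"Timestamp": ["2022_02_14_00:00:06"]})

message = None
try:
    # Passing the DataFrame itself triggers the "assemble from columns" code path.
    pd.to_datetime(frame, format="%Y_%m_%d_%H:%M:%S")
except ValueError as err:
    message = str(err)

print(message)
```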
So make sure that what you are putting into to_datetime is a pd.Series. If this does not help, please provide a small sample of your data.
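One plausible way to end up with a DataFrame here is the header = [0,1] argument in the question: two header rows produce MultiIndex columns, and selecting only the top-level label then returns a DataFrame. The sketch below assumes a file layout with a name row and a unit row; the labels '[hh:mm:ss]', 'Temp' and '[degC]' and the data values are hypothetical, since the real file is not shown.

```python
import io
import pandas as pd

filename = "2022_02_14_data_0.csv"

# Hypothetical stand-in for the file: two tab-separated header rows, then data.
raw = "Timestamp\tTemp\n[hh:mm:ss]\t[degC]\n00:00:00\t20.1\n00:00:01\t20.3\n"
measData = pd.read_csv(io.StringIO(raw), sep="\t", header=[0, 1])

# Selecting only the top-level label on MultiIndex columns gives a DataFrame.
selected = measData["Timestamp"]
selected_type = type(selected).__name__

# Selecting the column by position gives a Series, which to_datetime can parse.
time_strings = measData.iloc[:, 0]
parsed = pd.to_datetime(filename[:11] + time_strings, format="%Y_%m_%d_%H:%M:%S")

print(selected_type)
print(parsed)
```

Printing type(measData['Timestamp']) on the real data is a quick way to confirm whether this is what happens there.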
You can do it like this, using datetime.datetime.strptime and apply on the column.
Recreating your dataset:
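The answer's code is missing from this page. A sketch of what recreating the dataset could look like, reusing the filename and time strings from the question's test code:

```python
import pandas as pd

filename = "2022_02_14_data_0.csv"

# Time-of-day strings as in the question's test frame.
measData = pd.DataFrame({"Timestamp": ["00:00:00", "00:00:01", "00:00:04"]})

# Prepend the date part of the filename ('2022_02_14_') to every entry.
measData["Timestamp"] = filename[:11] + measData["Timestamp"]
print(measData)
```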
Applying the desired transformation:
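The transformation code is likewise not preserved. A sketch of the approach the answer names, with the frame rebuilt inline so that the snippet runs on its own:

```python
import datetime
import pandas as pd

# Combined date and time strings, as produced from the filename and the time column.
measData = pd.DataFrame({"Timestamp": [
    "2022_02_14_00:00:00",
    "2022_02_14_00:00:01",
    "2022_02_14_00:00:04",
]})

# Parse each string individually with the standard-library strptime.
measData["Timestamp"] = measData["Timestamp"].apply(
    lambda s: datetime.datetime.strptime(s, "%Y_%m_%d_%H:%M:%S")
)
print(measData["Timestamp"])
```

Because apply calls strptime once per row, this is typically slower on large frames than a single pd.to_datetime call on the Series.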