How to categorize a range of hours in Pandas?

Posted on 2025-01-16 13:46:07

In my project I am trying to create a new column that categorizes records by hour. I have a column in the dataframe called 'TowedTime' with time-of-day values, and I want another column that holds only the full hour, without minutes. For example, if the value in 'TowedTime' is 09:32:10 it should be categorized as 9 AM; if it is 12:45:10 it should be categorized as 12 PM, and so on for all the other values. I've read about the .cut function and its bins parameter, but I can't get the result I want.
[Image: DF 'TowedTime' values]


        import numpy as np
        import pandas as pd
        import matplotlib.pyplot as plt
        import seaborn as sns
        import datetime

        df = pd.read_excel("Baltimore Towing Division.xlsx", sheet_name="TowingData")

        # Abbreviated month and weekday names derived from the tow date
        df['Month'] = pd.DatetimeIndex(df['TowedDate']).strftime("%b")
        df['Week day'] = pd.DatetimeIndex(df['TowedDate']).strftime("%a")

        monthOrder = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
        dayOrder = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

        # Count tows per exact time of day and weekday
        # (this is the step that should be grouped by hour instead)
        pivotHours = pd.pivot_table(df, values='TowedDate', index='TowedTime',
                                    columns='Week day',
                                    fill_value=0,
                                    aggfunc='count',
                                    margins=False, margins_name='Total').reindex(dayOrder, axis=1)

        print(pivotHours)

Comments (2)

丶情人眼里出诗心の 2025-01-23 13:46:07

First, make sure the type of the column 'TowedTime' is datetime. Second, you can easily extract the hour from this data type.

df['TowedTime'] = pd.to_datetime(df['TowedTime'],format='%H:%M:%S')
df['hour'] = df['TowedTime'].dt.hour
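
To get the "9 AM" / "12 PM" style labels the question asks for, the extracted hour can be mapped to a string. The following is only a minimal sketch: the sample values and the helper name `hour_label` are made up for illustration and are not part of the original data or code.

```python
import pandas as pd

# Hypothetical sample standing in for the real 'TowedTime' column (strings).
df = pd.DataFrame({"TowedTime": ["09:32:10", "12:45:10", "00:05:00", "23:59:59"]})

# Parse the strings, then keep only the hour (0-23).
parsed = pd.to_datetime(df["TowedTime"], format="%H:%M:%S")
df["hour"] = parsed.dt.hour


def hour_label(hour: int) -> str:
    """Turn a 24-hour value into a 12-hour label such as '9 AM' or '12 PM'."""
    suffix = "AM" if hour < 12 else "PM"
    # 0 and 12 both display as 12 on a 12-hour clock.
    return f"{(hour % 12) or 12} {suffix}"


df["hour_label"] = df["hour"].map(hour_label)
print(df)
```

Keeping the numeric `hour` column next to the label is convenient, because the numbers sort in clock order while the text labels do not.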

Hope this answers your question.

过期以后 2025-01-23 13:46:07

With the help of @Fabien C I was able to solve the problem.

First, I had to check the data type of the values in the 'TowedTime' column with the dtypes attribute. It turned out to be object.

I then converted 'TowedTime' to datetime and kept only the time part:

df['TowedTime'] = pd.to_datetime(df['TowedTime'],format='%H:%M:%S').dt.time

Then I created a new column in the df that holds only the hour:

df['Hour'] = pd.to_datetime(df['TowedTime'],format='%H:%M:%S').dt.hour

And the result was this:
[image]

As the image shows, the 'TowedTime' column remains an object, but the new 'Hour' column correctly returns the hour value.
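
The object dtype is expected here: `.dt.time` returns plain Python `datetime.time` objects, and pandas stores those in an object column. Below is a small sketch of two ways to pull the hour out of such a column. It assumes the cells are `datetime.time` values without fractional seconds; the sample rows are hypothetical. Whether `pd.to_datetime` accepts `datetime.time` objects directly may depend on the pandas version, so the second option converts to strings first.

```python
import datetime

import pandas as pd

# Hypothetical sample: a column of datetime.time objects, as described above.
df = pd.DataFrame(
    {"TowedTime": [datetime.time(9, 32, 10), datetime.time(12, 45, 10)]}
)
print(df["TowedTime"].dtype)  # object

# Option 1: read the .hour attribute of each time object directly.
df["Hour"] = df["TowedTime"].map(lambda t: t.hour)

# Option 2: convert to 'HH:MM:SS' strings first, then parse and take the hour.
as_text = df["TowedTime"].astype(str)
df["Hour_via_str"] = pd.to_datetime(as_text, format="%H:%M:%S").dt.hour

print(df)
```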

Originally, the dataset already had the date and time in separate columns. I think some method was used to split date and time in Excel, and this made the time ('TowedTime') an object that I could not convert, or at least that is what dtypes shows me.

I tried all of these Pandas methods for converting the object column to datetime:

df['TowedTime'] = pd.to_datetime(df['TowedTime'])

df['TowedTime'] = df['TowedTime'].astype('datetime64[ns]')

df['TowedTime'] = pd.to_datetime(df['TowedTime'], format='%H:%M:%S')
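
Once an 'Hour' column exists, the pivot table from the question can be indexed by hour instead of by the exact time. This is a rough sketch with made-up rows, not the real Baltimore data. The names `day_order` and `pivot_hours` mirror the question's variables, and the weekday abbreviations assume an English locale.

```python
import pandas as pd

# Hypothetical rows standing in for the towing spreadsheet.
df = pd.DataFrame(
    {
        "TowedDate": pd.to_datetime(
            ["2021-01-04", "2021-01-04", "2021-01-05", "2021-01-09"]
        ),
        "TowedTime": ["09:32:10", "09:05:00", "12:45:10", "09:59:59"],
    }
)

df["Week day"] = df["TowedDate"].dt.strftime("%a")
df["Hour"] = pd.to_datetime(df["TowedTime"], format="%H:%M:%S").dt.hour

day_order = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Count tows per hour of day (rows) and weekday (columns).
pivot_hours = pd.pivot_table(
    df,
    values="TowedDate",
    index="Hour",
    columns="Week day",
    aggfunc="count",
    fill_value=0,
).reindex(day_order, axis=1, fill_value=0)  # add weekdays with no tows as 0

print(pivot_hours)
```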