Converting string input to DateTime in a numpy.select condition list

Posted 2025-01-11 11:29:42

I am using user input (strings) to find matching conditions in an Excel file via numpy.select().

One of my conditions takes dates entered by the user (mainly years, YYYY) and checks which values in the event_start column of my Excel file (read into a pandas.DataFrame) fall into that time range:

pd.to_datetime(exy[0]) <= pd.to_datetime(f['event_start']) <= pd.to_datetime(exy[1])

In this case, exy is a list of two dates entered by the user. These two dates define the date range to which time information retrieved from the indicated pandas.DataFrame column ought to be compared.

However, this condition does not return any results. I suspect the conversion to datetime fails inside the condition list because f['event_start'] is a Series and not an individual value.

How can I convert individual values while preserving the overall relation to other values in the array?

I have tried list comprehension within the condition list, with len(pers_f) being the number of rows in the original file:

pd.to_datetime(exy[0]) <= pd.to_datetime([f['event_start'].iloc[n] for n in range(0, len(pers_f))]) <= pd.to_datetime(exy[1])

But this does not trigger the correct output either.

What is the recommended procedure for working with dates in condition lists for numpy.select()?
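
A note on the chained comparison shown above: Python evaluates `a <= s <= b` as `(a <= s) and (s <= b)`, and `and` asks a pandas Series for a single truth value, which raises a ValueError. The sketch below uses a small hypothetical DataFrame (the sample dates and the names `start`, `end`, `events` are assumptions, chosen to lie inside the range pd.to_datetime can represent) to show the element-wise alternatives.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet
f = pd.DataFrame({"event_start": ["1801-05-01", "1850-01-01", "1905-07-04"]})
exy = ["1800", "1900"]  # user input: start and end of the range

start, end = pd.to_datetime(exy[0]), pd.to_datetime(exy[1])
events = pd.to_datetime(f["event_start"])

# The chained form needs one truth value for a whole Series and fails
chained_error = None
try:
    start <= events <= end
except ValueError as err:
    chained_error = str(err)

# Element-wise alternatives that produce a boolean Series
mask = (events >= start) & (events <= end)
mask_between = events.between(start, end)  # inclusive on both ends by default

print(chained_error)
print(mask.tolist())
```

Either boolean Series can be placed directly in a numpy.select() condition list, since each condition only needs to be an array of booleans.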


美男兮 2025-01-18 11:29:42

My script had two basic issues:

  1. The dates I was trying to convert to datetime were often earlier than 1700 and thus not suitable for pd.to_datetime in the first place (with the default nanosecond resolution, pandas timestamps start in 1677).
  2. I was not able to perform list comprehension within the numpy.select() condition list.

This is why I have a) resorted to pandas.Period instead and b) created a work-around to handle dates outside the condition list.
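
As a minimal sketch of why pandas.Period helps here: the default nanosecond-resolution Timestamp has a lower bound in the year 1677, while Period objects can represent earlier dates and still support ordered comparison. The three dates below are hypothetical values chosen for illustration.

```python
import pandas as pd

# Lower bound of the default nanosecond-resolution Timestamp
print(pd.Timestamp.min)

# Hypothetical early-modern dates, used only for illustration
d1 = pd.Period("1650-03-15", freq="D")
d2 = pd.Period("1699-12-31", freq="D")
event = pd.Period("1672-08-20", freq="D")

# Periods of the same frequency can be compared directly
print(d1 <= event <= d2)
```

The chained comparison is safe in this case because all three operands are scalars, not Series.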

Dates entered by the user are immediately converted to periods:

try:
    d1 = pd.Period(exy[0], freq="D")  # convert input to a Period with daily frequency
    print(type(d1))

    d2 = pd.Period(exy[1], freq="D")  # convert input to a Period with daily frequency
    print(type(d2))

except IndexError:
    pass  # fewer than two dates were entered

If the user has selected a date range of two dates to find all matching dates in between, the following operation is performed:

print("Searching for date range between", d1, "and", d2, "!")
matches = []  # positional indices of the rows that fall inside the date range
for n in range(0, len(pers_f)):
    try:
        if d1 <= pd.Period(f['event_start'].iloc[n]) <= d2:
            matches.append(n)
    except ValueError as error:
        print(error.args)
nf = f.iloc[matches]  # new data frame (DataFrame.append was removed in pandas 2.0)

# define possible conditions and choices

try:
    condlist = [(nf['pers_name'].str.contains('|'.join(qn)) ^ nf['pers_name'].isin(qn)),
                nf['inst_name'].isin(qi),
                nf['pers_title'].isin(qt),
                nf['pers_function'].isin(qf),
                nf['rel_pers'].str.contains('|'.join(qr)) ^ nf['rel_pers'].isin(qr)]

    choicelist = [nf['pers_name'],
                  nf['inst_name'],
                  nf['pers_title'],
                  nf['pers_function'],
                  nf['rel_pers']]

except Exception:
    print("No match found!")

The results I get are fine, but the script seems unnecessarily convoluted. I will be grateful if someone can still suggest a more elegant solution.
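
One possible simplification, offered as a sketch rather than a definitive answer: build the date-range check once as a boolean Series and use it as an ordinary condition, so no intermediate data frame has to be assembled row by row. The sample data, the helper `to_period`, and the example conditions are all hypothetical; only the column names pers_name and event_start and the list exy come from the script above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet
f = pd.DataFrame({
    "pers_name": ["Anna", "Bernd", "Clara", "Dirk"],
    "event_start": ["1650-03-15", "1699-12-31", "unknown", "1720-06-01"],
})
exy = ["1640-01-01", "1700-01-01"]  # user input: start and end of the range

def to_period(value):
    """Hypothetical helper: daily Period, or None if parsing fails."""
    try:
        return pd.Period(value, freq="D")
    except (ValueError, TypeError):
        return None

d1, d2 = (pd.Period(s, freq="D") for s in exy)

# One boolean per row: True where event_start parses and lies in [d1, d2]
periods = [to_period(v) for v in f["event_start"]]
in_range = pd.Series(
    [isinstance(p, pd.Period) and d1 <= p <= d2 for p in periods],
    index=f.index,
)

# The mask can filter rows directly ...
nf = f[in_range]

# ... or be combined with other conditions inside numpy.select()
condlist = [in_range & f["pers_name"].isin(["Anna"]), in_range]
choicelist = [f["pers_name"], f["event_start"]]
result = np.select(condlist, choicelist, default="")

print(nf)
print(result)
```

The list comprehension still parses each value individually, but it runs outside the condition list, which then contains only boolean arrays of equal length.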
