Comparison between pytz and pandas timezones fails

Posted on 2025-01-14 12:15:01

I want to compare the timezone of a pd.DatetimeIndex with a pytz.timezone to see if the DatetimeIndex has the expected timezone. But the comparison fails, possibly because using the tzinfo argument does not work as explained in this answer.

import pandas as pd
import pytz
import unittest

tzstr = 'Europe/Vienna'
manual_tz = pytz.timezone(tzstr)

timestamp = pd.to_datetime('2020-03-02 07:00:00+01:00').tz_convert(tzstr)

tc = unittest.TestCase()
tc.assertEqual(timestamp.tz, manual_tz)
AssertionError: <DstTzInfo 'Europe/Vienna' CET+1:00:00 STD> != <DstTzInfo 'Europe/Vienna' LMT+1:05:00 STD>

How can I check that the timezone of timestamp is as expected? It should be the same as pytz.timezone in my opinion, but the behavior of pytz and/or pandas makes them somehow different.

Alternative formulation of the question, in case a comparison like this is not possible: what do I have to look out for so that I do not run into this problem again? It has already happened to me more than once. I found a question similar to this one, but having to use multiple dates to check whether the offset is always the same does not seem like the best way to do it.

Comments (1)

酒废 2025-01-21 12:15:02

I think your issue is related to this answer:

The default zone name and offset delivered when pytz creates a timezone object are the earliest ones available for that zone, and sometimes they can seem kind of strange.
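The quoted behavior can be reproduced with pytz alone. The following is a minimal sketch, assuming pytz is installed; the variable names are only illustrative. It shows that the object returned by pytz.timezone and the tzinfo attached by localize are different objects that still carry the same zone name.

```python
import datetime
import pytz

tz = pytz.timezone('Europe/Vienna')

# localize() attaches the tzinfo variant that is valid at this wall-clock time.
localized = tz.localize(datetime.datetime(2020, 3, 2, 7, 0))

# The bare zone object shows the earliest known offset (LMT for Vienna),
# while the localized datetime carries the CET variant.
print(repr(tz))
print(repr(localized.tzinfo))

# Different objects, same zone name.
same_object = tz is localized.tzinfo
same_zone = tz.zone == localized.tzinfo.zone
print(same_object, same_zone)
```

Comparing the .zone attribute instead of the tzinfo objects themselves sidesteps the LMT variant entirely.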

This is what I have when I run your code:

AssertionError: <DstTzInfo 'Europe/Vienna' CET+1:00:00 STD> != <DstTzInfo 'Europe/Vienna' LMT+1:05:00 STD>

What you can do is this:

import pandas as pd
import pytz
import unittest
import datetime

tzstr = 'Europe/Vienna'
manual_tz = pytz.timezone(tzstr)
pytz_localize = manual_tz.localize(datetime.datetime(2020, 3, 2, 7, 0, 0, 0))


timestamp = pd.to_datetime('2020-03-02 07:00:00+01:00').tz_convert(tzstr)

tc = unittest.TestCase()
tc.assertEqual(timestamp.tz, pytz_localize.tzinfo)

But IMHO, it's weird to have to create two dates just to compare the timezone information.
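One way to avoid building a second date might be to compare zone names rather than tzinfo objects. This is only a sketch under that assumption: has_zone is a hypothetical helper, not part of pandas or pytz, and it relies on str() of the timezone object returning the zone name.

```python
import pandas as pd

tzstr = 'Europe/Vienna'
timestamp = pd.to_datetime('2020-03-02 07:00:00+01:00').tz_convert(tzstr)
index = pd.DatetimeIndex([timestamp])


def has_zone(obj, expected_name):
    """Hypothetical helper: compare the zone name, not the tzinfo object."""
    return str(obj.tz) == expected_name


# Works for a single Timestamp and for a DatetimeIndex alike.
print(has_zone(timestamp, tzstr))
print(has_zone(index, tzstr))

# The offset at this particular instant can be checked separately.
print(timestamp.utcoffset())
```

In a unittest this would become tc.assertEqual(str(timestamp.tz), tzstr), which does not depend on which offset variant the tzinfo object represents.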

What do you mean by compare the timezone? What do you want to check?

Edit
About testing timezones and conversions: what I would do is decide which test cases you actually want to check. One thing I would check is that the DST change is handled correctly. For that I would do something like this:

import pandas as pd

# DST in Austria in 2021.
all_dates = ['2021-03-28 01:59:00+0100', '2021-03-28 03:01:00+0200']
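# The strings already carry a +00:00 offset, so parse them as UTC directly;
# calling tz_localize('UTC') on an already tz-aware index raises a TypeError.
utc_dates = pd.to_datetime(['2021-03-28 00:59:00+00:00', '2021-03-28 01:01:00+00:00'], utc=True)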

timezoned_timestamps = pd.to_datetime(all_dates, utc=True).tz_convert("Europe/Vienna")

# then we check if it's equal after UTC conversion
pd.testing.assert_index_equal(utc_dates, timezoned_timestamps.tz_convert('UTC'))

The parameter utc=True in the first to_datetime call is mandatory:

However, timezone-aware inputs with mixed time offsets (for example issued from a timezone with daylight savings, such as Europe/Paris) are not successfully converted to a DatetimeIndex. Instead a simple Index containing datetime.datetime objects is returned

From the documentation.
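As a rough illustration of why utc=True matters here, the sketch below parses two strings with different offsets into a single UTC DatetimeIndex and converts it afterwards. The variable names are illustrative, and what happens without utc=True depends on the pandas version, so that case is not shown.

```python
import pandas as pd

# Two instants on either side of the 2021 DST switch in Austria.
mixed = ['2021-03-28 01:59:00+01:00', '2021-03-28 03:01:00+02:00']

# utc=True normalizes every input to UTC, so the result is a DatetimeIndex.
as_utc = pd.to_datetime(mixed, utc=True)
print(type(as_utc).__name__, as_utc.tz)

# After converting, each element carries the offset valid at that instant.
vienna = as_utc.tz_convert('Europe/Vienna')
offsets = [ts.utcoffset() for ts in vienna]
print(offsets)
```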
