我应该在 PostgreSQL 数据库中选择哪种时间戳类型?
我想定义在多时区项目的上下文中在 Postgres 数据库中存储时间戳的最佳实践。
我可以
- 选择
TIMESTAMP WITHOUT TIME ZONE
并记住在插入该字段时使用的时区 - 选择
TIMESTAMP WITHOUT TIME ZONE
并添加另一个包含时区名称的字段在插入时使用的 - 选择
TIMESTAMP WITH TIME ZONE
并相应地插入时间戳
我对选项3(带时区的时间戳)略有偏好,但希望对此事有一个有根据的意见。
I would like to define a best practice for storing timestamps in my Postgres database in the context of a multi-timezone project.
I can
- choose
TIMESTAMP WITHOUT TIME ZONE
and remember which timezone was used at insertion time for this field - choose
TIMESTAMP WITHOUT TIME ZONE
and add another field which will contain the name of the timezone that was used at insertion time - choose
TIMESTAMP WITH TIME ZONE
and insert the timestamps accordingly
I have a slight preference for option 3 (timestamp with time zone) but would like to have an educated opinion on the matter.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
首先,PostgreSQL 的时间处理和算术非常棒,选项 3 在一般情况下也很好。然而,它是时间和时区的不完整视图,可以补充:
America/Los_Angeles
,而不是-0700
代码>)。-0700
)。UTC
并使用TIMESTAMP WITH TIME ZONE
列进行存储。UTC
转换为America/Los_Angeles
)。时区
设置为UTC
。此选项并不总是有效,因为获取用户的时区可能很困难,因此对冲建议对轻量级应用程序使用
TIMESTAMP WITH TIME ZONE
。也就是说,让我更详细地解释一下选项 4 的一些背景知识。与选项 3 一样,使用
WITH TIME ZONE
的原因是因为某件事发生的时间是一个绝对时间点。WITHOUT TIME ZONE
产生相对时区。永远、永远、永远不要混合使用绝对时间戳和相对时间戳。从编程和一致性的角度来看,确保所有计算均使用 UTC 作为时区进行。这不是 PostgreSQL 的要求,但它在与其他编程语言或环境集成时很有帮助。在列上设置
CHECK
以确保写入时间戳列的时区偏移量为0
是一种防御位置,可以防止几类错误(例如脚本将数据转储到文件,并使用其他方法对时间数据进行词法排序)。同样,PostgreSQL 不需要它来正确执行日期计算或在时区之间进行转换(即 PostgreSQL 非常擅长在任意两个任意时区之间转换时间)。确保进入数据库的数据以零偏移量存储:它不是 100% 完美,但它提供了足够强大的防偷拍措施,确保数据已转换为 UTC。关于如何做到这一点有很多意见,但根据我的经验,这似乎是实践中最好的。
对数据库时区处理的批评在很大程度上是有道理的(有很多数据库在处理这个问题上非常无能),但是 PostgreSQL 对时间戳和时区的处理非常棒(尽管到处都有一些“功能”)。例如,其中一项功能:
请注意,
AT TIME ZONE 'UTC'
会删除时区信息,并使用目标的参考系创建相对的TIMESTAMP NOT TIME ZONE
(>UTC
)。从不完整的
TIMESTAMP WITHOUT TIME ZONE
转换为TIMESTAMP WITH TIME ZONE
时,缺失的时区将从您的连接继承:底线:
America/Los_Angeles
),而不是 UTC 偏移量(例如-0700
),UTC
作为数据库中的时区
随机编程语言注释:Python 的
datetime
数据类型非常擅长维护之间的区别绝对时间与相对时间(尽管一开始令人沮丧,直到您补充它与像 PyTZ 这样的库)。编辑
让我再解释一下相对与绝对之间的区别。
绝对时间用于记录事件。示例:“用户 123 已登录”或“毕业典礼于 2011 年 5 月 28 日下午 2 点(太平洋标准时间)开始”。无论您所在的当地时区如何,如果您可以传送到事件发生的地点,您就可以目睹事件的发生。数据库中的大多数时间数据都是绝对的(因此应该是
TIMESTAMP WITH TIME ZONE
,理想情况下带有+0偏移量和表示管理特定时区的规则的文本标签 - 而不是偏移量)。相对事件是从尚未确定的时区的角度记录或安排某事的时间。例如:“我们的公司早上 8 点开门,晚上 9 点关门”、“我们每周一早上 7 点召开每周早餐会”或“每个万圣节晚上 8 点”。一般来说,相对时间用于事件的模板或工厂,而绝对时间用于几乎所有其他内容。有一个罕见的例外值得指出,它应该说明相对时间的价值。对于距离足够远的未来事件,某些事情可能发生的绝对时间可能存在不确定性,请使用相对时间戳。这是一个现实世界的示例:
假设是 2004 年,您需要安排 2008 年 10 月 31 日下午 1 点在美国西海岸(即
America/Los_Angeles
/PST8PDT< /代码>)。如果您使用
'2008-10-31 21:00:00.000000+00'::TIMESTAMP WITH TIME ZONE
使用绝对时间存储该信息,则递送会在下午 2 点显示,因为美国政府通过了<一href="https://en.wikipedia.org/wiki/Energy_Policy_Act_of_2005#Change_to_daylight_ saving_time">2005 年能源政策法案更改了管理夏令时的规则。在 2004 年计划交付时,日期10-31-2008
应该是太平洋标准时间 (+8000
),但从 2005 年以上时区数据库开始认识到10-31-2008
应该是太平洋夏令时 (+0700
)。存储与时区的相对时间戳将导致正确的交付时间表,因为相对时间戳不受国会不知情的篡改的影响。使用相对时间与绝对时间来安排事情之间的界限是一条模糊线,但我的经验法则是,未来超过 3-6 个月的任何事情的安排都应该使用相对时间戳(已安排 = 绝对与计划 =相对的 ???)。另一种/最后一种相对时间是
INTERVAL
。示例:“会话将在用户登录后 20 分钟超时”。INTERVAL
可以与绝对时间戳 (TIMESTAMP WITH TIME ZONE
) 或相对时间戳 (TIMESTAMP NOT TIME ZONE
) 一起正确使用。同样正确的是,“用户会话在成功登录后 20 分钟到期 (login_utc + session_duration)”或“我们的早餐会议只能持续 60 分钟 (recurring_start_time + meet_length)”。最后一点混乱:
DATE
、TIME
、TIME NOT TIME ZONE
和TIME with TIME ZONE
都是相对数据类型。例如:'2011-05-28'::DATE
表示相对日期,因为您没有可用于识别午夜的时区信息。同样,'23:23:59'::TIME
是相对的,因为您不知道时区或时间表示的DATE
。即使使用'23:59:59-07'::TIME WITH TIME ZONE
,您也不知道DATE
是什么。最后,带有时区的DATE
实际上不是DATE
,它是一个TIMESTAMP WITH TIME ZONE
:将日期和时区放入数据库是个好东西,但是很容易得到稍微不正确的结果。只需最少的额外努力即可正确、完整地存储时间信息,但这并不意味着总是需要额外的努力。
First off, PostgreSQL’s time handling and arithmetic is fantastic and Option 3 is fine in the general case. It is, however, an incomplete view of time and timezones and can be supplemented:
America/Los_Angeles
, not-0700
).-0700
).UTC
and stored using aTIMESTAMP WITH TIME ZONE
column.UTC
toAmerica/Los_Angeles
).timezone
toUTC
.This option doesn’t always work because it can be hard to get a user’s time zone and hence the hedge advice to use
TIMESTAMP WITH TIME ZONE
for lightweight applications. That said, let me explain some background aspects of this this Option 4 in more detail.Like Option 3, the reason for the
WITH TIME ZONE
is because the time at which something happened is an absolute moment in time.WITHOUT TIME ZONE
yields a relative time zone. Don't ever, ever, ever mix absolute and relative TIMESTAMPs.From a programmatic and consistency perspective, ensure all calculations are made using UTC as the time zone. This isn’t a PostgreSQL requirement, but it helps when integrating with other programming languages or environments. Setting a
CHECK
on the column to make sure the write to the time stamp column has a time zone offset of0
is a defensive position that prevents a few classes of bugs (e.g. a script dumps data to a file and something else sorts the time data using a lexical sort). Again, PostgreSQL doesn’t need this to do date calculations correctly or to convert between time zones (i.e. PostgreSQL is very adept at converting times between any two arbitrary time zones). To ensure data going in to the database is stored with an offset of zero:It's not 100% perfect, but it provides a strong enough anti-footshooting measure that makes sure the data is already converted to UTC. There are lots of opinions on how to do this, but this seems to be the best in practice from my experience.
Criticisms of database time zone handling is largely justified (there are plenty of databases that handle this with great incompetence), however PostgreSQL’s handling of timestamps and timezones is pretty awesome (despite a few "features" here and there). For example, one such feature:
Note that
AT TIME ZONE 'UTC'
strips time zone info and creates a relativeTIMESTAMP WITHOUT TIME ZONE
using your target’s frame of reference (UTC
).When converting from an incomplete
TIMESTAMP WITHOUT TIME ZONE
to aTIMESTAMP WITH TIME ZONE
, the missing time zone is inherited from your connection:The bottom line:
America/Los_Angeles
) and not an offset from UTC (e.g.-0700
)UTC
as thetimezone
in the database if possibleRandom programming language note: Python's
datetime
data type is very good at maintaining the distinction between absolute vs relative times (albeit frustrating at first until you supplement it with a library like PyTZ).EDIT
Let me explain the difference between relative vs absolute a bit more.
Absolute time is used to record an event. Examples: "User 123 logged in" or "a graduation ceremonies start at 2011-05-28 2pm PST." Regardless of your local time zone, if you could teleport to where the event occurred, you could witness the event happening. Most time data in a database is absolute (and therefore should be
TIMESTAMP WITH TIME ZONE
, ideally with a +0 offset and a textual label representing the rules governing the particular timezone - not an offset).A relative event would be to record or schedule the time of something from the perspective of a yet-to-be-determined time zone. Examples: "our business's doors open at 8am and close at 9pm", "let's meet every Monday at 7am for a weekly breakfast meeting," or "every Halloween at 8pm." In general, relative time is used in a template or factory for events, and absolute time is used for almost everything else. There is one rare exception that’s worth pointing out which should illustrate the value of relative times. For future events that are far enough in the future where there could be uncertainty about the absolute time at which something could occur, use a relative timestamp. Here’s a real world example:
Suppose it’s the year 2004 and you need to schedule a delivery on October 31st in 2008 at 1pm on the West Coast of the US (i.e.
America/Los_Angeles
/PST8PDT
). If you stored that using absolute time using’2008-10-31 21:00:00.000000+00’::TIMESTAMP WITH TIME ZONE
, the delivery would have shown up at 2pm because the US Government passed the Energy Policy Act of 2005 that changed the rules governing daylight savings time. In 2004 when the delivery was scheduled, the date10-31-2008
would have been Pacific Standard Time (+8000
), but starting in year 2005+ timezone databases recognized that10-31-2008
would have been Pacific Daylight Savings time (+0700
). Storing a relative timestamp with the time zone would have resulted in a correct delivery schedule because a relative timestamp is immune to Congress’ ill-informed tampering. Where the cutoff between using relative vs absolute times for scheduling things is, is a fuzzy line, but my rule of thumb is that scheduling for anything in the future further than 3-6mo should make use of relative timestamps (scheduled = absolute vs planned = relative ???).The other/last type of relative time is the
INTERVAL
. Example: "the session will time out 20 minutes after a user logs in". AnINTERVAL
can be used correctly with either absolute timestamps (TIMESTAMP WITH TIME ZONE
) or relative timestamps (TIMESTAMP WITHOUT TIME ZONE
). It is equally correct to say, "a user session expires 20min after a successful login (login_utc + session_duration)" or "our morning breakfast meeting can only last 60 minutes (recurring_start_time + meeting_length)".Last bits of confusion:
DATE
,TIME
,TIME WITHOUT TIME ZONE
andTIME WITH TIME ZONE
are all relative data types. For example:'2011-05-28'::DATE
represents a relative date since you have no time zone information which could be used to identify midnight. Similarly,'23:23:59'::TIME
is relative because you don't know either the time zone or theDATE
represented by the time. Even with'23:59:59-07'::TIME WITH TIME ZONE
, you don't know what theDATE
would be. And lastly,DATE
with a time zone is not in fact aDATE
, it is aTIMESTAMP WITH TIME ZONE
:Putting dates and time zones in databases is a good thing, but it is easy to get subtly incorrect results. Minimal additional effort is required to store time information correctly and completely, however that doesn’t mean the extra effort is always required.
肖恩的回答过于复杂且具有误导性。
事实上,“WITH TIME ZONE”和“WITHOUT TIME ZONE”都将值存储为类似 UNIX 的绝对 UTC 时间戳。区别在于时间戳的显示方式。当“WITH time zone”时,显示的值是转换为用户所在区域的 UTC 存储值。当“WITHOUT time zone”时,UTC 存储的值会被扭曲,以便无论用户设置哪个区域都显示相同的钟面。
“WITHOUT time zone”唯一可用的情况是当钟面值适用时例如,当时间戳表明投票站可能关闭时(即,无论某人所在的时区如何,投票站都会在 20:00 关闭)。
除非有非常具体的规定,否则始终使用“带时区”。原因不。
Sean's answer is overly complex and misleading.
The fact is that both "WITH TIME ZONE" and "WITHOUT TIME ZONE" store the value as a unix-like absolute UTC timestamp. The difference is all in how the timestamp is displayed. When "WITH time zone" then the displayed value is the UTC stored value translated to the user's zone. When "WITHOUT time zone" the UTC stored value is twisted so as to show the same clock face no matter what zone the user has set".
The only situation where a "WITHOUT time zone" is usable is when a clock face value is applicable regardless of actual zone. For example, when a timestamp indicates when voting booths might close (ie. they close at 20:00 regardless of a person's timezone).
Use choice 3. Always use "WITH time zone" unless there is a very specific reason not to.
我更喜欢选项 3,因为 Postgres 可以为您完成大量重新计算相对于时区的时间戳的工作,而对于其他两个,您必须自己完成。存储带有时区的时间戳的额外存储开销实际上可以忽略不计,除非您正在谈论数百万条记录,在这种情况下您可能已经有相当大的存储需求。
My preference is towards option 3, as Postgres can then do al ot of the work recalculating timestamps relative to timezone for you, whereas with the other two you'll have to do that yourself. The extra storage overhead of storing the timestamp with a timezone is really negligible unless you're talking millions of records, in which case you probably already have pretty meaty storage requirements anyway.