Preventing accidental timezone conversion

Posted 2024-12-08 12:04:56

In R, I have a bunch of datetime values that I measure in GMT. I keep running into accidents where some function or another loses the timezone on my values, or even loses the class name. Even on functions so basic as c() and unlist():

> dput(x)
structure(1317830532, class = c("POSIXct", "POSIXt"), tzone = "GMT")
> dput(c(x))
structure(1317830532, class = c("POSIXct", "POSIXt"))
> dput(list(x))
list(structure(1317830532, class = c("POSIXct", "POSIXt"), tzone = "GMT"))
> dput(unlist(list(x)))
1317830532

I feel like I'm a hair's breadth away from having a real Mars Climate Orbiter moment if this happens when I least expect it. Anyone have any strategies for making sure their dates "stay put"?

Comments (3)

女皇必胜 2024-12-15 12:04:56

This behaviour is documented in ?c, ?DateTimeClasses and ?unlist:

From ?DateTimeClasses:

Using c on "POSIXlt" objects converts them to the current time zone, and on "POSIXct" objects drops any "tzone" attributes (even if they are all marked with the same time zone).

From ?c:

c is sometimes used for its side effect of removing attributes except names.


That said, my testing indicates that the integrity of your data remains intact, despite using c or unlist. For example:

x <- structure(1317830532, class = c("POSIXct", "POSIXt"), 
                 tzone = "GMT")
y <- structure(1317830532+3600, class = c("POSIXct", "POSIXt"), 
                 tzone = "PST8PDT")
x
[1] "2011-10-05 16:02:12 GMT"

y
[1] "2011-10-05 10:02:12 PDT"

strftime(c(x, y), format="%Y/%m/%d %H:%M:%S", tz="GMT")
[1] "2011/10/05 16:02:12" "2011/10/05 17:02:12"

strftime(c(x, y), format="%Y/%m/%d %H:%M:%S", tz="PST8PDT")
[1] "2011/10/05 09:02:12" "2011/10/05 10:02:12"

strftime(unlist(y), format="%Y/%m/%d %H:%M:%S", tz="PST8PDT")
[1] "2011/10/05 10:02:12"

Your Mars Rover should be OK if you use R to keep track of dates.
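
Because only the attributes are lost, not the underlying seconds-since-epoch value, the time zone label can be put back afterwards. What follows is a minimal sketch, assuming the values were meant to be GMT all along; the variable names z, n and w are only illustrative:

```r
x <- structure(1317830532, class = c("POSIXct", "POSIXt"), tzone = "GMT")

# c() may drop the "tzone" attribute (this depends on the R version).
# Re-attaching it changes only how the value is printed, not the instant.
z <- c(x)
attr(z, "tzone") <- "GMT"

# unlist() on a list returns a bare number, so rebuild the POSIXct
# from the seconds since 1970-01-01 UTC.
n <- unlist(list(x))
w <- as.POSIXct(n, origin = "1970-01-01", tz = "GMT")

format(z, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
format(w, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
```

Both format() calls should show the same instant as the original x, labelled GMT.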

流星番茄 2024-12-15 12:04:56

Why not set your timezone to GMT for your R sessions, then? If something gets converted to the "current" timezone, it is still right.
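
One way to try this is to set the TZ environment variable from inside the session. This is only a sketch: whether a mid-session change is honoured can depend on the platform, and putting TZ=GMT in an .Renviron file is an alternative I am assuming would suit a permanent setup.

```r
# Make GMT the session's "current" time zone.
Sys.setenv(TZ = "GMT")

x <- structure(1317830532, class = c("POSIXct", "POSIXt"), tzone = "GMT")

# Whether or not c() keeps the "tzone" attribute, the value is now
# formatted in GMT, because that is also the session default.
format(c(x), "%Y-%m-%d %H:%M:%S")
```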

锦爱 2024-12-15 12:04:56

Given that this is documented behavior, you should either avoid such functions or code defensively around them, and you need mechanisms to support either approach. For things like this, I would recommend writing a "poor man's lint"; with such a lint detector, you can go about restoring sanity. In addition to lint detection, there are several approaches to avoiding Mars Climate Orbiter crashes; some are independent of each other, others dependent:

  1. Set a policy & build alternatives: First, for all of the functions that you know are causing you problems, either decide that you won't use them, or write a new wrapper function that behaves as intended and sets the timezone parameter you want. Then, ensure that you use that wrapper rather than the underlying function.
  2. Static analysis: Write a search function using your favorite editor (e.g. as a macro), using a shell script and the GNU find and grep utilities, or in some other manner (e.g. grep in R), to find the particular functions that are causing you problems. When one is found, either remove it or use a defensive coding method (e.g. the wrapper in #1).
  3. Testing: Using unit tests, e.g. RUnit or testthat, develop tests that ensure that timezone properties are maintained when using your functions or package. Every time there's a new bug, create a new test to ensure that bug doesn't appear again in released versions.
  4. Weak type checking: You can also include checks throughout your code that test whether a timezone is specified. It's best to have your own function for this check, rather than a block of code that is reproduced throughout. In this way, you can eventually extend the checking to cover other things, such as persistence of the timezone and whether operations on two or more objects are mindful of differences in timezones (maybe they allow it, maybe they don't).
  5. Map everything to one TZ: Also known as Indiana-be-damned. Retaining a variety of policies about timezones is hard work, and is essentially friction in working with temporal data. Just map to one TZ (UTC) and then let anything local work from that. If you happen to have local regularity that is invariant of DST, then address that after converting back from UTC.
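
Items 1 and 4 could be sketched roughly as follows. The names assert_tz and c_tz are hypothetical helpers invented for this illustration, not functions from base R or any package, and the default of "GMT" is an assumption taken from the question.

```r
# Hypothetical check (item 4): stop unless x is POSIXct and labelled with tz.
assert_tz <- function(x, tz = "GMT") {
  if (!inherits(x, "POSIXct")) {
    stop("expected a POSIXct object, got class: ",
         paste(class(x), collapse = "/"))
  }
  actual <- attr(x, "tzone")
  if (is.null(actual) || !identical(actual[1], tz)) {
    stop("expected time zone ", tz, ", got: ",
         if (is.null(actual)) "<none>" else actual[1])
  }
  invisible(x)
}

# Hypothetical wrapper (item 1): combine with c(), then re-apply the label
# and verify the result before handing it back.
c_tz <- function(..., tz = "GMT") {
  out <- c(...)
  attr(out, "tzone") <- tz
  assert_tz(out, tz)
}

x <- structure(1317830532, class = c("POSIXct", "POSIXt"), tzone = "GMT")
y <- x + 3600
z <- c_tz(x, y)
attr(z, "tzone")
```

For item 3, the same property could be pinned down in a unit test, for example with testthat's expect_identical(attr(z, "tzone"), "GMT").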

I do all of #s 1-4 for other issues, but, just as they're easily adapted to timezone checking, they're fairly reusable for lots of Mars Orbiter-avoiding objectives. I do this kind of thing precisely to avoid coding the next such Mars Orbiter. (That was an expensive lesson for all of us that work with numerical data. :))
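
A "poor man's lint" along the lines of item 2 might look like the sketch below. The function name lint_tz and the list of risky functions are assumptions made for illustration; the regular expression is deliberately crude and will flag every call to c() or unlist(), not only those that touch date-times.

```r
# Hypothetical lint: report lines in R source files that call functions
# known to drop time zone attributes.
lint_tz <- function(paths, risky = c("c", "unlist")) {
  # Match a risky function name at a word boundary, followed by "(".
  pattern <- paste0("\\b(", paste(risky, collapse = "|"), ")\\s*\\(")
  hits <- lapply(paths, function(p) {
    src <- readLines(p, warn = FALSE)
    idx <- grep(pattern, src, perl = TRUE)
    data.frame(file = rep(p, length(idx)), line = idx,
               code = trimws(src[idx]), stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)
}

# Try it on a throwaway file.
tmp <- tempfile(fileext = ".R")
writeLines(c("x <- Sys.time()", "y <- c(x)", "z <- unlist(list(x))"), tmp)
res <- lint_tz(tmp)
res
```

Each reported line can then be reviewed by hand and either removed or switched to a defensive wrapper.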
