PySpark to_timestamp(): handling the milliseconds format SSS

Published 2025-02-09 16:04:57


My data contains inconsistent timestamp strings, and I am using the function below:

to_timestamp("col","yyyy-MM-dd'T'hh:mm:ss.SSS'Z'") 

Data:

time                      | OUTPUT                         | IDEAL
2022-06-16T07:01:25.346Z  | 2022-06-16T07:01:25.346+0000   | 2022-06-16T07:01:25.346+0000
2022-06-16T06:54:21.51Z   | 2022-06-16T06:54:21.051+0000   | 2022-06-16T06:54:21.510+0000
2022-06-16T06:54:21.5Z    | 2022-06-16T06:54:21.005+0000   | 2022-06-16T06:54:21.500+0000

So the millisecond part in my data comes as S, SS, or SSS. What is the correct way to normalise it to SSS? Here, .51 means 510 milliseconds, not 051.

Spark version: 3.2.1
Code:

import pyspark.sql.functions as F
test = spark.createDataFrame([(1,'2022-06-16T07:01:25.346Z'),(2,'2022-06-16T06:54:21.51Z'),(3,'2022-06-16T06:54:21.5Z')],['no','timing1'])
timeFmt = "yyyy-MM-dd'T'hh:mm:ss.SSS'Z'"
test = test.withColumn("timing2", F.to_timestamp(F.col('timing1'), format=timeFmt))
test.select("timing1","timing2").show(truncate=False)

Output:

(screenshot omitted; the parsed values match the OUTPUT column above)
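The `.051` results come from reading the digits after the dot as a plain millisecond count. In the ISO-8601 reading, those digits are a fraction of a second and are padded on the right, so `.51` is 51/100 s = 510 ms. Python's `%f` directive follows the same right-padding rule; a quick stdlib illustration of the semantics (pure Python, not Spark):

```python
from datetime import datetime

# %f accepts 1-6 fractional digits and zero-pads on the RIGHT,
# so ".51" means 51/100 of a second = 510 ms, not 51 ms.
ts = datetime.strptime("2022-06-16T06:54:21.51", "%Y-%m-%dT%H:%M:%S.%f")
print(ts.microsecond)  # 510000
```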


Comments (2)

╰つ倒转 2025-02-16 16:04:57


I also use v3.2.1, and it works for me if you simply don't pass a timestamp format; the strings are already in a form Spark parses correctly by default:

from pyspark.sql import functions as F

test = spark.createDataFrame([(1,'2022-06-16T07:01:25.346Z'),(2,'2022-06-16T06:54:21.51Z'),(3,'2022-06-16T06:54:21.5Z')],['no','timing1'])

new_df = test.withColumn('timing1_ts', F.to_timestamp('timing1'))

new_df.show(truncate=False)

new_df.dtypes

+---+------------------------+-----------------------+
|no |timing1                 |timing1_ts             |
+---+------------------------+-----------------------+
|1  |2022-06-16T07:01:25.346Z|2022-06-16 07:01:25.346|
|2  |2022-06-16T06:54:21.51Z |2022-06-16 06:54:21.51 |
|3  |2022-06-16T06:54:21.5Z  |2022-06-16 06:54:21.5  |
+---+------------------------+-----------------------+

Out[9]: [('no', 'bigint'), ('timing1', 'string'), ('timing1_ts', 'timestamp')]
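If the goal is a string normalised to exactly three fractional digits, one option is to re-render the parsed timestamp with a fixed-width pattern; in PySpark that would look like `F.date_format('timing1_ts', "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")`, assuming the `timing1_ts` column from the snippet above. A pure-Python sketch of the same round-trip, for illustration:

```python
from datetime import datetime

def normalize_to_sss(raw: str) -> str:
    """Re-emit an ISO-8601 UTC string with exactly three fractional digits."""
    # %f right-pads 1-6 fractional digits, so ".5" parses as 500000 microseconds
    ts = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%fZ")
    return ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z"

print(normalize_to_sss("2022-06-16T06:54:21.51Z"))  # 2022-06-16T06:54:21.510Z
```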
朱染 2025-02-16 16:04:57


I had this setting enabled:

 spark.sql("set spark.sql.legacy.timeParserPolicy=LEGACY")

After resetting it, parsing works as expected.
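For completeness: the LEGACY policy makes Spark 3 fall back to the old SimpleDateFormat parser, which reads the fractional digits as a plain millisecond count (.51 becomes 051 ms). Reverting to the default Spark 3 behaviour can be done the same way (config fragment, assuming an active `spark` session):

```python
# revert to the Spark 3 (java.time.DateTimeFormatter) parser,
# which treats ".51" as a fraction of a second, i.e. 510 ms
spark.sql("set spark.sql.legacy.timeParserPolicy=CORRECTED")
```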
