SparkSQL: convert an integer to MM-dd-yyyy
The following code outputs a table:
df1 = spark.read.format("csv").option("header", "true").load("dbfs:/FileStore/shared_uploads/*********/*********")
df1.registerTempTable("df1")
display(df1)
The table has a 'Date' column in the form of a big integer:
- 20220716
- 20220717
- etc.
Next, I want to use PySpark to output a column in the form MM-dd-yyyy, but DATE_FINAL comes back as null.
from pyspark.sql.functions import *
oe_seq = sqlContext.sql("""
select to_date(cast(Date as string), 'MM-dd-yyyy') as DATE_FINAL
from df1
""")
oe_seq.registerTempTable("oe_seq")
display(oe_seq)
How can I get the column into the form 'MM-dd-yyyy' in PySpark?
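You tried to format the value with to_date, but to_date only parses a string into a date; its pattern describes the input, not the output. Because '20220716' does not match 'MM-dd-yyyy', the parse fails and the result is null.
To get the desired form, parse with the pattern the data is actually in (yyyyMMdd) and then format the result with date_format.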
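A minimal sketch of that approach, kept in Spark SQL to match the question. It assumes the temp view df1 and the Date column from the question, plus the spark session that the notebook already provides; the names DATE_FINAL_QUERY and build_oe_seq are hypothetical and introduced only for illustration.

```python
# Spark SQL text: parse the integer as yyyyMMdd first, then render it as MM-dd-yyyy.
# `df1` and `Date` come from the question; everything else here is illustrative.
DATE_FINAL_QUERY = """
select date_format(to_date(cast(Date as string), 'yyyyMMdd'), 'MM-dd-yyyy') as DATE_FINAL
from df1
"""


def build_oe_seq(spark):
    """Run the query against the temp view `df1` registered earlier.

    `spark` is assumed to be an existing SparkSession (for example the one
    a Databricks notebook exposes), so nothing is created here.
    """
    return spark.sql(DATE_FINAL_QUERY)
```

In the notebook this would be called as oe_seq = build_oe_seq(spark), after which a value such as 20220716 should appear as 07-16-2022. Note that date_format returns a string column, not a date, so keep the to_date result as well if later steps need real date arithmetic.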