How to convert unix epoch time to a date string in Hive
I have a log file which contains a timestamp column. The timestamp is in unix epoch time format.
I want to create partitions based on the timestamp, with year, month, and day as the partition columns.
So far I have done this, but it is throwing an error:
PARSE ERROR cannot recognize input '(' in column type
Here is my code:
from (
from raw_data
MAP ${PREFIX}raw_data.line
USING 's3://scripts/clean.py'
AS (timestamp STRING, name STRING)
) map_out
INSERT OVERWRITE TABLE date_base_data_temp PARTITION(year(timestamp), month(timestamp)), day(timestamp)))
select map_out.name;
Oof, that looks ugly. Try using the from_unixtime function in Hive. If the timestamp is in ms instead of seconds, divide it by 1000 first (for example with DIV 1000).
That converts a unix timestamp into a yyyy-MM-dd HH:mm:ss string, and you can then use the year, month, and day functions on the result to get the year, month, and day.
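A minimal sketch of the conversion described in this answer. It assumes the epoch value arrives as a STRING (as in the question's MAP ... AS clause) and uses the hypothetical names `raw_events`, `epoch_ts`, and `name`. It also assumes `date_base_data_temp` is partitioned by columns called `year`, `month`, and `day`. The last statement uses a dynamic partition insert because Hive's PARTITION clause takes column names rather than expressions such as year(timestamp), which is consistent with the parse error in the question.

```sql
-- Hypothetical source: raw_events(epoch_ts STRING, name STRING), epoch_ts in seconds.
-- from_unixtime expects a BIGINT, so the string is cast first.
SELECT from_unixtime(CAST(epoch_ts AS BIGINT)) AS new_timestamp
FROM raw_events;

-- If epoch_ts holds milliseconds, integer-divide by 1000 before converting.
SELECT from_unixtime(CAST(epoch_ts AS BIGINT) DIV 1000) AS new_timestamp
FROM raw_events;

-- Assumed setup: allow all partition values to be computed by the query.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Partition columns are listed by name only; their values come from the
-- trailing columns of the SELECT, in the same order.
INSERT OVERWRITE TABLE date_base_data_temp PARTITION (year, month, day)
SELECT t.name,
       year(t.ts),
       month(t.ts),
       day(t.ts)
FROM (
  SELECT name,
         from_unixtime(CAST(epoch_ts AS BIGINT)) AS ts
  FROM raw_events
) t;
```

Note that from_unixtime renders the time in the session's time zone, so rows near midnight can land in a different day partition than a UTC calculation would give.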
More recent releases of Hive and Spark SQL provide a DATE data type and type casting, so the string produced from the epoch value can be cast to a date. This should work in Hive as well as in Spark SQL.
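A short sketch of the cast approach, assuming a hypothetical table `my_hive_table` with a BIGINT column `epoch_seconds`. Formatting the value as yyyy-MM-dd before the cast is a choice made here so that the string already has the form a cast to DATE expects.

```sql
-- Hypothetical table and column names; epoch_seconds is a BIGINT in seconds.
-- from_unixtime with an explicit pattern yields a 'yyyy-MM-dd' string,
-- which is then cast to the DATE type.
SELECT CAST(from_unixtime(epoch_seconds, 'yyyy-MM-dd') AS DATE) AS event_date
FROM my_hive_table;
```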
If you need to convert the date to a custom format, use date_format with the pattern you want.
For example, the pattern 'yyyyMM' returns the date as year and month, e.g. 201708.
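As an illustration of the custom format, the following sketch assumes the same hypothetical `my_hive_table` and `epoch_seconds` names. date_format is available in Hive 1.2.0 and later; on older releases, passing the pattern directly to from_unixtime is an alternative.

```sql
-- Year and month as a compact string, e.g. 201708.
SELECT date_format(from_unixtime(epoch_seconds), 'yyyyMM') AS yyyymm
FROM my_hive_table;

-- Same result without date_format: from_unixtime accepts a pattern itself.
SELECT from_unixtime(epoch_seconds, 'yyyyMM') AS yyyymm
FROM my_hive_table;
```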
Adding to the list: when the timestamp needs to be converted to a yyyy-MM-dd date string for a string partition, apply date_format to the from_unixtime result with the pattern 'yyyy-MM-dd'.
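A hedged sketch of such a query, where `my_hive_table`, `epoch_seconds`, and `epoch_millis` are hypothetical names. The second statement shows one way to drop millisecond precision: the division by 1000 produces a DOUBLE, so it is cast back to BIGINT before conversion.

```sql
-- Epoch seconds to a 'yyyy-MM-dd' string suitable for a string partition.
SELECT date_format(from_unixtime(epoch_seconds), 'yyyy-MM-dd') AS dt
FROM my_hive_table
LIMIT 20;

-- Epoch milliseconds: strip the millisecond part first.
SELECT date_format(from_unixtime(CAST(epoch_millis / 1000 AS BIGINT)), 'yyyy-MM-dd') AS dt
FROM my_hive_table
LIMIT 20;
```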
Or, if you see an error with that, try using DIV for the division:
select order_id, date_format(from_unixtime(order_date DIV 1000), 'yyyy-MM-dd') as order_date, order_customer_id, order_status
from orders