如何将unix纪元时间转换为hive中的日期字符串

发布于 2024-12-01 22:12:11 字数 537 浏览 3 评论 0原文

我有一个包含时间戳列的日志文件。时间戳采用 unix 纪元时间格式。

我想根据时间戳创建一个分区,其中包含年、月和日分区。

到目前为止,我已经做到了这一点,但它抛出了一个错误。

PARSE ERROR cannot recognize input '(' in column type

这是我的代码。

from (
      from raw_data
            MAP  ${PREFIX}raw_data.line
            USING 's3://scripts/clean.py'
            AS (timestamp STRING, name STRING)
      ) map_out
INSERT OVERWRITE TABLE date_base_data_temp PARTITION(year(timestamp), month(timestamp)), day(timestamp))) 
    select map_out.name;

I have a log file which contains timestamp column. The timestamp is in unix epoch time format.

I want to create a partition based on a timestamp with partitions year, month and day.

So far I have done this but it is throwing an error.

PARSE ERROR cannot recognize input '(' in column type

Here is my code.

from (
      from raw_data
            MAP  ${PREFIX}raw_data.line
            USING 's3://scripts/clean.py'
            AS (timestamp STRING, name STRING)
      ) map_out
INSERT OVERWRITE TABLE date_base_data_temp PARTITION(year(timestamp), month(timestamp)), day(timestamp))) 
    select map_out.name;

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

饭团 2024-12-08 22:12:11

哦,看起来很丑。尝试在 Hive 中使用此函数:

SELECT from_unixtime(unix_timestamp) as new_timestamp from raw_data ...

或者如果时间戳以毫秒而不是秒为单位:

SELECT from_unixtime(unix_timestamp DIV 1000) as new_timestamp from raw_data ...

将 unix 时间戳转换为 YYYY-MM-DD HH:MM:SS 格式,然后您可以使用以下函数获取年、月、日:

SELECT year(new_timestamp) as year, month(new_timestamp) as month, day(new_timestamp) as day ...

Oof, that looks ugly. Try using this function in Hive:

SELECT from_unixtime(unix_timestamp) as new_timestamp from raw_data ...

Or if timestamp is in ms instead of seconds:

SELECT from_unixtime(unix_timestamp DIV 1000) as new_timestamp from raw_data ...

That converts a unix timestamp into a YYYY-MM-DD HH:MM:SS format, then you can use the following functions to get the year, month, and day:

SELECT year(new_timestamp) as year, month(new_timestamp) as month, day(new_timestamp) as day ...
π浅易 2024-12-08 22:12:11

随着 Hive 和 SparkSQL 的最新版本,日期数据类型和类型转换选项可用。以下应该适用于 Hive 以及 Spark SQL

SELECT cast(from_unixtime(epoch_datetime) as date) from myHiveTable

With more recent releases of Hive and SparkSQL, data type of date and type casting options are available. Following should work in Hive as well as Spark SQL

SELECT cast(from_unixtime(epoch_datetime) as date) from myHiveTable
少女净妖师 2024-12-08 22:12:11

如果您需要将日期转换为自定义格式,请使用:

select date_format(from_unixtime(epoch_datetime),'yyyyMM') as formatted_date from myHiveTable;

这将返回日期作为yearMonth,例如201708

If you need to convert the date in custom format, use this:

select date_format(from_unixtime(epoch_datetime),'yyyyMM') as formatted_date from myHiveTable;

which will return the date as yearMonth e.g. 201708

最美不过初阳 2024-12-08 22:12:11

将此查询添加到列表中,其中时间戳需要转换为字符串分区的日期字符串 yyyy-MM-dd:

hive> select date_format(from_unixtime(epoch_datetime), 'yyyy-MM-dd') as day from table_name limit 20;

-- If required, remove the millis precision for timestamps
hive> select date_format(from_unixtime(cast(epoch_datetime/1000 as bigint)), 'yyyy-MM-dd') as day from table_name limit 20;

Adding this query to the list where the timestamp needs to be converted to date string yyyy-MM-dd for a string partition:

hive> select date_format(from_unixtime(epoch_datetime), 'yyyy-MM-dd') as day from table_name limit 20;

-- If required, remove the millis precision for timestamps
hive> select date_format(from_unixtime(cast(epoch_datetime/1000 as bigint)), 'yyyy-MM-dd') as day from table_name limit 20;
深府石板幽径 2024-12-08 22:12:11
select order_id, date_format(from_unixtime(order_date/1000),'yyy-MM-dd') as order_date ,order_customer_id,order_status
from orders

或者如果您看到任何错误,请尝试使用
选择 order_id, date_format(from_unixtime(order_date DIV 1000),'yyy-MM-dd') 作为 order_date ,order_customer_id,order_status
来自订单

select order_id, date_format(from_unixtime(order_date/1000),'yyy-MM-dd') as order_date ,order_customer_id,order_status
from orders

or if you see any error on the same , try to use
select order_id, date_format(from_unixtime(order_date DIV 1000),'yyy-MM-dd') as order_date ,order_customer_id,order_status
from orders

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文