Hive: writing column headers to a local file?
Hive documentation lacking again:
I'd like to write the results of a query, along with the column names, to a local file.
Does Hive support this?
Insert overwrite local directory 'tmp/blah.blah' select * from table_name;
Also, a separate question: is StackOverflow the best place to get Hive help? @Nija has been very helpful, but I don't want to keep bothering them...
Comments (7)
Here's my take on it. Note: I'm not very well versed in bash, so suggestions for improvement are welcome :)
Try
Yes you can. Put
set hive.cli.print.header=true;
in a .hiverc file in your home directory, or in any of the other Hive user properties files.
Vague warning: be careful, since this has crashed queries of mine in the past (but I can't remember the reason).
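For a one-off export, the same setting can also be passed on the command line and the CLI output redirected to a file. The following is a minimal sketch, assuming the hive CLI is on the PATH; the function name export_with_header and the HIVE_BIN override are hypothetical helpers for illustration, not part of Hive.

```shell
#!/usr/bin/env bash
# Sketch: run a query through the Hive CLI with column headers enabled
# and redirect the CLI's output to a local file.
# HIVE_BIN is a hypothetical override so a different binary can be swapped in.
export_with_header() {
  local query="$1"   # e.g. "select * from table_name"
  local out="$2"     # local file to write
  # -S (silent) suppresses progress messages; -e runs the quoted statements.
  "${HIVE_BIN:-hive}" -S -e "set hive.cli.print.header=true; ${query};" > "${out}"
}

# Usage (commented out so that sourcing this file does not start Hive):
# export_with_header "select * from table_name" /tmp/table_name.tsv
```

Note that this captures what the CLI prints rather than using insert overwrite local directory, so the header line ends up in the same file as the rows.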
Indeed, @nija's answer is correct, at least as far as I know. There isn't any way to write the column names when doing an insert overwrite [local] directory ... (whether you use local or not).
With regard to the crashes described by @user1735861, there is a known bug in Hive 0.7.1 (fixed in 0.8.0) that, after doing set hive.cli.print.header=true;, causes a NullPointerException for any HQL command/query that produces no output. For example:
Whereas this is fine:
Non-HQL commands are fine though (set, dfs, !, etc.).
More info here: https://issues.apache.org/jira/browse/HIVE-2334
Hive does support writing to a local directory. Your syntax looks right for it as well.
Check out the docs on SELECTS and FILTERS for additional information.
I don't think Hive has a way to write the names of the columns to a file for the query you're running... I can't say for sure that it doesn't, but I do not know of a way.
I think the only place better than SO for Hive questions would be the mailing list.
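Since the export itself carries no column names, one workaround is to add them afterwards in the shell. The following is a minimal sketch, assuming that the export directory holds only Hive's data files, that fields use Hive's default Ctrl-A (\001) delimiter, and that you supply the column names yourself; the function name add_header and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: merge the files written by INSERT OVERWRITE LOCAL DIRECTORY into one
# tab-separated file with a header line on top.
add_header() {
  local export_dir="$1"   # directory Hive wrote to, e.g. tmp/blah.blah
  local header="$2"       # tab-separated column names, supplied by the caller
  local out="$3"          # merged output file; must live outside export_dir
  {
    printf '%s\n' "${header}"
    # Convert the assumed \001 field delimiter to tabs.
    # The * glob skips hidden files such as checksum files.
    cat "${export_dir}"/* | tr '\001' '\t'
  } > "${out}"
}

# Usage (column names id and name are placeholders):
# add_header tmp/blah.blah "$(printf 'id\tname')" /tmp/table_name.tsv
```

The header has to be kept in sync with the query's select list by hand, which is the main weakness of this approach.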
I ran into this problem today and was able to get what I needed by doing a UNION ALL between the original query and a new dummy query that creates the header row. I added a sort column to each part, set to 0 for the header and 1 for the data, so I could sort by that field and ensure the header row came out on top.
It's a little bulky, but at least you can get what you need with a single query.
Hope this helps!
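The approach described above might look roughly like the following. This is a sketch, not the poster's actual query: the columns id and name, the alias sort_col, and the function name header_query are all hypothetical, and the exact syntax Hive accepts (for instance whether UNION ALL must sit inside a subquery) depends on the Hive version.

```shell
#!/usr/bin/env bash
# Sketch: print a HiveQL statement that unions a dummy header row with the data
# and sorts on an extra column so that the header comes first.
header_query() {
  local table="$1"   # e.g. table_name
  cat <<EOF
INSERT OVERWRITE LOCAL DIRECTORY 'tmp/blah.blah'
SELECT u.sort_col, u.id, u.name
FROM (
  -- header row: sort_col = 0, column names as string literals
  SELECT h.sort_col, h.id, h.name
  FROM (SELECT 0 AS sort_col, 'id' AS id, 'name' AS name FROM ${table} LIMIT 1) h
  UNION ALL
  -- data rows: sort_col = 1, values cast to STRING to match the header row
  SELECT 1 AS sort_col, CAST(id AS STRING) AS id, CAST(name AS STRING) AS name FROM ${table}
) u
ORDER BY u.sort_col;
EOF
}

# Usage (commented out so that sourcing this file does not start Hive):
# hive -e "\$(header_query table_name)"
```

In this sketch sort_col is written out as the first column of every row, and no header is produced if the table is empty, because the dummy row is selected from the table itself.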
Not a great solution, but here is what I do: