Parquet in Hive
I am learning how to use Hive with the Hortonworks Sandbox, but I have not been able to get it working. When I create a table, no data is shown, so I tried this query instead:
Create external table tripinfo (
VendorID string,
pickup string,
dropoff string,
Passenger string,
distance string,
Pickloc string,
droploc string,
rate string,
store string,
payment string,
amount string,
extra string,
tax string,
improvement string,
tip string,
tolls string,
tap string)
row format serde "parquet.hive.serde.PaquetHiveSerDe"
stored as
INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
Location "/user/taxi/yellow data/trip/";
However, it fails with this error: Error while compiling statement: FAILED: SemanticException Cannot find class 'parquet.hive.DeprecatedParquetInputFormat'
The Parquet file is already in HDFS, separated by " ", and is huge (as you might expect).
Am I doing something wrong, or is there any way to create a table with Parquet data?
I assume you are reading this page? - https://cwiki.apache.org/confluence/display/Hive/Parquet
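Notice the header "Hive 0.10-0.12" above the syntax you copied. The sandbox should be using at least Hive 1.x, maybe even 2.x, so you should use the newer syntax from that page (STORED AS PARQUET) rather than naming the SerDe and input/output format classes explicitly.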
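A minimal sketch of what that could look like for the table in the question, assuming Hive 0.13 or later (where STORED AS PARQUET is available) and that the files under that location really are Parquet. The column names, string types, and path are copied from the question. As far as I know, Hive matches Parquet columns by name by default, so the names would need to agree with the schema stored inside the files:

```sql
-- Sketch only: assumes Hive 0.13+ and genuine Parquet files at this location.
-- Column names and types are taken from the question, not from the real file schema.
CREATE EXTERNAL TABLE tripinfo (
  VendorID string,
  pickup string,
  dropoff string,
  Passenger string,
  distance string,
  Pickloc string,
  droploc string,
  rate string,
  store string,
  payment string,
  amount string,
  extra string,
  tax string,
  improvement string,
  tip string,
  tolls string,
  tap string)
STORED AS PARQUET
LOCATION '/user/taxi/yellow data/trip/';
```

With this form Hive picks the Parquet SerDe and input/output formats itself, so there is no class name to mistype or to go missing from the classpath.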
Also, binary data in Parquet files shouldn't be separated by ASCII spaces, unless you are referring to the contents of a single string-type column.
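If the files turn out to be plain text with space-separated fields rather than Parquet, the table would need a delimited text definition instead. A rough sketch under that assumption; the table name tripinfo_text is made up for illustration, and the columns and path are again copied from the question:

```sql
-- Sketch only: assumes the files are space-delimited text, NOT Parquet.
-- tripinfo_text is a hypothetical name used for illustration.
CREATE EXTERNAL TABLE tripinfo_text (
  VendorID string, pickup string, dropoff string, Passenger string,
  distance string, Pickloc string, droploc string, rate string,
  store string, payment string, amount string, extra string,
  tax string, improvement string, tip string, tolls string,
  tap string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ' '
STORED AS TEXTFILE
LOCATION '/user/taxi/yellow data/trip/';
```

Note that a space delimiter would also split any value that itself contains a space, such as a "date time" timestamp, so check a few rows before relying on it.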