How do I create an external table with Hive format in Databricks?

Posted on 2025-01-29 13:11:25 | 462 characters | 3 views | 0 comments

I have an external table in Hive with the following definition:

CREATE EXTERNAL TABLE cs_mbr_prov(
  key struct<inid:string,......>, 
  memkey string, 
  ob_id string, 
  .....
)
  
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.hbase.HBaseSerDe' 
STORED BY 
  'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
WITH SERDEPROPERTIES ( 
  'hbase.columns.mapping'=' :key,ci:MEMKEY, .....', 
  'serialization.format'='1')

I want to create the same kind of table in Azure Databricks, where my input and output are in Parquet format.

鯉魚旗 2025-02-05 13:11:25

Following the official documentation, I created and reproduced the table with input and output in Parquet format.

Sample code:

CREATE EXTERNAL TABLE `vams`(
  `country` string,
  `count` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'dbfs:/FileStore/'
TBLPROPERTIES (
  'totalSize'='2335',
  'numRows'='240',
  'rawDataSize'='2095',
  'COLUMN_STATS_ACCURATE'='true',
  'numFiles'='1',
  'transient_lastDdlTime'='1418173653')

Reference:

https://learn.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-table-hiveformat
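
One thing worth checking in the sample above: the SerDe is the Parquet one, but `SymlinkTextInputFormat` and `HiveIgnoreKeyTextOutputFormat` are Hive's symlink/text format classes, not Parquet ones. Below is a minimal sketch of a Parquet-backed variant, assuming the same two columns. The table names `vams_parquet` and `vams_parquet_explicit` and the location `dbfs:/FileStore/vams_parquet/` are hypothetical placeholders, not values from the question. The `TBLPROPERTIES` statistics (`numRows`, `totalSize`, and so on) are left out, on the assumption that they are metastore-generated values rather than something to set by hand.

```sql
-- Short form: STORED AS PARQUET selects the Parquet SerDe together with
-- the matching input and output formats.
CREATE EXTERNAL TABLE vams_parquet (
  `country` STRING,
  `count` INT)
STORED AS PARQUET
LOCATION 'dbfs:/FileStore/vams_parquet/';  -- hypothetical path

-- Explicit form: the same idea, spelling out Hive's Parquet classes
-- instead of the symlink/text classes.
CREATE EXTERNAL TABLE vams_parquet_explicit (
  `country` STRING,
  `count` INT)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 'dbfs:/FileStore/vams_parquet/';  -- hypothetical path

-- Inspect the SerDe, InputFormat and OutputFormat the table ended up with.
DESCRIBE FORMATTED vams_parquet;
```

For the table in the question, the HBase-specific parts (`STORED BY ... HBaseStorageHandler` and the `hbase.columns.mapping` property) should not be needed once the data lives in Parquet files, since a `struct<...>` column such as `key` can be declared directly in the column list. Treat this as a starting point to verify against your workspace rather than a confirmed recipe.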
