Sqoop Hive import loads wrong data
I am trying to import data from PostgreSQL and load it into Hive using Sqoop.
Command used:
sqoop import --connect jdbc:postgresql://${HOST}:${PORT}/${INST} --username=${USER} --password=${PASS} --delete-target-dir --target-dir /TC/Customer --query "Select event_id, customer_id,subscriber_id, PROCESSING_STATUS from Customer WHERE \$CONDITIONS and PROCESSING_STATUS='RD'" --where " 1=1 " -m 1 --fields-terminated-by "," --hive-import --create-hive-table --hive-table Customer
After the sqoop command completes successfully the row counts match, but the destination contains a few records that do not exist in the source at all.
I also tried --split-by on the primary key, but the result was the same.
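
One way to narrow this down is to list exactly which keys exist in Hive but not in PostgreSQL, and then inspect those rows. The sketch below is only a diagnostic aid, not a fix. It assumes that event_id is the primary key (the post does not name it) and that the key column has already been exported from each side into a plain text file with one key per line. The helper name extra_keys and the file names are hypothetical.

```shell
# extra_keys SOURCE_KEYS DEST_KEYS
# Prints every key found in DEST_KEYS that has no exact match in SOURCE_KEYS.
# Both files are expected to hold one key per line (assumption).
extra_keys() {
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file
  grep -Fvxf "$1" "$2" | sort -u
}

# Hypothetical usage, assuming event_id is the key and both CLIs are available:
#   psql -At -c "select event_id from Customer where PROCESSING_STATUS='RD'" > source_keys.txt
#   hive -S -e "select event_id from Customer" > dest_keys.txt
#   extra_keys source_keys.txt dest_keys.txt
```

Looking at the full rows behind the keys it prints may show whether they resemble fragments of other source rows or come from somewhere else entirely.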