Hive field delimiter problem
[hadoop@namenode1 ~]$ head -n 2 /data1/hive_data/player/20140304/LOG_USER_20140304.data
################################## divider ##################################
14030404594718760$#0$#null$#860842020632885$#460008446196604$#$#9001$#9001327$#10$#27$#7$#01$#3245$#
http://v.100tv.com/vi/player/log ... nt=wifi&fmt=xml
$#192.168.0.156$#115.55.81.35$#20140304050003$#1$#10.27.7.3245.9001.9001327$#WIFI$#1
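The text output above shows that the field delimiter is $#. The table I created in Hive sets the delimiter with ROW FORMAT DELIMITED FIELDS TERMINATED BY "$#";
After loading the data into the table, a query returns output like the following, with the # character left in. Has anyone else run into this problem?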
hive> select * from log_user where create_time like '%201403042211%';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1394520837026_0027, Tracking URL = http://namenode1:8088/proxy/application_1394520837026_0027/
Kill Command = /opt/hadoop/hadoop-2.0.0-cdh4.5.0/bin/hadoop job -kill job_1394520837026_0027
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2014-03-12 10:09:07,240 Stage-1 map = 0%, reduce = 0%
2014-03-12 10:09:13,609 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.71 sec
MapReduce Total cumulative CPU time: 2 seconds 710 msec
Ended Job = job_1394520837026_0027
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 2.71 sec HDFS Read: 38552732 HDFS Write: 54777 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 710 msec
OK
14030422112971000 #0 #null #863636010152103 #460027715316357 #Minte_E70 #0 #0 #0 #0 #0 #01 #0 #
http://v.100tv.com/vi/player/qnt ... ;num=0&fmt=json
#192.168.0.15#121.63.194.73 #20140304221158 #1 #10.29.7.3245.9001.9001644 #WIFI #11
Comments (1)
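By default, Hive accepts only a single character as the column delimiter. With "$#", only the $ is used as the delimiter, so the # stays at the front of each following field.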
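To illustrate the point, here is a minimal Python sketch. It is not Hive's actual parser: it only mimics the behavior seen in the question by assuming that just the first character of the delimiter is used. It also shows one possible workaround, which is to rewrite the file with a single-character delimiter before loading it. The function names and the shortened sample record are made up for illustration, and \x01 (Ctrl-A, Hive's default field delimiter) is just one possible choice of replacement.

```python
# Sketch only: imitates the observed behavior, not Hive's real implementation.

def split_like_hive(line, delimiter):
    """Split on the first character of `delimiter` only (assumed behavior)."""
    return line.split(delimiter[0])


def to_single_char(line, multi="$#", single="\x01"):
    """Replace the multi-character delimiter with a single character."""
    if single in line:
        # Refuse to convert if the new delimiter would collide with the data.
        raise ValueError("replacement delimiter already occurs in the data")
    return line.replace(multi, single)


# Hypothetical, shortened record in the same shape as the log lines above.
sample = "14030404594718760$#0$#null$#WIFI$#1"

# Only "$" is used as the delimiter, so "#" sticks to the next field.
broken = split_like_hive(sample, "$#")

# After conversion, a single-character delimiter splits cleanly.
fixed = split_like_hive(to_single_char(sample), "\x01")

print(broken)
print(fixed)
```

If the file is converted this way, the table could be declared with FIELDS TERMINATED BY '\001' (or left on the default delimiter). Depending on the Hive version, a SerDe such as RegexSerDe or MultiDelimitSerDe may be an alternative that avoids rewriting the data; whether one is available in this CDH release would need to be checked.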