Nested delimiter problem with gawk/sed
I have this text that I need to split:
[{names: {en: 'UK 100', es: 'UK 100'}, status: 'A', displayed: 'Y', start_time: '2011-05-12 00:00:00', start_time_xls: {en: '12th of May 2011 00:00 am', es: '12 May 2011 00:00 am'}, suspend_at: '2011-05-12 15:14:02', is_off: 'Y', score_home: '', score_away: '', bids_status: '', period_id: '', curr_period_start_time: '', score_extra_info: '', settled: 'N', ev_id: 2666872, ev_type_id: 10744, type_name: '|UK 100|'}, {names: {en: 'US 30', es: 'US 30'}, status: 'A', displayed: 'Y', start_time: '2011-05-12 00:00:00', start_time_xls: {en: '12th of May 2011 00:00 am', es: '12 May 2011 00:00 am'}, suspend_at: '2011-05-12 15:13:45', is_off: 'Y', score_home: '', score_away: '', bids_status: '', period_id: '', curr_period_start_time: '', score_extra_info: '', settled: 'N', ev_id: 2666879, ev_type_id: 10745, type_name: '|US 30|'}, {names: {en: 'Germany 30', es: 'Germany 30'}, status: 'A', displayed: 'Y', start_time: '2011-05-12 00:00:00', start_time_xls: {en: '12th of May 2011 00:00 am', es: '12 May 2011 00:00 am'}, suspend_at: '2011-05-12 15:13:52', is_off: 'Y', score_home: '', score_away: '', bids_status: '', period_id: '', curr_period_start_time: '', score_extra_info: '', settled: 'N', ev_id: 2666884, ev_type_id: 10748, type_name: '|Germany 30|'}, {names: {en: 'France 40', es: 'France 40'}, status: 'A', displayed: 'Y', start_time: '2011-05-12 00:00:00', start_time_xls: {en: '12th of May 2011 00:00 am', es: '12 May 2011 00:00 am'}, suspend_at: '2011-05-12 15:13:38', is_off: 'Y', score_home: '', score_away: '', bids_status: '', period_id: '', curr_period_start_time: '', score_extra_info: '', settled: 'N', ev_id: 2666882, ev_type_id: 10747, type_name: '|France 40|'}, {names: {en: 'US 500', es: 'US 500'}, status: 'A', displayed: 'Y', start_time: '2011-05-12 00:00:00', start_time_xls: {en: '12th of May 2011 00:00 am', es: '12 May 2011 00:00 am'}, suspend_at: '2011-05-12 15:14:30', is_off: 'Y', score_home: '', score_away: '', bids_status: '', period_id: '', curr_period_start_time: 
'', score_extra_info: '', settled: 'N', ev_id: 2666890, ev_type_id: 10749, type_name: '|US 500|'}, {names: {en: 'Spain 35', es: 'Spain 35'}, status: 'A', displayed: 'Y', start_time: '2011-05-12 00:00:00', start_time_xls: {en: '12th of May 2011 00:00 am', es: '12 May 2011 00:00 am'}, suspend_at: '2011-05-12 15:13:51', is_off: 'Y', score_home: '', score_away: '', bids_status: '', period_id: '', curr_period_start_time: '', score_extra_info: '', settled: 'N', ev_id: 2666886, ev_type_id: 10750, type_name: '|Spain 35|'}],
I've tried variants of the following, but I keep getting caught by the 'inner' delimiters that I don't want to split on:
gawk -F "[" -v RS="," "NF{print $0}" text.txt
How can I split the text (1) first on the outer "{", ignoring the inner "{"s, and (2) then on the commas, ignoring any commas between curly braces? I then want to output only some of the fields, like this:
suspend_at: '2011-05-12 15:14:02', ev_id: 2666872, ev_type_id: 10744, type_name: '|UK 100|'
Thanks in advance.
Comments (1)
As already stated, if Perl is acceptable:
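A minimal sketch of the two-level split the question asks for, written in Python rather than Perl and not taken from the original answer. It assumes that braces and brackets are balanced and that single-quoted values contain no embedded single quotes. The names `split_top_level`, `extract`, and `sample` are hypothetical, and `sample` is a shortened version of the text in the question. The idea is to track nesting depth while scanning, so that a comma only counts as a separator when it sits outside every `{...}` and every quoted string:

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators inside {...}, [...] or '...'."""
    parts, buf, depth, in_quote = [], [], 0, False
    for ch in text:
        if ch == "'":
            in_quote = not in_quote          # assumes no escaped quotes
        elif not in_quote:
            if ch in "{[":
                depth += 1
            elif ch in "}]":
                depth -= 1
            elif ch == sep and depth == 0:
                # A separator at nesting depth 0 ends the current part.
                parts.append("".join(buf).strip())
                buf = []
                continue
        buf.append(ch)
    tail = "".join(buf).strip()
    if tail:
        parts.append(tail)
    return parts


def extract(text, wanted):
    """Return one 'key: value, ...' string per record, keeping only wanted keys."""
    # Join wrapped lines, then drop the trailing comma and the outer [ ... ].
    body = text.replace("\n", " ").strip().rstrip(",").strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    rows = []
    for record in split_top_level(body):          # step 1: outer {...} records
        if not (record.startswith("{") and record.endswith("}")):
            continue
        fields = split_top_level(record[1:-1])    # step 2: top-level fields
        kept = [f for f in fields if f.partition(":")[0].strip() in wanted]
        rows.append(", ".join(kept))
    return rows


# Shortened sample with the same structure as the text in the question.
sample = (
    "[{names: {en: 'UK 100', es: 'UK 100'}, status: 'A', "
    "start_time_xls: {en: '12th of May 2011 00:00 am', es: '12 May 2011 00:00 am'}, "
    "suspend_at: '2011-05-12 15:14:02', ev_id: 2666872, ev_type_id: 10744, "
    "type_name: '|UK 100|'}, "
    "{names: {en: 'US 30', es: 'US 30'}, status: 'A', "
    "suspend_at: '2011-05-12 15:13:45', ev_id: 2666879, ev_type_id: 10745, "
    "type_name: '|US 30|'}],"
)

if __name__ == "__main__":
    for row in extract(sample, {"suspend_at", "ev_id", "ev_type_id", "type_name"}):
        print(row)
```

For the first record this prints the line shown in the question, `suspend_at: '2011-05-12 15:14:02', ev_id: 2666872, ev_type_id: 10744, type_name: '|UK 100|'`. The same depth-counting idea should carry over to a character loop in gawk or Perl. A plain `-F`/`RS` setting cannot express it, because those separators are matched without any knowledge of nesting.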