Reading a data file with unevenly whitespace-separated columns using pandas
I have a data file with columns separated by an uneven amount of whitespace, with missing data also presented as whitespace. Here's an excerpt of the file:
52.75655753231478 34.68953070087977 221090730122510976 156.51972931061138 -17.572643739467043 1.3798599125392499 0.03571818775179444 -1.6014718579307516 0.056195796241959035 -1.2045525174577807 0.04408155757633151 -0.39499593 0.58434874 -0.104220375 -0.09171637 -0.4853293 0.058708116 0.19899127 -0.04585829 -0.14105034 0.0810716 142 13.05808 0.9137573 0.8 AAfrt55
52.73026256069882 34.341460673892975 221065132117493376 156.72043573419933 -17.86431195452588 1.6550088566259955 0.08398022438361424 -2.4547997015391947 0.07126912301702777 -1.9981359966222954 0.056926420123759786 -0.18282509 0.857106 0.03257091 0.30434817 -0.22865324 0.036517035 -0.045696758 0.21635033 0.31341392 0.17989494 145 14.77572 1.280653 0.6 AAfrt55
52.68434549160837 34.00359963856372 220988819138630016 156.90150379516513 -18.15747083558349 1.5876225837952878 0.0701188913290275 -2.4370594533812264 0.08793195992635172 -0.9494780100116795 0.0673739642789701 -0.22557606 0.6524007 -0.18271385 0.12871169 -0.35870972 0.19210297 0.21133502 0.08772008 -0.0059730127 0.2099215 179 10.883165 0.7241144 0.1 AAfrt55
52.676106438399984 34.73368685714245 221094367958697088 156.4364320462988 -17.57682950993822 1.5585619924217913 0.12270341264787946 -1.5320253890594473 0.20488197077169218 -1.5579812500350152 0.15149103725242027 -0.38029686 0.5423903 0.039901607 -0.089358106 -0.47867253 -0.01811142 0.27065852 0.0061952006 -0.08777855 -0.018889245 140 16.879848 1.8379726 0.7 AAfrt55
52.71209959964432 34.578003905054665 221083888238423552 156.55904659440634 -17.683874357669215 1.417315200916427 0.2902256449155021 -1.4031086014124174 0.4445107873794677 -1.2418716061247421 0.32415010404829614 -0.32437858 0.5030531 0.007634173 -0.024154684 -0.61248195 0.007697445 0.16414167 0.0348852 -0.035160106 0.2062511 147 17.979982 2.28545 0.6 AAfrt55
52.976769045904504 34.2043690118208 221013214552826112 156.97805068738012 -17.851469355301145 1.3517371901145452 0.05227677174066813 -1.6998011106704118 0.06393145316119067 -1.5066879492609095 0.05135957095097931 -0.2882013 0.625621 -0.09970278 0.17078118 -0.54429597 0.024621502 0.0029883361 0.18108386 0.14520298 0.29568258 157 14.129068 1.2771263 0.6 AAfrt55
52.725581800207 34.81203178094687 221096708716972032 156.42146313010704 -17.489732979976864 1.3637027483770132 0.06681005036392006 -1.2969967438574475 0.12798297569657666 -0.8700543685880945 0.07589931821548508 -0.47563803 0.54669774 0.21084175 -0.13976684 -0.3919695 -0.15758438 0.29368496 0.073126756 -0.018135564 -0.23391931 117 15.063394 1.2766619 0.5 AAfrt55
52.7985530101187 34.36596277011468 221065956751203840 156.75238845448155 -17.810891786683083 1.3354235835420394 0.07108435274492615 -1.5037101757150335 0.08349724817882194 -1.4176519719616552 0.06716461390950008 -0.23177823 0.71539974 -0.14281902 0.16131751 -0.350566 0.20656587 0.13758425 0.04726053 0.02584095 0.20465821 167 12.012547 0.6679964 0.7 AAfrt55
52.33478707580404 34.07755876601224 221033692956948224 156.61051515401235 -18.2713588192272 1.4506434823877878 0.042935822550938335 -1.711588054551144 0.08021024777572577 -1.428438652911233 0.052394146676305285 -0.46284133 0.39245856 -0.36800838 0.020843003 -0.4325589 0.18059099 0.1403661 0.06952838 -0.13496804 -0.017565649 141 14.258665 1.1050224 1.0 AAfrt55
53.236890326502746 34.60616842868128 221121379009541760 156.90327214337495 -17.401554596271392 1.670779589883186 0.06462934312116957 -1.3870861271226225 0.11007633727034921 -1.5354788621698447 0.06476830152651208 -0.2876506 0.50366104 0.06364631 0.17200021 -0.43980953 0.101488166 0.13044362 0.098199986 0.07485064 0.014440983 139 15.197788 1.420907 0.5 AAfrt55
53.36099004421408 35.02299915490862 221182058307431168 156.7245486656545 -17.00787802432042 1.421888821514962 0.0663340915531341 -1.8461295859228135 0.09336911688161233 -1.6670146028961077 0.0714590019457503 -30.271748692621472 10.746389544584678 -0.39191365 0.6037745 -0.32697898 0.29790965 -0.38904026 0.317259 -0.07794135 0.04330315 0.042706273 -0.004154171 146 12.469966 0.95415306 0.7 AAfrt55
53.52872681724522 35.0477570978552 221159415239971968 156.82317128998952 -16.905034510408548 1.421141411397051 0.05109522675079716 -1.632919678940039 0.06302630418197872 -1.2685055408322787 0.05099118144236511 -0.30175194 0.6874048 -0.17188887 0.065116994 -0.45524448 0.13649681 0.10539054 0.019577736 -0.022071656 0.24296543 155 14.162856 1.0796404 0.9 AAfrt55
53.54751972550367 34.96103948561606 221156696524334848 156.89093588909614 -16.96465396429354 1.294728492354405 0.18934469304374682 -1.1449588491523415 0.24877251454435825 -0.583647366938886 0.200549970304947 -0.37570783 0.5953056 -0.19945645 0.1845328 -0.5173996 0.1606674 0.08281563 0.10032614 0.04386764 0.13707627 157 17.58886 2.0971222 0.2 AAfrt55
52.885088571817256 34.989840585617124 221193431380869120 156.41969263893048 -17.268992175403003 1.4476557809261963 0.07281647540924233 -1.9124556059186422 0.12004352922185996 -1.4703545228732189 0.0790687307010274 -0.31885538 0.37405393 -0.4350186 0.06781138 -0.48610708 0.19664243 0.18642318 0.121886626 -0.13234825 0.15937497 157 15.75981 1.5027304 0.7 AAfrt55
53.54336695375678 35.54122701584878 221234525628153216 156.52106112078582 -16.505126454354887 1.2002583233942294 0.06023879613593315 -1.8575899983935311 0.08916397207630337 -1.4987391120861537 0.05451125116477068 -0.2842391 0.7054797 0.033138227 -0.0997483 -0.32082146 -0.14847293 0.06646432 0.055499673 0.012825485 0.025693728 148 14.743197 1.1996746 0.2 AAfrt55
52.89661447038126 35.01501845115683 221193637539294720 156.41180216188386 -17.243179777505176 1.4365003873992706 0.04152001288165847 -1.5960560447477898 0.06942606759430278 -1.2600472896481731 0.04452615169143268 -0.31485322 0.41341513 -0.44437912 0.016873619 -0.45052707 0.16905014 0.14041857 0.11158022 -0.111745454 0.1811417 167 14.112262 1.0594835 0.9 AAfrt55
I also have the byte positions where each column begins and ends:
1- 21
23- 43
45- 63
65- 85
87-108
110-131
133-152
154-175
177-196
198-219
221-240
242-265
267-287
289-302
304-317
319-332
334-347
349-362
364-377
379-392
394-407
409-422
424-437
439-442
444-453
455-467
469-475
477-493
How can I use this information to read the file using pandas?
Comments (1)
Thanks to Quang Hoang for suggesting the pd.read_fwf function, which I did not know about. A single call to it was enough to read the file.
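The call itself is not shown here. A minimal sketch of what it could look like follows, assuming the file has no header row. The two-row `sample` string is made up for illustration and stands in for the real file, whose path would normally be passed instead.

```python
import io

import pandas as pd

# Hypothetical fixed-width sample standing in for the real file.
# The second row leaves the middle column blank (a missing value).
sample = (
    "52.756  34.689  AAfrt55\n"
    "52.730          AAfrt55\n"
)

# header=None is an assumption: the excerpt shows no header row.
# colspecs is left at its default ("infer"), so pandas guesses the
# column edges from the whitespace pattern of the first rows.
df = pd.read_fwf(io.StringIO(sample), header=None)
print(df)
```

With the default inference, a field that is blank in one row is still recognised as a column as long as some other sampled row has a value there; the blank cell is read as NaN.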
The function even detects the correct separation between columns. Otherwise, the column boundaries given in the question can be passed to it explicitly through the colspecs argument.
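As a hedged sketch of the explicit variant: it assumes the byte positions in the question are 1-based and inclusive (the list starts at 1), whereas `colspecs` takes 0-based, half-open intervals. The three ranges in `byte_ranges` and the `sample` text are shortened, made-up stand-ins for the real 28-column list and file.

```python
import io

import pandas as pd

# Hypothetical 1-based, inclusive byte ranges, in the style of the
# question's list but shortened to three columns for illustration.
byte_ranges = [(1, 6), (9, 14), (17, 23)]

# read_fwf expects half-open intervals [start, end) counted from 0,
# so only the start of each range needs shifting by one.
colspecs = [(start - 1, end) for start, end in byte_ranges]

# Made-up sample laid out to match the ranges above.
sample = (
    "52.756  34.689  AAfrt55\n"
    "52.730          AAfrt55\n"
)

# header=None is an assumption: the excerpt shows no header row.
df = pd.read_fwf(io.StringIO(sample), colspecs=colspecs, header=None)
print(df)
```

Passing the boundaries explicitly avoids relying on inference, which only looks at the first rows of the file and could miss a column that happens to be empty in all of them.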