Python读取格式化字符串

发布于 2024-11-30 10:26:42 字数 686 浏览 1 评论 0原文

我有一个包含多行的文件，其格式如下语法：

FIELD      POSITION  DATA TYPE
------------------------------
COOP ID       1-6    Character
LATITUDE     8-15    Real
LONGITUDE   17-25    Real
ELEVATION   27-32    Real
STATE       34-35    Character
NAME        37-66    Character
COMPONENT1  68-73    Character
COMPONENT2  75-80    Character
COMPONENT3  82-87    Character
UTC OFFSET  89-90    Integer

数据全部采用 ASCII 格式。

一行的一个例子是：

011084  31.0581  -87.0547   26.0 AL BREWTON 3 SSE                  ------ ------ ------ +6

我当前的想法是，我想一次读取一行文件，并且不知何故将每一行分解成一个字典，以便我可以引用组件。是否有一些模块可以在 Python 中执行此操作，或者以其他干净的方式执行此操作？

谢谢！

原文

I have a file with a number of lines formatted with the following syntax:

FIELD      POSITION  DATA TYPE
------------------------------
COOP ID       1-6    Character
LATITUDE     8-15    Real
LONGITUDE   17-25    Real
ELEVATION   27-32    Real
STATE       34-35    Character
NAME        37-66    Character
COMPONENT1  68-73    Character
COMPONENT2  75-80    Character
COMPONENT3  82-87    Character
UTC OFFSET  89-90    Integer

The data is all ASCII-formatted.

An example of a line is:

011084  31.0581  -87.0547   26.0 AL BREWTON 3 SSE                  ------ ------ ------ +6

My current thought is that I'd like to read the file in a line at a time and somehow have each line broken up into a dictionary so I can refer to the components. Is there some module that does this in Python, or some other clean way?

Thanks!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

小女人ら 2024-12-07 10:26:42

编辑：您仍然可以使用结构模块：

请参阅结构模块文档。在我看来，您想使用 struct.unpack()

您想要的可能是这样的：

import struct
with open("filename.txt", "r") as f:
    for line in f:
        (coop_id, lat, lon, elev, state, name, c1, c2, c3, utc_offset
         ) = struct.unpack("6sx8sx9sx6sx2sx30sx6sx6sx6sx2s", line.strip())
        (lat, lon, elev) = map(float, (lat, lon, elev))
        utc_offset = int(utc_offset)

EDIT: You can still use the struct module:

See the struct module documentation. Looks to me like you want to use struct.unpack()

What you want is probably something like:

import struct
with open("filename.txt", "r") as f:
    for line in f:
        (coop_id, lat, lon, elev, state, name, c1, c2, c3, utc_offset
         ) = struct.unpack("6sx8sx9sx6sx2sx30sx6sx6sx6sx2s", line.strip())
        (lat, lon, elev) = map(float, (lat, lon, elev))
        utc_offset = int(utc_offset)

回复收藏 0 原文

樱娆 2024-12-07 10:26:42

我想我从您的问题/评论中了解您在寻找什么。如果我们假设 Real、Character 和 Integer 是唯一的数据类型，那么下面的代码应该可以工作。（我还将假设您显示的格式文件是制表符分隔的）：

format = {}
types = {"Real":float, "Character":str, "Integer":int}

for line in open("format.txt", "r"):
    values = line.split("\t")
    range = values[1].split("-")
    format[values[0]]={"start":int(range[0])-1, "end":int(range[1])-1, "type":types[values[2]]}

results=[]
for line in open("filename.txt"):
    result={}
    for key in format:
        result[key]=format["type"](line[format["start"]:format["end"]])
    results.append(result)

您最终应该得到包含字典列表的结果，其中每个字典都是从格式文件中的键名称到正确数据类型的数据值的映射。

I think I understand from your question/comments what you are looking for. If we assume that Real, Character, and Integer are the only data types, then the following code should work. (I will also assume that the format file you showed is tab delimited):

format = {}
types = {"Real":float, "Character":str, "Integer":int}

for line in open("format.txt", "r"):
    values = line.split("\t")
    range = values[1].split("-")
    format[values[0]]={"start":int(range[0])-1, "end":int(range[1])-1, "type":types[values[2]]}

results=[]
for line in open("filename.txt"):
    result={}
    for key in format:
        result[key]=format["type"](line[format["start"]:format["end"]])
    results.append(result)

You should end up with results containing a list of dictionaries where each dictionary is a mapping from key names in the format file to data values in the correct data type.

回复收藏 0 原文