How do I match a text format against a string in Python without using regex?
I am reading a file with lines of the form exemplified by
[ 0 ] L= 9 (D) R= 14 (D) p= 0.0347222 e= 10 n= 34
I saw Matlab code to read this file given by
[I,L,Ls,R,Rs,p,e,n] = textread(f1,'[ %u ] L= %u%s R= %u%s p= %n e=%u n=%u')
I want to read this file in Python. The only thing I know of is regex, and reading even a part of this line leads to something like
re.compile(r'\s*\[\s*(?P<id>\d+)\s*\]\s*L\s*=\s*(?P<Lint>\d+)\s*\((?P<Ltype>[DG])\)\s*R\s*=\s*(?P<Rint>\d+)\s*')
which is ugly! Is there an easier way to do this in Python?
Comments (4)
You can make the regexp more readable by building it with escape/replace...
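One possible reading of that suggestion, as a minimal sketch: write the line layout as a readable template, escape its literal text with re.escape, and replace each placeholder with a named regex group. The {name} placeholder syntax, the FIELDS table, and the build_pattern helper are all assumptions made for illustration; they are not taken from the original answer.

```python
import re

# Hypothetical template: literal text plus {name} placeholders for the fields.
TEMPLATE = "[ {id} ] L= {Lint} ({Ltype}) R= {Rint} ({Rtype}) p= {p} e= {e} n= {n}"

# Assumed regex fragment for each placeholder.
FIELDS = {
    "id": r"\d+",
    "Lint": r"\d+",
    "Ltype": r"[DG]",
    "Rint": r"\d+",
    "Rtype": r"[DG]",
    "p": r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?",
    "e": r"\d+",
    "n": r"\d+",
}


def build_pattern(template, fields):
    """Turn a readable template into a compiled regex with named groups."""
    pieces = []
    # Splitting on a capturing group keeps the placeholder names at odd indexes.
    for index, part in enumerate(re.split(r"\{(\w+)\}", template)):
        if index % 2:
            pieces.append("(?P<%s>%s)" % (part, fields[part]))
            continue
        # Literal text: escape it, and let any run of spaces match optional whitespace.
        for chunk in re.split(r"(\s+)", part):
            if chunk:
                pieces.append(r"\s*" if chunk.isspace() else re.escape(chunk))
    return re.compile("".join(pieces))


pattern = build_pattern(TEMPLATE, FIELDS)
match = pattern.match("[ 0 ] L= 9 (D) R= 14 (D) p= 0.0347222 e= 10 n= 34")
print(match.groupdict())
```

The groups come back as strings, so numeric fields still need int() or float() afterwards.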
This looks more or less pythonic to me:
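The code that went with this answer is not preserved here, so the following is only an illustration of a regex-free approach and not necessarily what the answer showed. It turns the punctuation into spaces and splits on whitespace; the Record tuple, its field names, and parse_line are hypothetical.

```python
from collections import namedtuple

# Hypothetical record type; the field names mirror the question's regex groups.
Record = namedtuple("Record", "id Lint Ltype Rint Rtype p e n")

# Map brackets, parentheses and '=' to spaces so a plain split() yields clean tokens.
_PUNCT_TO_SPACE = str.maketrans("[]()=", "     ")


def parse_line(line):
    """Parse one line such as '[ 0 ] L= 9 (D) R= 14 (D) p= 0.0347222 e= 10 n= 34'."""
    tokens = line.translate(_PUNCT_TO_SPACE).split()
    # Expected layout: id L Lint Ltype R Rint Rtype p <p> e <e> n <n>
    labels = [tokens[i] for i in (1, 4, 7, 9, 11)] if len(tokens) == 13 else None
    if labels != ["L", "R", "p", "e", "n"]:
        raise ValueError("unexpected line format: %r" % line)
    return Record(
        int(tokens[0]),
        int(tokens[2]), tokens[3],
        int(tokens[5]), tokens[6],
        float(tokens[8]), int(tokens[10]), int(tokens[12]),
    )


record = parse_line("[ 0 ] L= 9 (D) R= 14 (D) p= 0.0347222 e= 10 n= 34")
print(record)
```

This relies on the fields always appearing in the same order; it tolerates varying whitespace but nothing else.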
Pyparsing is a fallback from unreadable and fragile regex processors. A pyparsing parser can handle your stated format, plus any variety of extra whitespace and arbitrary order of the assignment expressions. Just as you have used named groups in your regex, pyparsing supports results names, so that you can access the parsed data using dict or attribute syntax (data['Lint'] or data.Lint).
Also, parse actions can do the string->int or string->float conversion at parse time, so that afterward the values are already in a usable form. (The thinking in pyparsing is that, while parsing these expressions, you know that a word composed of numeric digits, or Word(nums), will safely convert to an int, so why not do the conversion right then, instead of just getting back matching strings and having to re-process the sequence of strings, trying to detect which ones are integers, floats, etc.?)
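The parser from this answer is not preserved here, so what follows is a reconstruction sketch of the ideas described above, not the author's code. It requires the third-party pyparsing package. The helper names side and assign, the loose definition of real, and the results names for the right-hand side (Rint, Rtype) are assumptions. The camelCase method names are used because they work in both older and current releases; pyparsing 3 also offers snake_case equivalents such as parse_string.

```python
from pyparsing import Suppress, Word, nums, oneOf

# Parse actions convert the matched text as soon as it is parsed.
integer = Word(nums).setParseAction(lambda t: int(t[0]))
# Deliberately loose: digits and dots only, no sign or exponent.
real = Word(nums + ".").setParseAction(lambda t: float(t[0]))

LBRACK, RBRACK, LPAR, RPAR, EQ = map(Suppress, "[]()=")


def side(label, int_name, type_name):
    """Build an expression for e.g. 'L= 9 (D)' with two results names."""
    return (Suppress(label) + EQ + integer(int_name)
            + LPAR + oneOf("D G")(type_name) + RPAR)


def assign(label, value):
    """Build an expression for e.g. 'e= 10', named after its label."""
    return Suppress(label) + EQ + value(label)


# '&' builds an Each expression, which accepts its operands in any order.
record = (LBRACK + integer("id") + RBRACK
          + (side("L", "Lint", "Ltype")
             & side("R", "Rint", "Rtype")
             & assign("p", real)
             & assign("e", integer)
             & assign("n", integer)))

data = record.parseString("[ 0 ] L= 9 (D) R= 14 (D) p= 0.0347222 e= 10 n= 34")
print(data["Lint"], data.Rtype, data.p)
```

Because the conversions happen in the parse actions, data["Lint"] is already an int and data.p a float.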
Python does not have a scanf equivalent, as stated in the documentation for the re module.
However, you could probably build your own scanf-like module using the scanf-to-regex mappings on that page.
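As a rough sketch of what such a module might look like: the regex fragments below follow the table in the "Simulating scanf()" section of the re documentation, with inner groups made non-capturing. Only four conversions are covered, and the names compile_format and sscanf are hypothetical. The format string mirrors the Matlab one from the question, with %f assumed as a stand-in for Matlab's %n.

```python
import re

# Conversion specifier -> (regex fragment, converter). A small subset only.
_CONVERSIONS = {
    "%d": (r"[-+]?\d+", int),
    "%u": (r"\d+", int),
    "%f": (r"[-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?", float),
    "%s": (r"\S+", str),
}


def compile_format(fmt):
    """Translate a scanf-style format into (compiled regex, list of converters)."""
    pieces, casts = [], []
    for chunk in re.split(r"(%[dufs]|\s+)", fmt):
        if not chunk:
            continue
        if chunk in _CONVERSIONS:
            regex, cast = _CONVERSIONS[chunk]
            # Like scanf, skip any whitespace before a conversion.
            pieces.append(r"\s*(" + regex + ")")
            casts.append(cast)
        elif chunk.isspace():
            pieces.append(r"\s*")
        else:
            pieces.append(re.escape(chunk))
    return re.compile("".join(pieces)), casts


def sscanf(fmt, text):
    """Return the converted fields of text, or raise ValueError if it does not match."""
    pattern, casts = compile_format(fmt)
    match = pattern.match(text)
    if match is None:
        raise ValueError("input does not match format %r" % fmt)
    return [cast(group) for cast, group in zip(casts, match.groups())]


FMT = "[ %u ] L= %u%s R= %u%s p= %f e=%u n=%u"
print(sscanf(FMT, "[ 0 ] L= 9 (D) R= 14 (D) p= 0.0347222 e= 10 n= 34"))
```

As in the Matlab call, %s captures the whole "(D)" token, parentheses included, so strip them afterwards if only the letter is needed.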