Text file parsing problem in Python
I am new to Python and I am trying to delete every line in a text file that contains the word "Lett.". Here is a sample of the text file I am trying to parse:
<A>Lamb</A> <W>Let. Moxon</W>
<A>Lamb</A> <W>Danger Confound. Mor. w. Personal Deformity</W>
<A>Lamb</A> <W>Gentle Giantess</W>
<A>Lamb</A> <W>Lett., to Wordsw.</W>
<A>Lamb</A> <W>Lett., to Procter</W>
<A>Lamb</A> <W>Let. to Old Gentleman</W>
<A>Lamb</A> <W>Elia Ser.</W>
<A>Lamb</A> <W>Let. to T. Manning</W>
I know how to open the file but I am just uncertain of how to find the matching text and then how to delete that line. Any help would be greatly appreciated.
or if you want to write the result to a file:
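The code that followed this sentence is missing from this copy of the answer. As a minimal sketch of what writing the filtered result to a file could look like (the function name `write_without_lett`, its parameters, and the file names in the usage note are assumptions, not taken from the original answer):

```python
def write_without_lett(src_path, dst_path, needle="Lett"):
    """Copy src_path to dst_path, skipping every line that contains needle.

    The function name and parameters are illustrative assumptions,
    not part of the original answer.
    """
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            # Keep only the lines that do not contain the search text.
            if needle not in line:
                dst.write(line)
```

It could be called as `write_without_lett("lamb.txt", "lamb_filtered.txt")`. Writing to a second file rather than the input keeps the original intact if something goes wrong partway through.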
I have a general streaming editor framework for this kind of stuff. I load the file into memory, apply changes to the in-memory list of lines, and write out the file if changes were made.
I have boilerplate that looks like this:
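The author's boilerplate did not survive in this copy. The following is only a sketch of the load, modify, write-if-changed pattern described above; `edit_file` and `transform` are hypothetical names, not the author's code:

```python
def edit_file(path, transform):
    """Load path, apply transform to its lines, rewrite it only if changed.

    edit_file and transform are hypothetical names used for illustration.
    Returns True when the file was rewritten, False otherwise.
    """
    with open(path) as f:
        original = f.readlines()

    # Work on a copy so the original list stays available for comparison.
    lines = list(original)

    # magic here: modify the in-memory list of lines
    lines = transform(lines)

    if lines == original:
        return False  # nothing changed, leave the file untouched
    with open(path, "w") as f:
        f.writelines(lines)
    return True
```

For the question at hand, a call such as `edit_file("lamb.txt", lambda lines: [l for l in lines if "Lett" not in l])` would drop the matching lines (the file name is an assumption).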
And in the "# magic here" section, I have either:
- modifications to individual lines, like:
  lines[i] = change_line(lines[i])
- calls to my sed utilities for inserting, appending, and replacing lines, like:
  lines = delete_range(lines, some_range)
The latter uses primitives like these:
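The primitives themselves are missing from this copy. A possible sketch follows, assuming that `some_range` is a half-open Python `range` of line indices; apart from `delete_range`, which the answer names, the function names are hypothetical:

```python
def delete_range(lines, rng):
    """Return a copy of lines without the items at rng.start..rng.stop-1."""
    return lines[:rng.start] + lines[rng.stop:]

def insert_before(lines, index, new_lines):
    """Return a copy of lines with new_lines inserted before index."""
    return lines[:index] + list(new_lines) + lines[index:]

def append_after(lines, index, new_lines):
    """Return a copy of lines with new_lines inserted after index."""
    return lines[:index + 1] + list(new_lines) + lines[index + 1:]

def replace_range(lines, rng, new_lines):
    """Return a copy of lines with the items in rng replaced by new_lines."""
    return lines[:rng.start] + list(new_lines) + lines[rng.stop:]

# Small checks on a list of integers.
data = [0, 1, 2, 3, 4, 5]
assert delete_range(data, range(1, 3)) == [0, 3, 4, 5]
assert insert_before(data, 2, [9]) == [0, 1, 9, 2, 3, 4, 5]
assert append_after(data, 2, [9]) == [0, 1, 2, 9, 3, 4, 5]
assert replace_range(data, range(1, 3), [7, 8, 9]) == [0, 7, 8, 9, 3, 4, 5]
```

Each function returns a new list instead of mutating its argument, which makes it straightforward to compare the result with the original lines before deciding whether to rewrite the file.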
The tests work on arrays of integers, for clarity, but the transformations work for arrays of strings, too.
Generally, I scan the list of lines to identify changes I want to apply, usually with regular expressions, and then I apply the changes on matching data. Today, for example, I ended up making about 2000 line changes across 150 files.
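As an illustration of the scan-then-apply step for the file in the question, here is a rough sketch. The pattern and the helper names `matching_indices` and `drop_indices` are assumptions made for this example:

```python
import re

# Assumed pattern: "Lett" followed by a literal period, so "Let." is left alone.
LETT = re.compile(r"\bLett\.")

def matching_indices(lines, pattern):
    """Return the indices of the lines in which pattern is found."""
    return [i for i, line in enumerate(lines) if pattern.search(line)]

def drop_indices(lines, indices):
    """Return a copy of lines without the items at the given indices."""
    unwanted = set(indices)
    return [line for i, line in enumerate(lines) if i not in unwanted]

sample = [
    "<A>Lamb</A> <W>Let. Moxon</W>\n",
    "<A>Lamb</A> <W>Lett., to Wordsw.</W>\n",
    "<A>Lamb</A> <W>Gentle Giantess</W>\n",
]
hits = matching_indices(sample, LETT)  # indices of the lines to delete
kept = drop_indices(sample, hits)      # lines that remain
```

Separating the scan from the change makes it possible to inspect or count the matches before anything is modified.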
This works better than sed when you need to apply multiline patterns or additional logic to identify whether a change is applicable.
return [l for l in open(fname) if 'Lett' not in l]
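The one-liner above is only a `return` statement and never closes the file explicitly. A self-contained variant might look like this, where the function name `lines_without` and the `needle` parameter are assumptions added for the example:

```python
def lines_without(fname, needle="Lett"):
    """Return the lines of fname that do not contain needle.

    The function name and the needle parameter are illustrative
    assumptions; only the list comprehension comes from the answer.
    """
    # The with block closes the file as soon as the list has been built.
    with open(fname) as f:
        return [line for line in f if needle not in line]
```

Because "Lett" is not a substring of "Let.", the lines containing "Let." are kept and only the "Lett." lines are dropped.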