How do I filter specific values by a specific word in a text file and store them in a list?

Posted on 2024-10-22 05:08:00 | 306 characters | 3 views | 0 comments

I have a text file, abc.txt, that looks like this:

we 2 rt 3 re 3 tr vh kn mo
we 3 rt 5 re 5 tr yh kn me
we 4 rt 6 re 33 tr ph kn m3
we 5 rt 9 re 34 tr oh kn me
we 6 rt 8 re 32 tr kh kn md

Now I want the values that follow tr. After filtering, I should get this result:

[vh,yh,ph,oh,kh]

Can anyone tell me how to do this? What code should I write for it?

Comments (7)

走过海棠暮 2024-10-29 05:08:00
mylist = [line.split()[7] for line in myfile]

should work if the value is always in the 8th column.

If the position of tr varies, you could do

mylist = []
for line in myfile:
    items = line.split()
    # take the token immediately after "tr"
    mylist.append(items[items.index("tr") + 1])

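The index-based loop above raises ValueError on a line that has no tr, and IndexError when tr is the last token. A minimal sketch of a more forgiving variant follows; the function name values_after and the sample lines are made up for illustration.

```python
def values_after(lines, key="tr"):
    """Collect the token that follows `key` on each line.

    Lines without `key`, or with `key` as the last token, are skipped.
    """
    values = []
    for line in lines:
        items = line.split()
        try:
            pos = items.index(key)
        except ValueError:
            continue  # key does not occur on this line
        if pos + 1 < len(items):
            values.append(items[pos + 1])
    return values


# Hypothetical sample: two regular lines and two problem lines.
sample = [
    "we 2 rt 3 re 3 tr vh kn mo",
    "we 3 rt 5 re 5 tr yh kn me",
    "no keyword here",
    "ends with tr",
]
print(values_after(sample))  # ['vh', 'yh']
```

Because the function accepts any iterable of strings, an open file object can be passed in place of the list.
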
葬心 2024-10-29 05:08:00

You can split each line into the part before tr and the part after tr, then take the first word of the second part.

[ line.split(' tr ')[1].split()[0] for line in file ]

If a line contains more than one tr, this expression collects the word after the first one. Alternatively, this one collects the word after the last tr in each line:

[ line.split(' tr ')[-1].split()[0] for line in file ]

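One thing to watch with the two expressions above: on a line that contains no ' tr ' at all, the [1] form raises IndexError, while the [-1] form silently returns the first word of the line. The sketch below uses str.partition and str.rpartition to make the "not found" case explicit; the helper names first_after and last_after are illustrative, not from the original answer.

```python
def first_after(line, key="tr"):
    """Word following the first ' key ' in the line, or None if absent."""
    _, sep, after = line.partition(f" {key} ")
    words = after.split()
    return words[0] if sep and words else None


def last_after(line, key="tr"):
    """Word following the last ' key ' in the line, or None if absent."""
    _, sep, after = line.rpartition(f" {key} ")
    words = after.split()
    return words[0] if sep and words else None


line = "we 2 tr aa rt 3 tr bb kn mo"   # hypothetical line with two tr
print(first_after(line))               # aa
print(last_after(line))                # bb
print(first_after("we 2 rt 3"))        # None
```
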
北城半夏 2024-10-29 05:08:00

Your question is not quite clear. Is this what you are after?

[line.split()[7] for line in open("abc.txt")]

It returns the eighth "word" from every line.
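The one-liner above leaves the file open until it is garbage collected and fails on any line with fewer than eight words. A small sketch that closes the file and skips short lines might look like this; the function name eighth_words is made up, and the temporary file only stands in for abc.txt so the example is self-contained.

```python
import os
import tempfile


def eighth_words(path):
    """Return the eighth whitespace-separated word of every long-enough line."""
    with open(path) as f:
        rows = (line.split() for line in f)
        return [row[7] for row in rows if len(row) > 7]


# Stand-in for abc.txt, written to a temporary file for the demo.
sample = "we 2 rt 3 re 3 tr vh kn mo\nwe 3 rt 5 re 5 tr yh kn me\nshort line\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

result = eighth_words(tmp.name)
os.remove(tmp.name)
print(result)  # ['vh', 'yh']
```
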

淡水深流 2024-10-29 05:08:00

If I understand correctly, something like this should do the job (not tested):

resultArray = []
for aString in yourFile:
    anArray = aString.split()
    # stop one short of the end, in case tr is the last item in the array
    for i in range(len(anArray) - 1):
        if anArray[i] == 'tr':
            resultArray.append(anArray[i + 1])

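Unlike the index-based answers, the nested loop above collects the value after every tr on a line, not just the first. The same scan can be sketched by pairing each token with its successor using zip; the function name all_values_after and the sample lines are assumptions for illustration.

```python
def all_values_after(lines, key="tr"):
    """Collect the token following every occurrence of `key` on every line."""
    result = []
    for line in lines:
        tokens = line.split()
        # zip stops at the shorter sequence, so a trailing key is ignored.
        result.extend(nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == key)
    return result


sample = ["we 2 tr aa rt 3 tr bb", "x tr"]   # hypothetical input
print(all_values_after(sample))              # ['aa', 'bb']
```
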
安静 2024-10-29 05:08:00
from operator import itemgetter

# tr value is in the 8th column
tr = itemgetter(7)

# map() is lazy in Python 3, so wrap it in list() before printing
print(list(map(tr, (line.split() for line in myfile))))

扬花落满肩 2024-10-29 05:08:00

Wouldn't it be simpler to use a regex?

If the keywords 'we', 'rt', 're', 'tr' really are constant:

import re

ch = '''
we 2 rt 3 re 3 tr vh kn mo
we 3 rt 5 re 5 tr yh kn me
we 4 rt 6 re 33 tr ph kn m3
we 5 rt 9 re 34 tr oh kn me
we 6 rt 8 re 32 tr kh kn md'''

# capture the run of non-space characters preceded by ' tr '
print(re.findall(r'(?<= tr )([^ ]+)', ch))

If they are not, and position is the criterion for what to capture:

import re

ch = '''
we 2 rt 3 re 3 tr vh kn mo
we 3 rt 5 re 5 tr yh kn me
we 4 rt 6 re 33 tr ph kn m3
we 5 rt 9 re 34 tr oh kn me
we 6 rt 8 re 32 tr kh kn md'''

# skip three "word number" pairs and one more word, then capture the next token
print([mat.group(1)
       for mat in re.finditer(r'^(?:\w+ \d+ ){3}\w+ ([^ ]+) .+', ch, re.M)])

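One caveat with the lookbehind pattern: [^ ] also matches a newline, so when the tr value is the last token on a line the match runs on into the next line. The sketch below is an alternative pattern, not the answer author's. It uses \S+ for the value and limits the separator to spaces and tabs; the names TR_VALUE and tr_values are assumptions.

```python
import re

# "tr" as a whole word, then spaces/tabs on the same line, then one token.
TR_VALUE = re.compile(r"\btr\b[ \t]+(\S+)")


def tr_values(text):
    """Return every token that directly follows the word 'tr' on the same line."""
    return TR_VALUE.findall(text)


ch = """we 2 rt 3 re 3 tr vh kn mo
we 3 rt 5 re 5 tr yh kn me
we 4 rt 6 re 33 tr ph kn m3
we 5 rt 9 re 34 tr oh kn me
we 6 rt 8 re 32 tr kh kn md"""

print(tr_values(ch))                    # ['vh', 'yh', 'ph', 'oh', 'kh']
print(tr_values("a tr vh\nb tr yh"))    # ['vh', 'yh']
```
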
忆依然 2024-10-29 05:08:00

One may try the following:

def filter_words(filename, magic_word):
    with open(filename) as f:
        all_words = f.read().strip().split()
        filtered_words = []
        i = 0
        while True:
            try:
                # index() raises ValueError once no further magic_word is found
                i = all_words.index(magic_word, i) + 1
                filtered_words.append(all_words[i])
            except (IndexError, ValueError):
                # the exception types must be a parenthesized tuple to catch both
                break
        return filtered_words

This algorithm does not fail if 'tr' happens to be the last word in the provided text file.

Example:

>>> filter_words('abc.txt', 'tr')
['vh', 'yh', 'ph', 'oh', 'kh']
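Since filter_words treats the whole file as one stream of words, a tr at the end of one line would pick up the first word of the next line. If each line really does alternate keyword and value, as the sample data suggests (an assumption, not something the question states), a per-line parse into a dict avoids that. The names parse_pairs and column are hypothetical.

```python
def parse_pairs(line):
    """Turn 'we 2 rt 3 ...' into {'we': '2', 'rt': '3', ...}.

    Assumes tokens strictly alternate keyword, value; an unpaired
    trailing keyword is dropped by zip.
    """
    tokens = line.split()
    return dict(zip(tokens[0::2], tokens[1::2]))


def column(lines, key):
    """Collect the value stored under `key` on every line that has it."""
    return [rec[key] for rec in map(parse_pairs, lines) if key in rec]


sample = [
    "we 2 rt 3 re 3 tr vh kn mo",
    "we 3 rt 5 re 5 tr yh kn me",
    "we 4 rt 6 re 33 tr ph kn m3",
    "we 5 rt 9 re 34 tr oh kn me",
    "we 6 rt 8 re 32 tr kh kn md",
]
print(column(sample, "tr"))  # ['vh', 'yh', 'ph', 'oh', 'kh']
print(column(sample, "re"))  # ['3', '5', '33', '34', '32']
```

The same parsed records can answer queries for any other keyword without rescanning the text.
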