Deleting rows from a text file

Posted 2024-08-28 09:20:36 · 600 characters · 4 views · 0 comments

Here is a sample of the text file I have:

> 1 -4.6    -4.6    -7.6
> 
> 2 -1.7    -3.8    -3.1
> 
> 3 -1.6    -1.6    -3.1

The data in the text file is separated by tabs, and the first column indicates the position.

I need to iterate through every value in the text file apart from column 0 and find the lowest value.

Once the lowest value has been found, that value needs to be written to a new text file along with its column name and position. Column 0 has the name "position", column 1 "fifteen", column 2 "sixteen" and column 3 "seventeen".

For example, the lowest value in the data above is "-7.6", and it is in column 3, which has the name "seventeen". Therefore "7.6", "seventeen" and its position value, which in this case is 1, need to be written to the new text file.

I then need a number of rows deleted from the text file above.

E.g. the lowest value above is "-7.6"; it is found at position "1" and in column 3, which has the name "seventeen". I therefore need seventeen rows deleted from the text file, starting from and including position 1.

So the column in which the lowest value is found denotes the number of rows that need to be deleted, and the position at which it is found gives the start point of the deletion.

白日梦 2024-09-04 09:20:36

Open this file for reading, another file for writing, and copy all the lines that don't match the filter:

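# somepredicate is a placeholder: it should return True for lines to drop
# 'with' closes both files, even if an error occurs while copying
with open('somefile', 'r') as readfile, open('otherfile', 'w') as writefile:
    for line in readfile:
        if not somepredicate(line):
            writefile.write(line)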
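
For the data in the question, the filter could be a check on the position column. The following is only a sketch under assumptions: the names `make_position_filter` and `copy_without` are made up for illustration, the first column is taken to be an integer position, and a leading "> " (if it is really in the file) is ignored.

```python
def make_position_filter(start, count):
    """Return a predicate that is True for lines whose position (first
    column) lies in the range [start, start + count)."""
    def predicate(line):
        # drop a leading '> ' if present, then split on any whitespace
        fields = line.lstrip('> ').split()
        if not fields:
            return False  # blank lines never match, so they are kept
        return start <= int(fields[0]) < start + count
    return predicate


def copy_without(src_path, dst_path, predicate):
    """Copy src_path to dst_path, leaving out lines the predicate matches."""
    with open(src_path, 'r') as readfile, open(dst_path, 'w') as writefile:
        for line in readfile:
            if not predicate(line):
                writefile.write(line)
```

With this, `copy_without('somefile', 'otherfile', make_position_filter(1, 17))` would copy everything except positions 1 through 17.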
静水深流 2024-09-04 09:20:36

Here's a stab at what I think you wanted (though your requirements were kind of difficult to follow):

def extract_bio_data(input_path, output_path):
    # map column indexes (after popping the row number) to the number of rows to skip
    col_index = {0: 15,
                 1: 16,
                 2: 17}

    skip_to_position = 0
    # 'with' closes both files, even if an error occurs part-way through
    with open(input_path, 'r') as input_file, open(output_path, 'w') as output_file:
        # write the output file's headers
        output_file.write('\t'.join(('position', 'min_value', 'rows_skipped')) + '\n')

        for line in input_file:
            # remove the '> ' from the beginning of the line and strip newline characters off the end
            line = line[2:].strip()

            # if the line contains no data, skip it
            if line == '':
                continue

            # split the columns on whitespace (change this to split('\t') for splitting only on tabs)
            columns = line.split()

            # extract the row number/position of this data
            position = int(columns.pop(0))

            # this is where we skip rows/positions
            if position < skip_to_position:
                continue

            # if two columns share the minimum value, this will be the first encountered in the list
            min_index = columns.index(min(columns, key=float))

            # integer version of the 'column name', i.e. the number of rows that need to be skipped
            rows_to_skip = col_index[min_index]

            # write data to your new file (row number, minimum value, number of rows skipped)
            output_file.write('\t'.join(str(x) for x in (position, columns[min_index], rows_to_skip)) + '\n')

            # set the number of data rows to skip from this position
            skip_to_position = position + rows_to_skip


if __name__ == '__main__':
    in_path = r'c:\temp\test_input.txt'
    out_path = r'c:\temp\test_output.txt'
    extract_bio_data(in_path, out_path)

Things that weren't clear to me:

  1. Is there really "> " at the beginning of each line or is that a copy/paste error?
    • I assumed it wasn't an error.
  2. Did you want "7.6" or "-7.6" written to the new file?
    • I assumed you wanted the original value.
  3. Did you want to skip rows in the file? or positions based on the first column?
    • I assumed you wanted to skip positions.
  4. You say you want to delete data from the original file.
    • I assumed that skipping positions was sufficient.
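
If the answers to points 3 and 4 go the other way, i.e. the minimum is taken over the whole file and the original file really has to lose those rows, something along these lines might do. This is a sketch under assumptions, not a definitive implementation: the function names (`find_global_minimum`, `delete_positions`, `process`) are invented here, the "> " prefix is treated as a copy/paste artifact that is not in the file, rows are deleted by position value rather than by line count, and the original file is replaced by a filtered copy.

```python
import os
import tempfile

# Column names from the question, mapped to the number of rows they stand for.
COLUMN_NAMES = ['fifteen', 'sixteen', 'seventeen']
ROWS_FOR_COLUMN = {'fifteen': 15, 'sixteen': 16, 'seventeen': 17}


def find_global_minimum(path):
    """Return (value, column_name, position) for the lowest value in the file."""
    best = None
    with open(path, 'r') as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            position = int(fields[0])
            for name, text in zip(COLUMN_NAMES, fields[1:]):
                value = float(text)
                # strict '<' keeps the first occurrence when values tie
                if best is None or value < best[0]:
                    best = (value, name, position)
    return best


def delete_positions(path, start, count):
    """Rewrite path without the rows whose position is in [start, start + count)."""
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile('w', dir=directory, delete=False) as tmp:
        with open(path, 'r') as f:
            for line in f:
                fields = line.split()
                if fields and start <= int(fields[0]) < start + count:
                    continue  # this row falls inside the range to delete
                tmp.write(line)
    # swap the filtered copy in only after it has been written completely
    os.replace(tmp.name, path)


def process(data_path, result_path):
    """Record the global minimum, then delete the rows it points at."""
    found = find_global_minimum(data_path)
    if found is None:
        return None  # nothing to do for an empty file
    value, name, position = found
    with open(result_path, 'w') as out:
        out.write('\t'.join((str(value), name, str(position))) + '\n')
    delete_positions(data_path, position, ROWS_FOR_COLUMN[name])
    return found
```

Writing to a temporary file in the same directory and then calling `os.replace` means the original is only overwritten once the filtered copy is complete. Calling `process` repeatedly would keep extracting the next minimum from whatever rows remain.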