Text - Extracting sentences with certain characters from a text file in Python

Posted 2025-01-15 16:03:58, 526 characters, 6 views, 0 comments

I have a text file (call it main.txt) that contains content in multiple languages, and a character set text file that lists particular characters. For example, the character set text file contains:

a
b
c
d

I want to extract lines from main.txt. If a line contains any character other than a, b, c, d, it should not be extracted.

My code:

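character_set = ['a', 'b', 'c', 'd']
with open('main.txt', encoding='utf8') as main_file:
    for line in main_file:
        # current check: passes when at least one character is in the set
        if any(character in character_set for character in line):
            with open('text.txt', 'a+', encoding='utf8') as f:
                f.write(line)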

With this code, a line is extracted as soon as any one of the characters a, b, c, d is present in it.

Expected Output:

Do not extract lines that contain any character other than a, b, c, d.

So the logic I need is different. Please help me with this problem.
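
The gap between the observed and the expected behaviour comes down to `any()` versus `all()`. The following is only a minimal sketch to illustrate that difference; the helper names `has_any_allowed` and `has_only_allowed` and the sample strings are made up for the illustration and are not part of the original code.

```python
character_set = ['a', 'b', 'c', 'd']


def has_any_allowed(line):
    # True as soon as ONE character of the line is in the set
    # (this is what the code in the question checks).
    return any(character in character_set for character in line)


def has_only_allowed(line):
    # True only if EVERY character of the line is in the set
    # (this is the behaviour the question asks for).
    return all(character in character_set for character in line)


# Hypothetical sample lines, without trailing newlines.
for sample in ['abca', 'abxd', 'xyz']:
    print(sample, has_any_allowed(sample), has_only_allowed(sample))
```

Note that a line read from a file normally ends with `'\n'`, so the newline has to be stripped before a check like `has_only_allowed` is applied, otherwise it counts as a disallowed character.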

梦归所梦 2025-01-22 16:03:58

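character_set = ['a', 'b', 'c', 'd']
with open('main.txt', encoding='utf8') as in_file, \
        open('text.txt', 'a+', encoding='utf8') as out_file:
    for line in in_file:
        # keep the line only if every character (trailing newline aside) is allowed
        if all(character in character_set for character in line.rstrip('\n')):
            out_file.write(line)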

I haven't tested this yet. Hopefully it helps!

月下客 2025-01-22 16:03:58

Use sets:

character_set = {'a', 'b', 'c', 'd'}
# or even:
# character_set = set('abcd')

# ... open the file you read from and the one you write to.
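for line in in_file:
    # strip the trailing newline first, otherwise '\n' always counts as a disallowed character
    if not set(line.rstrip('\n')) - character_set:
        out_file.write(line)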
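
For completeness, here is a sketch of how the set-based check might be wired up end to end, including loading the character set from its own text file as described in the question. The function names `load_character_set` and `filter_lines` are hypothetical, and two choices are assumptions rather than anything stated in the thread: the output file is overwritten (`'w'`) instead of appended to, and empty lines are skipped.

```python
def load_character_set(path):
    """Return every character found in the file at `path`, newlines excluded."""
    with open(path, encoding='utf8') as f:
        allowed = set(f.read())
    allowed.discard('\n')
    return allowed


def filter_lines(in_path, out_path, character_set):
    """Copy to `out_path` the lines of `in_path` made only of allowed characters.

    Returns the number of lines written.
    """
    kept = 0
    with open(in_path, encoding='utf8') as in_file, \
            open(out_path, 'w', encoding='utf8') as out_file:
        for line in in_file:
            content = line.rstrip('\n')
            # issubset() is equivalent to `not set(content) - character_set`.
            # Assumption: empty lines are not worth extracting, so skip them.
            if content and set(content).issubset(character_set):
                out_file.write(content + '\n')
                kept += 1
    return kept
```

A call could then look like `filter_lines('main.txt', 'text.txt', load_character_set('charset.txt'))`, where `charset.txt` is a placeholder name for the character set file.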
