Removing specific punctuation characters in Python 2.x

Posted on 2024-12-27 02:33:54 | Words: 188 | Views: 0 | Comments: 0

I'm using Python v2.6 and have a string containing a number of punctuation characters I'd like to strip out. I've looked at using the string.punctuation constant, but I don't want to strip everything it lists: full stops and dashes must stay. In total there are only 5 punctuation marks I'd like to remove: ()\"'

Any suggestions? I'd like the most efficient approach.

Thanks

Comments (6)

凯凯我们等你回来 2025-01-03 02:33:54

You can use str.translate(table[, deletechars]) with table set to None, which will result in all characters from deletechars being removed from the string:

s.translate(None, r"()\"'")

Some examples:

>>> "\"hello\" '(world)'".translate(None, r"()\"'")
'hello world'
>>> "a'b c\"d e(f g)h i\\j".translate(None, r"()\"'")
'ab cd ef gh ij'
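
The two-argument call above applies to byte strings (plain str in Python 2.6). For unicode objects, and for str on Python 3, translate takes only a table, so a rough equivalent is a mapping from each unwanted code point to None. The names DELETE_TABLE and strip_unicode below are made up for illustration and are not part of the answer:

```python
# Hypothetical helper: map each unwanted code point to None so that
# translate() deletes it. Works for unicode in Python 2.6 and str in Python 3.
DELETE_TABLE = dict((ord(ch), None) for ch in u"()\\\"'")

def strip_unicode(text):
    """Return text with ( ) \\ " and ' removed."""
    return text.translate(DELETE_TABLE)

print(strip_unicode(u"\"hello\" '(world)'"))    # hello world
print(strip_unicode(u"a'b c\"d e(f g)h i\\j"))  # ab cd ef gh ij
```

Full stops and dashes are untouched because they never appear in the table.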
往日情怀 2025-01-03 02:33:54

You could make a list of all the characters you don't want:

unwanted = ['(', ')', '\\', '"', '\'']

Then you could make a function strip_punctuation(s) like so:

def strip_punctuation(s): 
    for u in unwanted: 
        s = s.replace(u, '')
    return s
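
A quick, self-contained check of the loop above, written as a sketch. The only change from the answer is that the list is passed as a default parameter (an assumption made here so the function does not depend on a module-level name):

```python
UNWANTED = ['(', ')', '\\', '"', "'"]

def strip_punctuation(s, unwanted=UNWANTED):
    # One replace() pass per unwanted character.
    for u in unwanted:
        s = s.replace(u, '')
    return s

# Dashes and full stops are left in place.
print(strip_punctuation("\"well-known\" (e.g. 'this')."))  # well-known e.g. this.
```

Each replace() call scans the whole string, so the cost grows with the number of unwanted characters; for five characters that is usually acceptable.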
∝单色的世界 2025-01-03 02:33:54
>>> import re
>>> r = re.compile(r"[()\\'\"]")
>>> r.sub("", "\"hello\" '(world)'\\\\\\")
'hello world'
靑春怀旧 2025-01-03 02:33:54

Using str.translate:

s = ''' abc(de)f\gh"i' '''
print(s.translate(None, r"()\"'"))
 # abcdefghi 

or re.sub:

import re
re.sub(r"[\\()'\"]",'',s)

but str.translate appears to be more than an order of magnitude faster (roughly 19x in this timing):

In [148]: %timeit (s*1000).translate(None, r"()\"'")
10000 loops, best of 3: 112 us per loop

In [146]: %timeit re.sub(r"[\\()'\"]",'',s*1000)
100 loops, best of 3: 2.11 ms per loop
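
To repeat the comparison outside IPython, something like the following sketch with the standard timeit module should work. It uses byte strings so that the same two-argument translate call runs on both Python 2.6 and Python 3; the names with_translate and with_regex are illustrative, and the numbers will differ from machine to machine:

```python
import re
import timeit

# Byte strings: on Python 2.6 these are ordinary str objects.
DELETE = b"()\\\"'"
PATTERN = re.compile(b"[()\\\\'\"]")
SAMPLE = b" abc(de)f\\gh\"i' " * 1000

def with_translate(data=SAMPLE):
    # table=None means "delete only", as in the answer above.
    return data.translate(None, DELETE)

def with_regex(data=SAMPLE):
    return PATTERN.sub(b"", data)

if __name__ == "__main__":
    for func in (with_translate, with_regex):
        seconds = timeit.timeit(func, number=1000)
        print("%s: %.4f s per 1000 calls" % (func.__name__, seconds))
```

Both functions return the same bytes, so only the elapsed time differs.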
尴尬癌患者 2025-01-03 02:33:54

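You can create a dict of all the characters you want replaced and map each one to the replacement of your choice (an empty string to delete it).

# The backslash must be escaped, and the double quote is written inside single quotes.
char_replace = {"'": "", "(": "", ")": "", "\\": "", '"': ""}

# `string` is the text to clean; iteritems() is Python 2 only.
for i, j in char_replace.iteritems():
    string = string.replace(i, j)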
ぶ宁プ宁ぶ 2025-01-03 02:33:54
my_string = r'''\(""Hello ''W\orld)'''
strip_chars = r'''()\'"'''

Using a generator expression:

''.join(x for x in my_string if x not in strip_chars)

Using filter:

''.join(filter(lambda x: x not in strip_chars, my_string))

Output:

Hello World
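
The same idea can be wrapped in a small function. This is only a sketch: the name strip_chars and the use of a frozenset for membership tests are choices made here, not part of the answer:

```python
STRIP_CHARS = frozenset("()\\'\"")

def strip_chars(text, chars=STRIP_CHARS):
    """Return text without any character found in chars."""
    return "".join(ch for ch in text if ch not in chars)

print(strip_chars(r'''\(""Hello ''W\orld)'''))  # Hello World
print(strip_chars("semi-final (round 2)."))     # semi-final round 2.
```

With only five characters the set makes little practical difference, but it keeps lookups cheap if the list of unwanted characters grows.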