How to replace all those special characters with white spaces in Python?

Posted on 2024-12-26 01:51:17

How do I replace all of these special characters with white space in Python?

I have a list of company names.

Example (myfiles.txt):

MY company.INC

Old Wine pvt

master-minds ltd

"apex-labs ltd"

"India-New corp"

Indo-American pvt/ltd

In the example above, I need every special character (-, ", /, .) in the file myfiles.txt replaced with a single white space, and the result saved to another text file, myfiles1.txt.

Can anyone please help me out?

Comments (5)

蓝色星空 2025-01-02 01:51:17

Assuming you want to replace everything that is not alphanumeric, you can do this on the command line:

sed 's/[^A-Za-z0-9]/ /g' foo.txt > bar.txt

Or in Python with the re module:

import re
with open('foo.txt') as infile:
    original_string = infile.read()
# Replace every character except letters, digits and newlines with a space
new_string = re.sub(r'[^a-zA-Z0-9\n]', ' ', original_string)
with open('bar.txt', 'w') as outfile:
    outfile.write(new_string)
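
If only the four characters listed in the question should change, rather than everything non-alphanumeric, the character class can name them directly. The following is a minimal sketch under that assumption; the names `SPECIALS` and `replace_listed_specials` are hypothetical and not part of the answer above.

```python
import re

# The four characters from the question. The hyphen comes first inside the
# class so that it is read literally instead of as a range operator.
SPECIALS = re.compile(r'[-"/.]')

def replace_listed_specials(text):
    """Replace each listed special character with one space."""
    return SPECIALS.sub(' ', text)

print(replace_listed_specials('Indo-American pvt/ltd'))  # Indo American pvt ltd
```

Because newlines are not in the class, the whole file content can be passed through the function in one call and the line structure is preserved.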
江城子 2025-01-02 01:51:17
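specials = '-"/.'  # characters to replace; extend as needed
# Python 3: str.maketrans (string.maketrans existed only in Python 2)
trans = str.maketrans(specials, ' ' * len(specials))
with open('myfiles.txt') as infile, open('myfiles1.txt', 'w') as outfile:
    for line in infile:
        outfile.write(line.translate(trans))

e.g.

>>> line = "Indo-American pvt/ltd"
>>> line.translate(trans)
'Indo American pvt ltd'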
奈何桥上唱咆哮 2025-01-02 01:51:17
import re
strs = "how much for the maple syrup? $20.99? That's ridiculous!!!"
strs = re.sub(r'[?$.!]', '', strs)         # remove specific special characters
strs = re.sub(r'[^a-zA-Z0-9 ]', '', strs)  # remove everything except letters, digits and spaces
strs = re.sub(r'[0-9]', '', strs)          # remove digits
strs = re.sub(r' +', ' ', strs)            # collapse runs of spaces into one
print(strs)

Output: how much for the maple syrup Thats ridiculous
苏大泽ㄣ 2025-01-02 01:51:17

While maketrans is the fastest way to do it, I never remember the syntax. Since speed is rarely an issue and I know regular expressions, I would tend to do this:

>>> line = "-[myfiles.txt] MY company.INC"
>>> import re
>>> re.sub(r'[^a-zA-Z0-9]', ' ',line)
'  myfiles txt  MY company INC'

This has the additional benefit of declaring the characters you accept instead of the ones you reject, which feels easier in this case.
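
The output above keeps one space per replaced character, so adjacent special characters leave doubled spaces. If the intent of the question is one space per run of special characters, one possible adjustment is to add `+` to the character class. This is a sketch, not part of the original answer, and the helper name `to_single_spaces` is hypothetical.

```python
import re

def to_single_spaces(line):
    """Replace each run of non-alphanumeric characters with one space."""
    # The trailing + makes the pattern match a whole run at once, so
    # "] " or "-[" becomes a single space instead of two.
    return re.sub(r'[^a-zA-Z0-9]+', ' ', line)

print(repr(to_single_spaces("-[myfiles.txt] MY company.INC")))
# ' myfiles txt MY company INC'
```

A leading or trailing space can still remain where the line starts or ends with a special character. A newline also counts as non-alphanumeric here, so either strip it before calling the helper or add `\n` to the class.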

Of course, if you are using non-ASCII characters, you'll have to go back to removing the characters you reject. If those are just punctuation marks, you can do:

>>> import string
>>> chars = re.escape(string.punctuation)
>>> re.sub(r'['+chars+']', ' ',line)
'  myfiles txt  MY company INC'

But you'll notice

好久不见√ 2025-01-02 01:51:17

At first I thought of providing a string.maketrans/translate example, but you may be using UTF-8 encoded strings, in which case the ord()-indexed translation table will blow up in your face, so I thought of another solution:

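conversion = '-"/.'  # characters to replace with a space
with open('myfiles.txt') as f:
    text = f.read()
newtext = ''
for c in text:
    # keep the character unless it is one of the listed specials
    newtext += ' ' if c in conversion else c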

It's not the fastest way, but easy to grasp and modify.

So if your text is not ASCII, you can decode both conversion and the text to unicode, do the replacement, and then re-encode the result in whichever encoding you want.
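
In Python 3 the decode and re-encode step is handled by `open()`: strings are already Unicode, so it is enough to name the file encoding. The sketch below assumes the files are UTF-8, which the question does not state. The function name `replace_in_file` and its parameters are hypothetical.

```python
def replace_in_file(src_path, dst_path, conversion='-"/.', encoding='utf-8'):
    """Copy src_path to dst_path, replacing each character in
    `conversion` with a single space."""
    # str.maketrans builds a table keyed by code point, so non-ASCII
    # text passes through translate() without any special handling.
    table = str.maketrans(conversion, ' ' * len(conversion))
    with open(src_path, encoding=encoding) as src, \
         open(dst_path, 'w', encoding=encoding) as dst:
        for line in src:
            dst.write(line.translate(table))
```

With the file names from the question, the call would be `replace_in_file('myfiles.txt', 'myfiles1.txt')`.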
