How do I replace all these special characters with spaces in Python?
I have a list of company names.
Ex: [myfiles.txt]
MY company.INC
Old Wine pvt
master-minds ltd
"apex-labs ltd"
"India-New corp"
Indo-American pvt/ltd
Here, as per the above example, I need all the special characters [-, ", /, .] in the file myfiles.txt to be replaced with a single space, and the result saved into another text file, myfiles1.txt.
Can anyone please help me out?
Assuming you mean to change everything non-alphanumeric, you can do this on the command line:
Or in Python with the `re` module:
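The snippet that originally followed is not preserved here, so this is only a minimal sketch of what a `re`-based version could look like. The function names `replace_non_alphanumeric` and `convert_file` are made up for illustration, and UTF-8 is an assumed file encoding; the file names come from the question.

```python
import re


def replace_non_alphanumeric(text):
    """Replace every character that is not an ASCII letter, a digit,
    or whitespace with a single space."""
    return re.sub(r"[^A-Za-z0-9\s]", " ", text)


def convert_file(src="myfiles.txt", dst="myfiles1.txt"):
    # File names are taken from the question; the encoding is an assumption.
    with open(src, encoding="utf-8") as infile, \
            open(dst, "w", encoding="utf-8") as outfile:
        for line in infile:
            outfile.write(replace_non_alphanumeric(line))


print(replace_non_alphanumeric("Indo-American pvt/ltd"))
```

Keeping `\s` in the character class leaves spaces and newlines alone, so the line structure of the file survives.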
e.g.
While maketrans is the fastest way to do it, I never remember the syntax. Since speed is rarely an issue and I know regular expressions, I tend to do this:
This has the additional benefit of declaring the characters you accept instead of the ones you reject, which feels easier in this case.
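The code this answer refers to is not shown, so the following is an illustrative sketch rather than the original. It contrasts the "accept" style (a negated character class) with the "reject" style (listing the four characters from the question); the variable names are hypothetical.

```python
import re

names = [
    "MY company.INC",
    "Old Wine pvt",
    "master-minds ltd",
    '"apex-labs ltd"',
    '"India-New corp"',
    "Indo-American pvt/ltd",
]

# "Accept" style: anything that is NOT a letter, digit, or space is replaced.
not_accepted = re.compile(r"[^A-Za-z0-9 ]")

# "Reject" style: only the characters listed in the question are replaced.
# A hyphen placed first in a character class is taken literally.
rejected = re.compile(r'[-"/.]')

whitelist_result = [not_accepted.sub(" ", name) for name in names]
blacklist_result = [rejected.sub(" ", name) for name in names]

for cleaned in whitelist_result:
    print(cleaned)
```

On this sample both lists come out identical; they only diverge once the input contains characters outside both sets, such as `&` or accented letters.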
Of course, if you are using non-ASCII characters, you'll have to go back to removing the characters you reject. If those are just punctuation signs, you can do:
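The original snippet is missing here as well. One possible way to do this, assuming "punctuation" means the ASCII set in `string.punctuation`, is sketched below; `strip_punctuation` is a name chosen for this example.

```python
import re
import string

# Build a character class from the ASCII punctuation characters.
# re.escape protects the ones that are special inside a class (], \, ^, -).
punctuation = re.compile("[" + re.escape(string.punctuation) + "]")


def strip_punctuation(text):
    """Replace each ASCII punctuation character with a space,
    leaving non-ASCII letters untouched."""
    return punctuation.sub(" ", text)


print(strip_punctuation("Café-Crème/ltd."))
```

Note that `string.punctuation` also contains characters such as `_` and `&`, so this rejects more than the four characters listed in the question.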
But you'll notice
At first I thought to provide a string.maketrans/translate example, but maybe you are using some UTF-8 encoded strings and the ord()-sorted translate table will blow up in your face, so I thought about another solution:
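The answer's own code did not survive, so what follows is a guess at its general shape, not a reconstruction. It assumes `conversion` (the name mentioned further down in the answer) is a string holding the characters to replace, and loops over it with `str.replace`; `replace_chars` is a hypothetical helper name.

```python
# Assumption: `conversion` lists the characters that should become spaces.
conversion = '-"/.'


def replace_chars(text, chars=conversion):
    """Replace each character found in `chars` with a single space."""
    for ch in chars:
        text = text.replace(ch, " ")
    return text


print(replace_chars("master-minds ltd"))
```

Each pass scans the whole string once per character in `conversion`, which is why it is slower than a translation table but trivial to change.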
It's not the fastest way, but easy to grasp and modify.
So if your text is non-ASCII, you could decode `conversion` and the text strings to unicode and afterwards re-encode in whichever encoding you want.
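On Python 3, where `str` is already Unicode, the decode and re-encode steps only apply at the byte boundary, and `str.translate` handles non-ASCII text directly. A small sketch under that assumption (the sample bytes, the target encoding, and the name `translate_chars` are invented for illustration):

```python
# Map each unwanted character to a space. str.maketrans accepts a dict
# whose keys are single characters.
table = str.maketrans({ch: " " for ch in '-"/.'})


def translate_chars(text):
    """Replace the mapped characters using a translation table."""
    return text.translate(table)


raw = "Zürich-Müller/ltd.".encode("utf-8")   # bytes as read from a file
text = raw.decode("utf-8")                   # decode to a Unicode str
cleaned = translate_chars(text)
reencoded = cleaned.encode("latin-1")        # re-encode in another encoding

print(cleaned)
```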