How can I remove duplicate words from a string with Python?
Consider the following example:
string1 = "calvin klein design dress calvin klein"
How can I remove the second occurrences of the duplicates "calvin" and "klein"?
The result should look like:
string2 = "calvin klein design dress"
Only the second occurrences should be removed, and the order of the words should not change!
You can use a set to keep track of already processed words.
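A minimal sketch of that idea, assuming words are separated by whitespace. The function name remove_duplicate_words is only illustrative:

```python
def remove_duplicate_words(text):
    """Keep the first occurrence of each word and drop later repeats."""
    seen = set()   # words already emitted
    kept = []      # first occurrences, in original order
    for word in text.split():
        if word not in seen:
            seen.add(word)
            kept.append(word)
    return " ".join(kept)


string1 = "calvin klein design dress calvin klein"
string2 = remove_duplicate_words(string1)
print(string2)  # calvin klein design dress
```

The set gives constant-time membership checks on average, while the list preserves the order in which words first appeared.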
Several answers are pretty close to this but haven't quite ended up where I did:
Of course, if you want it a tiny bit cleaner or faster, we can refactor a bit:
I think the second version is about as performant as you can get in a small amount of code. (More code could be used to do all the work in a single scan across the input string but for most workloads, this should be sufficient.)
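For what it's worth, a tighter version along these lines could look like the sketch below. The name unique_words_compact and the seen_add binding are my own illustration, not necessarily the code this answer had in mind:

```python
def unique_words_compact(text):
    """Order-preserving de-duplication in a single generator expression."""
    seen = set()
    seen_add = seen.add  # bind once to avoid repeated attribute lookups
    # set.add returns None, so "not seen_add(w)" is always True;
    # it is only there to record w as a side effect.
    return " ".join(
        w for w in text.split() if w not in seen and not seen_add(w)
    )


print(unique_words_compact("calvin klein design dress calvin klein"))
# calvin klein design dress
```

This still splits the string first and then walks the word list, so it is two passes over the input rather than a single scan.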
Question: Remove the duplicates in a string
Use a numpy function.
When importing numpy, it is better to give the import an alias (such as np),
and then you can call it like this.
To remove duplicates from an array, you can use it this way.
For your case, if you want the result as a string, join the unique words back into a string.
To remove duplicate words from a sentence while preserving the order of the words, you can use the dict.fromkeys method.
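A short sketch of the dict.fromkeys approach, assuming Python 3.7 or later, where dicts are guaranteed to keep insertion order:

```python
string1 = "calvin klein design dress calvin klein"

# dict.fromkeys builds a dict whose keys are the words; a repeated key
# is stored only once, at the position of its first occurrence.
string2 = " ".join(dict.fromkeys(string1.split()))

print(string2)  # calvin klein design dress
```

On older Python versions a plain dict does not guarantee order, so this one-liner would not be reliable there.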
11 and 2 work perfectly:
and 2
You can remove duplicate or repeated words from a text file or string using the following code:
P.S. Adjust the indentation as required.
Hope this helps!
Without using the split function (will help in interviews)
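One way this could be done is to scan the string character by character and build each word by hand. This is only a sketch; the function name is made up for illustration, and it treats any whitespace character as a separator:

```python
def remove_duplicate_words_no_split(text):
    """Remove repeated words without calling str.split()."""
    seen = set()
    kept = []
    word = ""
    # A trailing space makes the loop flush the final word as well.
    for ch in text + " ":
        if ch.isspace():
            if word and word not in seen:
                seen.add(word)
                kept.append(word)
            word = ""  # start collecting the next word
        else:
            word += ch
    return " ".join(kept)


print(remove_duplicate_words_no_split("calvin klein design dress calvin klein"))
# calvin klein design dress
```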
Initializing the list
Using set() and split()
Result
OR THIS
OR THIS
You can do that simply by getting the set associated to the string, which is a mathematical object containing no repeated elements by definition. It suffices to join the words in the set back into a string:
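A sketch of the set approach follows. Note that a Python set has no defined order, so the words can come back in a different order than in the original string, which does not meet the question's requirement to keep the sequence unchanged:

```python
string1 = "calvin klein design dress calvin klein"

unique_words = set(string1.split())  # duplicates collapse, order is lost
joined = " ".join(unique_words)

# The exact word order of `joined` can vary, so only the set of words
# is checked here.
print(sorted(joined.split()))  # ['calvin', 'design', 'dress', 'klein']
```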
This sorts the set of all the (unique) words in your string by the word's index in the original list of words.
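The sort described above can be sketched like this, using list.index as the sort key:

```python
string1 = "calvin klein design dress calvin klein"
words = string1.split()

# words.index(w) returns the position of the FIRST occurrence of w,
# so sorting the unique words by it restores the original order.
string2 = " ".join(sorted(set(words), key=words.index))

print(string2)  # calvin klein design dress
```

Each words.index call is a linear search, so this is quadratic in the number of words in the worst case. That is fine for short strings but slower than tracking seen words in a set for long ones.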
In Python 2.7+, you could use collections.OrderedDict for this:
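A minimal sketch with OrderedDict.fromkeys, written here for Python 3:

```python
from collections import OrderedDict

string1 = "calvin klein design dress calvin klein"

# OrderedDict remembers the order in which keys were first inserted,
# and a repeated key does not move or duplicate the existing entry.
string2 = " ".join(OrderedDict.fromkeys(string1.split()))

print(string2)  # calvin klein design dress
```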
Cut and paste from the itertools recipes.
I really wish they would go ahead and make a module out of those recipes soon. I'd very much like to be able to do
from itertools_recipes import unique_everseen
instead of cutting and pasting every time I need something.
Use it like this:
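For reference, here is a sketch of unique_everseen adapted for Python 3, along the lines of the recipe in the itertools documentation. The exact wording of the recipe varies between Python versions, so treat this as an approximation rather than the official text:

```python
from itertools import filterfalse


def unique_everseen(iterable, key=None):
    """Yield unique elements in order, remembering everything seen so far."""
    seen = set()
    if key is None:
        # filterfalse skips elements that are already in `seen`.
        for element in filterfalse(seen.__contains__, iterable):
            seen.add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen.add(k)
                yield element


string1 = "calvin klein design dress calvin klein"
string2 = " ".join(unique_everseen(string1.split()))
print(string2)  # calvin klein design dress
```

The optional key argument allows, for example, case-insensitive de-duplication with key=str.lower.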
Explanation:
.split() - a method that splits a string into a list (with no arguments it splits on whitespace)
set() - an unordered collection type that excludes duplicates
'separator'.join(list) - joins the elements of the list into a single string, with 'separator' between the elements
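The three pieces above can be tried one step at a time. The sample text and the ", " separator below are just for illustration, and sorted() is used only to make the printed result predictable, since a set has no order of its own:

```python
text = "calvin  klein design\tdress calvin klein"

# 1. split() with no arguments splits on any run of whitespace.
words = text.split()
print(words)
# ['calvin', 'klein', 'design', 'dress', 'calvin', 'klein']

# 2. set() drops the duplicates but does not keep the original order.
unique = set(words)

# 3. 'separator'.join(...) glues the items together with the separator.
joined = ", ".join(sorted(unique))
print(joined)  # calvin, design, dress, klein
```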