Finding repetitions in a string by length
I have a string of letters similar to the one shown below:
'ABTSOFDNSOHASAPMAPDSNFAKSGMOMAPEPTNSNTROMAPKSDFANSDHASOMAPDODDFG'
I am treating this as a ciphertext, so I want to find the positions of repeated substrings in order to work out the length of the encryption key (the example above is random, so no direct answers will come from it).
For now, I want to write code that can find repetitions of length 3; for example, 'MAP' and 'HAS' are repeated. I want the code to find these repetitions itself, rather than me having to specify the substring it should look for.
Previously I have used:
text.find("MAP")
Using the answer below I have written:
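from collections import Counter

# Collect every overlapping substring of length 4
substring = []
for i in range(len(Phrase) - 3):
    substring.append(Phrase[i:i+4])
# freq was not defined in the original snippet; assumed to be a count of each substring
freq = Counter(substring)
for index, value in freq.items():  # items() replaces Python 2's iteritems()
    if value > 1:
        # Prints the repeated substring once per occurrence
        for i in range(len(Phrase) - 3):
            if index == Phrase[i:i+4]:
                print(index)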
This prints each repeated substring once for every time it appears. Ideally, I want just a list of the repeated substrings together with the positions at which each one appears.
Comments (2)
Here is a solution using only built-ins
Make a function that will produce overlapping chunks of three - inspired by the pairwise function.
Make a dictionary with the position(s) of each chunk.
Filter the dictionary for chunks that have more than one position.
Example:
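A minimal sketch of the three steps above, not necessarily the answerer's original code. The names `chunks` and `repeated_positions` are hypothetical, and the chunk length `n` is assumed to default to 3 as in the question:

```python
def chunks(text, n=3):
    """Yield (position, chunk) for every overlapping chunk of length n."""
    # zip over n shifted copies of the text, in the spirit of pairwise;
    # zip stops at the shortest copy, so only full-length chunks are produced
    shifted = (text[k:] for k in range(n))
    for position, letters in enumerate(zip(*shifted)):
        yield position, "".join(letters)


def repeated_positions(text, n=3):
    """Return {chunk: [positions]} for chunks that occur more than once."""
    positions = {}
    for position, chunk in chunks(text, n):
        # setdefault keeps this to built-ins only (no collections import)
        positions.setdefault(chunk, []).append(position)
    # Keep only the chunks seen at more than one position
    return {chunk: where for chunk, where in positions.items() if len(where) > 1}


text = 'ABTSOFDNSOHASAPMAPDSNFAKSGMOMAPEPTNSNTROMAPKSDFANSDHASOMAPDODDFG'
for chunk, where in repeated_positions(text).items():
    print(chunk, where)
```

For the string in the question this reports, among other repeats, 'MAP' at positions 15, 28, 40 and 55 and 'HAS' at positions 10 and 51 (positions are zero-based). Passing `n=4` would look for repeats of length 4 instead.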