Is there an efficient algorithm to output all strings stored in a lexicographically sorted list that are permutations of an input string?
I would like to find the most efficient algorithm for this problem: given a string str and a list of strings lst that consists of only lowercase English characters and is sorted lexicographically, find all the words in lst that are a permutation of str.
For example: str = "cat", lst = {"aca", "acc", "act", "cta", "tac"} would return {"act", "cta", "tac"}.
I already have an algorithm that doesn't take advantage of the fact that lst is lexicographically ordered, and I am looking for the most efficient algorithm that takes advantage of this fact.
My algorithm goes like this:
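    import java.util.ArrayList;
    import java.util.List;

    // Returns every word in lst that is a permutation of str.
    public List<String> getPermutations(String str, List<String> lst) {
        List<String> res = new ArrayList<>();
        for (String word : lst)
            if (checkPermutation(word, str))
                res.add(word);
        return res;
    }

    // Two words are permutations of each other iff their letter counts match.
    public boolean checkPermutation(String word1, String word2) {
        if (word1.length() != word2.length())
            return false;
        int[] count = new int[26];
        int i;
        // Count letters of word1 up and letters of word2 down in one pass.
        for (i = 0; i < word1.length(); i++) {
            count[word1.charAt(i) - 'a']++;
            count[word2.charAt(i) - 'a']--;
        }
        // Any non-zero entry means the letter multisets differ.
        for (i = 0; i < 26; i++)
            if (count[i] != 0) {
                return false;
            }
        return true;
    }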
The total runtime is O(NK), where N is the number of strings in lst and K is the length of str.
Comments (2)
One simple optimisation (it only becomes meaningful for really large data sets, as it doesn't really improve on O(NK)):
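- Put all the characters of str into a Set<Character> strChars.
- For each list entry, take a character charFromListEntry (for example its first character) and test strChars.contains(charFromListEntry): only if that succeeds, check whether the entry is a permutation.
Note: the sorted ordering doesn't help much here, because you still have to check the first char of the next string from your list.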
There might be other checks to avoid the costly checkPermutation() run, for example first comparing the lengths of the words: when the list string is shorter than the input string, it obviously can't be a permutation of all chars.
But as said, in the end you have to iterate over all entries in your list and determine whether an entry is a permutation. There is no way of avoiding the corresponding "looping". The only thing you can affect is the cost that occurs within your loop.
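The cheap pre-checks described above could be sketched roughly as follows. This is only an illustration, assuming lowercase a-z input; the class name FirstCharFilter and the helper isPermutation are hypothetical and not part of the original post.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: reject list entries cheaply before the full count check.
class FirstCharFilter {

    static List<String> getPermutations(String str, List<String> lst) {
        // Collect the distinct characters of str once, up front.
        Set<Character> strChars = new HashSet<>();
        for (char c : str.toCharArray()) {
            strChars.add(c);
        }

        List<String> res = new ArrayList<>();
        for (String word : lst) {
            // Cheap check 1: a permutation must have the same length.
            if (word.length() != str.length()) {
                continue;
            }
            // Cheap check 2: the first character must occur in str.
            if (!word.isEmpty() && !strChars.contains(word.charAt(0))) {
                continue;
            }
            // Only now pay for the full O(K) letter count comparison.
            if (isPermutation(word, str)) {
                res.add(word);
            }
        }
        return res;
    }

    // Same counting idea as checkPermutation in the question (assumes a-z only).
    static boolean isPermutation(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        int[] count = new int[26];
        for (int i = 0; i < a.length(); i++) {
            count[a.charAt(i) - 'a']++;
            count[b.charAt(i) - 'a']--;
        }
        for (int n : count) {
            if (n != 0) {
                return false;
            }
        }
        return true;
    }
}
```

The loop still visits every entry, so the bound stays O(NK); the filter only lowers the constant for entries that fail early.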
Finally: if your list of strings were a Set, you could "simply" compute all permutations of your incoming str and check for each permutation whether it is contained in that Set. But of course, in order to turn a list into a set, you have to iterate over it.
Instead of iterating over the list and checking each element for being a permutation of your string, you can iterate over all permutations of the string and check whether each one is present in the list using binary search.
E.g.
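A minimal sketch of this idea might look like the following. It is not the answerer's code: the class name PermutationBinarySearch and the helper nextPermutation are hypothetical, and it assumes lst is sorted in natural String order and contains no duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: enumerate distinct permutations, binary-search each one.
class PermutationBinarySearch {

    static List<String> getPermutations(String str, List<String> lst) {
        char[] chars = str.toCharArray();
        // Starting from the sorted order makes nextPermutation visit every
        // distinct permutation exactly once, in lexicographic order.
        Arrays.sort(chars);

        List<String> res = new ArrayList<>();
        do {
            String candidate = new String(chars);
            // Collections.binarySearch returns a non-negative index on a hit.
            if (Collections.binarySearch(lst, candidate) >= 0) {
                res.add(candidate);
            }
        } while (nextPermutation(chars));
        return res;
    }

    // Rearranges a into its lexicographic successor; returns false if a is
    // already the last permutation.
    static boolean nextPermutation(char[] a) {
        int i = a.length - 2;
        while (i >= 0 && a[i] >= a[i + 1]) {
            i--;
        }
        if (i < 0) {
            return false;
        }
        int j = a.length - 1;
        while (a[j] <= a[i]) {
            j--;
        }
        char tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
        // Reverse the suffix after position i to get the smallest successor.
        for (int l = i + 1, r = a.length - 1; l < r; l++, r--) {
            tmp = a[l];
            a[l] = a[r];
            a[r] = tmp;
        }
        return true;
    }
}
```

Because candidates are generated in lexicographic order, the result comes out sorted as well.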
Now, the time complexity is O(K! × log N), which is not necessarily better than the O(NK) of your approach. It heavily depends on the magnitude of K and N. If the string is really short and the list really large, it may have an advantage.
There are a lot of optimizations imaginable. E.g. instead of constructing each permutation and then running a binary search, each recursion step could do a partial search to identify the potential search range for the next step and skip when it's clear that the permutations can't be contained. While this could raise the performance significantly, it can't change the fundamental time complexity, i.e. the worst case.
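One possible reading of that partial-search idea is sketched below, as an illustration only. The recursion extends a prefix one character at a time and uses binary search to narrow the index range of list entries that share the prefix, abandoning a branch as soon as the range is empty. The class name PrunedPermutationSearch and its helpers are hypothetical; the sketch assumes lowercase a-z strings and a list sorted in natural String order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: build permutations by prefix and prune empty ranges.
class PrunedPermutationSearch {

    static List<String> getPermutations(String str, List<String> lst) {
        int[] count = new int[26];
        for (char c : str.toCharArray()) {
            count[c - 'a']++;
        }
        List<String> res = new ArrayList<>();
        search(lst, count, str.length(), 0, 0, lst.size(), res);
        return res;
    }

    // Invariant: all entries in [lo, hi) share the same first `depth` characters,
    // and those characters use up the letters already removed from `count`.
    private static void search(List<String> lst, int[] count, int k,
                               int depth, int lo, int hi, List<String> res) {
        if (depth == k) {
            // Entries equal to the prefix sort first within the range.
            for (int i = lo; i < hi && lst.get(i).length() == k; i++) {
                res.add(lst.get(i));
            }
            return;
        }
        for (int c = 0; c < 26 && lo < hi; c++) {
            if (count[c] == 0) {
                continue;
            }
            char ch = (char) ('a' + c);
            int from = lowerBound(lst, lo, hi, depth, ch);
            int to = lowerBound(lst, from, hi, depth, (char) (ch + 1));
            if (from < to) {
                count[c]--;
                search(lst, count, k, depth + 1, from, to, res);
                count[c]++;
            }
            // Entries before `to` cannot match any larger character.
            lo = to;
        }
    }

    // First index in [lo, hi) whose character at `pos` is >= ch; entries too
    // short to have a character at `pos` count as smaller.
    private static int lowerBound(List<String> lst, int lo, int hi, int pos, char ch) {
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            String s = lst.get(mid);
            if (s.length() <= pos || s.charAt(pos) < ch) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

In this sketch a branch is only followed while at least one list entry carries its prefix, so the number of explored prefixes is limited by what the list actually contains rather than by K! alone.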