Optimizing the time complexity of a prefix search between two lists

Posted on 2025-02-14 01:04:32


I'm looking to refactor some legacy code, and I have a method that's functionally similar to the following simplified version:

public static List<String> getAllPrefixedCodes(List<String> codes, List<String> prefixes) {
   var prefixedCodes = new ArrayList<String>();
   for (String code : codes) {
      for (String prefix : prefixes) {
         if (code.startsWith(prefix)) {
            prefixedCodes.add(code);
         }
      }
   }

   return prefixedCodes;
}

I was looking for ways to increase the speed of the method, and came across tries in another Stack Overflow post. After doing some research, I thought they were cool, so I reimplemented the method as follows:

public static List<String> getAllPrefixedCodes(Collection<String> codes, Collection<String> prefixes) {
   final var trie = codes.stream()
                    .collect(Collectors.toMap(Function.identity(), Function.identity(), 
                       (a, b) -> a, PatriciaTrie::new));

   return prefixes.parallelStream()
             .map(trie::prefixMap)
             .map(SortedMap::values)
             .flatMap(Collection::stream)
             .collect(Collectors.toList());
}

The method is much simpler to look at, in my opinion, and that's already a plus for me, but I'm second-guessing whether it's actually an improvement in time complexity. My first instinct is that it's gone from O(n^2) to O(n). After all, the second method takes O(n) to load the trie, then O(m) to search the prefixes, with an O(n) lookup thanks to the trie (from the Apache docs).
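To make the per-prefix cost easier to reason about, here is a minimal sketch that uses only the JDK, with a sorted `TreeSet` standing in for the trie. It is not the `PatriciaTrie` implementation and its costs are not claimed to match it; the class name `SortedPrefixSketch` is hypothetical. The idea it illustrates is that each prefix lookup only touches the codes that actually match (plus one seek), rather than scanning all n codes. Like the `toMap` version above, it keeps each distinct code once and groups results by prefix.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

final class SortedPrefixSketch {

    static List<String> getAllPrefixedCodes(Collection<String> codes, Collection<String> prefixes) {
        // Sorting the codes once costs O(n log n) comparisons; duplicates collapse.
        NavigableSet<String> sorted = new TreeSet<>(codes);

        List<String> result = new ArrayList<>();
        for (String prefix : prefixes) {
            // All strings starting with `prefix` form one contiguous run that begins
            // at the first element >= prefix, so we can stop at the first non-match.
            for (String code : sorted.tailSet(prefix, true)) {
                if (!code.startsWith(prefix)) {
                    break;
                }
                result.add(code);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [AB12, AB12, AC34]
        System.out.println(getAllPrefixedCodes(
                List.of("AB12", "AC34", "ZZ99", "AB12"), List.of("AB", "A")));
    }
}
```

Under these assumptions the work per prefix is one seek into the sorted set plus the number of matches, so the total depends on the size of the output rather than on n times m.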

But in the second stream, I'm doing an O(n) lookup O(m) times, so I'm hitting O(n^2) anyway, correct? Is there a more efficient or intelligent way to perform this style of operation?
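One possible alternative, sketched here as an illustration rather than a definitive answer, is to invert the roles: build a small trie from the prefixes and walk each code down it once. The class `PrefixTrieSketch` and its `Node` type are hypothetical names, and the `HashMap` children assume expected constant-time lookups. Building costs time proportional to the total number of prefix characters, and each code is examined for at most its own length, plus the cost of writing the output. The sketch deliberately mirrors the original nested-loop behavior, adding a code once per matching prefix and keeping the order of `codes`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PrefixTrieSketch {

    // One node per character position shared by the prefixes.
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        int prefixesEndingHere = 0; // how many prefixes end exactly at this node
    }

    static List<String> getAllPrefixedCodes(List<String> codes, List<String> prefixes) {
        // Build the trie from the prefixes: proportional to total prefix length.
        Node root = new Node();
        for (String prefix : prefixes) {
            Node node = root;
            for (int i = 0; i < prefix.length(); i++) {
                node = node.children.computeIfAbsent(prefix.charAt(i), c -> new Node());
            }
            node.prefixesEndingHere++;
        }

        List<String> result = new ArrayList<>();
        for (String code : codes) {
            Node node = root;
            int matches = root.prefixesEndingHere; // empty prefixes match every code
            // Walk the code down the trie until it falls off or runs out of characters.
            for (int i = 0; i < code.length() && node != null; i++) {
                node = node.children.get(code.charAt(i));
                if (node != null) {
                    matches += node.prefixesEndingHere;
                }
            }
            // Mirror the nested-loop version: one entry per matching prefix.
            for (int k = 0; k < matches; k++) {
                result.add(code);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [AB12, AB12, AC34]
        System.out.println(getAllPrefixedCodes(
                List.of("AB12", "AC34", "ZZ99"), List.of("A", "AB", "Q")));
    }
}
```

If each code should appear at most once, the inner counting loop could be replaced by a single `result.add(code)` as soon as the first match is found, which also lets the walk stop early.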
