How to avoid multiple Streams with Java 8

Posted on 2025-01-19 18:34:10 · 702 words · 4 views · 0 comments

I have the following code:

trainResponse.getIds().stream()
        .filter(id -> id.getType().equalsIgnoreCase("Company"))
        .findFirst()
        .ifPresent(id -> {
            domainResp.setId(id.getId());
        });

trainResponse.getIds().stream()
        .filter(id -> id.getType().equalsIgnoreCase("Private"))
        .findFirst()
        .ifPresent(id ->
            domainResp.setPrivateId(id.getId())
        );

Here I'm iterating over (streaming) the list of Id objects twice.

The only difference between the two streams is the filter() operation.

How can I achieve this in a single iteration, and what is the best approach (in terms of time and space complexity)?

Comments (7)

不喜欢何必死缠烂打 2025-01-26 18:34:10

You can achieve that with the Stream API in one pass through the given data set and without increasing memory consumption (i.e. the result will contain only the ids that have the required attributes).

For that, you can create a custom Collector that takes as its parameters a Collection of attributes to look for and a Function responsible for extracting the attribute from a stream element.

This is how such a generic collector could be implemented.

import java.util.*;
import java.util.function.*;
import java.util.stream.Collector;

/**
 * @param <T> - the type of stream elements
 * @param <F> - the type of the key (a field of the stream element)
 */
class CollectByKey<T, F> implements Collector<T, Map<F, T>, Map<F, T>> {
    private final Set<F> keys;
    private final Function<T, F> keyExtractor;
    
    public CollectByKey(Collection<F> keys, Function<T, F> keyExtractor) {
        this.keys = new HashSet<>(keys);
        this.keyExtractor = keyExtractor;
    }
    
    @Override
    public Supplier<Map<F, T>> supplier() {
        return HashMap::new;
    }
    
    @Override
    public BiConsumer<Map<F, T>, T> accumulator() {
        return this::tryAdd;
    }
    
    private void tryAdd(Map<F, T> map, T item) {
        F key = keyExtractor.apply(item);
        if (keys.remove(key)) {
            map.put(key, item);
        }
    }
    
    @Override
    public BinaryOperator<Map<F, T>> combiner() {
        return this::tryCombine;
    }
    
    private Map<F, T> tryCombine(Map<F, T> left, Map<F, T> right) {
        right.forEach(left::putIfAbsent);
        return left;
    }
    
    @Override
    public Function<Map<F, T>, Map<F, T>> finisher() {
        return Function.identity();
    }
    
    @Override
    public Set<Characteristics> characteristics() {
        return Collections.emptySet();
    }
}

main() - demo (dummy Id class is not shown)

public class CustomCollectorByGivenAttributes {
    public static void main(String[] args) {
        List<Id> ids = List.of(new Id(1, "Company"), new Id(2, "Fizz"),
                               new Id(3, "Private"), new Id(4, "Buzz"));
        
        Map<String, Id> idByType = ids.stream()
                .collect(new CollectByKey<>(List.of("Company", "Private"), Id::getType));
        
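        // domainResp is the response object from the question; it is not defined in this demo
        idByType.forEach((k, v) -> {
            if (k.equalsIgnoreCase("Company")) domainResp.setId(v.getId());
            if (k.equalsIgnoreCase("Private")) domainResp.setPrivateId(v.getId());
        });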
    
        System.out.println(idByType.keySet()); // printing keys - added for demo purposes
    }
}

Output

[Company, Private]

Note that once the set of keys becomes empty (i.e. all the required data has been found), further stream elements are ignored, but the remaining elements still have to be processed.
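One thing to keep in mind is that the collector above removes entries from its own keys set while accumulating, so a given instance should only be used for one sequential stream. As a rough sketch of the same idea without that mutable state, the collector can be built with Collector.of. The names FirstByKeyDemo, firstByKey and demo, as well as the Id record (Java 16+), are hypothetical stand-ins chosen for illustration.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collector;

class FirstByKeyDemo {
    // Hypothetical stand-in for the Id class, which is not shown above.
    record Id(int id, String type) {}

    // Collects the first element seen for each wanted key; other elements are skipped.
    static <T, F> Collector<T, ?, Map<F, T>> firstByKey(Collection<F> keys,
                                                        Function<T, F> keyExtractor) {
        Set<F> wanted = new HashSet<>(keys); // only read after this point
        return Collector.of(
                HashMap::new,
                (Map<F, T> map, T item) -> {
                    F key = keyExtractor.apply(item);
                    if (wanted.contains(key)) {
                        map.putIfAbsent(key, item); // keep the first match only
                    }
                },
                (left, right) -> {
                    right.forEach(left::putIfAbsent); // prefer entries already in left
                    return left;
                });
    }

    static Map<String, Id> demo() {
        List<Id> ids = List.of(new Id(1, "Company"), new Id(2, "Fizz"),
                               new Id(3, "Private"), new Id(4, "Company"));
        return ids.stream().collect(firstByKey(List.of("Company", "Private"), Id::type));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // map with one entry per wanted type
    }
}
```

Like the class-based version, this still visits every element of the stream; it only avoids keeping state in the collector itself.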

洒一地阳光 2025-01-26 18:34:10

IMO, the two streams solution is the most readable. And it may even be the most efficient solution using streams.

IMO, the best way to avoid multiple streams is to use a classical loop. For example:

// There may be bugs ...

boolean seenCompany = false;
boolean seenPrivate = false;
for (Id id: getIds()) {
   if (!seenCompany && id.getType().equalsIgnoreCase("Company")) {
      domainResp.setId(id.getId());
      seenCompany = true;
   } else if (!seenPrivate && id.getType().equalsIgnoreCase("Private")) {
      domainResp.setPrivateId(id.getId());
      seenPrivate = true;
   }
   if (seenCompany && seenPrivate) {
      break;
   }
}

It is unclear whether performing one iteration is more efficient than performing two. It will depend on the class returned by getIds() and on the iteration code.

The complicated stuff with the two flags is how you replicate the short-circuiting behavior of findFirst() in your two-stream solution. I don't know if it is possible to do that at all using one stream. If it is, it will involve some pretty cunning code.

But as you can see, your original solution with two streams is clearly easier to understand than the above.
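To make the short-circuiting concrete, here is a minimal sketch that counts how many elements a sequential filter().findFirst() pipeline actually visits. The class name ShortCircuitDemo, the helper visitedUntilFirst, and the use of plain type strings instead of Id objects are assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class ShortCircuitDemo {
    // Returns how many elements the pipeline looked at before findFirst() stopped.
    static int visitedUntilFirst(List<String> types, String wanted) {
        AtomicInteger visited = new AtomicInteger();
        types.stream()
                .peek(t -> visited.incrementAndGet())   // count every element pulled through
                .filter(t -> t.equalsIgnoreCase(wanted))
                .findFirst();                           // stops at the first match
        return visited.get();
    }

    public static void main(String[] args) {
        List<String> types = List.of("Company", "Fizz", "Private", "Buzz");
        System.out.println(visitedUntilFirst(types, "Company")); // prints 1
        System.out.println(visitedUntilFirst(types, "Private")); // prints 3
    }
}
```

With this input the two streams together look at four elements, the same number a single full pass over the list would, which is one reason the two-stream version is not necessarily slower.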


The main point of using streams is to make your code simpler. It is not about efficiency. When you try to do complicated things to make the streams more efficient, you are probably defeating the (true) purpose of using streams in the first place.

你是我的挚爱i 2025-01-26 18:34:10

For your list of ids, you could just use a map, then assign them after retrieving, if present.

Map<String, Integer> seen = new HashMap<>();

for (Id id : ids) {
    if (seen.size() == 2) {
        break;
    }
    seen.computeIfAbsent(id.getType().toLowerCase(), v->id.getId());
}

If you want to test it, you can use the following:

record Id(String getType, int getId) {
    @Override
    public String toString() {
        return String.format("[%s,%s]", getType, getId);
    }
}

Random r = new Random();
List<Id> ids = r.ints(20, 1, 100)
        .mapToObj(id -> new Id(
                r.nextBoolean() ? "Company" : "Private", id))
        .toList();

Edited to allow only certain types to be checked

If you have more than two types but only want to check on certain ones, you can do it as follows.

  • The process is the same, except that you have a Set of allowed types.
  • You simply check that you are processing one of those types by using contains.

Map<String, Integer> seen = new HashMap<>();

Set<String> allowedTypes = Set.of("company", "private");
for (Id id : ids) {
    String type = id.getType().toLowerCase();

    if (allowedTypes.contains(type)) {
        if (seen.size() == allowedTypes.size()) {
            break;
        }
        // use the lower-cased type as the key so "Company" and "company" share one entry
        seen.computeIfAbsent(type,
                v -> id.getId());
    }
    }
}

Testing is similar except that additional types need to be included.

  • Create a list of the types that could be present.
  • Build the list of ids from them as before.
  • Note that in the loop, the size of the allowed-types set replaces the value 2, so that more than two types can be checked before exiting the loop.

List<String> possibleTypes = 
      List.of("Company", "Type1", "Private", "Type2");
Random r = new Random();
List<Id> ids =
        r.ints(30, 1, 100)
                .mapToObj(id -> new Id(possibleTypes.get(
                        r.nextInt((possibleTypes.size()))),
                        id))
                .toList();
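The "assign them after retrieving, if present" step could look roughly like the sketch below. The Id record, the DomainResp class with Integer fields, and the method names firstIdPerType and toResponse are hypothetical stand-ins; the real classes from the question may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class SeenMapDemo {
    // Hypothetical stand-ins for the classes used in the question.
    record Id(String type, int id) {}

    static class DomainResp {
        Integer id;
        Integer privateId;
        void setId(Integer id) { this.id = id; }
        void setPrivateId(Integer privateId) { this.privateId = privateId; }
    }

    // First id per allowed type; keys are lower-cased so the lookup is case-insensitive.
    static Map<String, Integer> firstIdPerType(List<Id> ids, Set<String> allowedTypes) {
        Map<String, Integer> seen = new HashMap<>();
        for (Id id : ids) {
            if (seen.size() == allowedTypes.size()) {
                break; // every allowed type has been found
            }
            String type = id.type().toLowerCase(Locale.ROOT);
            if (allowedTypes.contains(type)) {
                seen.putIfAbsent(type, id.id());
            }
        }
        return seen;
    }

    static DomainResp toResponse(List<Id> ids) {
        Map<String, Integer> seen = firstIdPerType(ids, Set.of("company", "private"));
        DomainResp resp = new DomainResp();
        // Assign only the values that were actually found.
        Optional.ofNullable(seen.get("company")).ifPresent(resp::setId);
        Optional.ofNullable(seen.get("private")).ifPresent(resp::setPrivateId);
        return resp;
    }

    public static void main(String[] args) {
        DomainResp resp = toResponse(List.of(
                new Id("Type1", 5), new Id("Company", 7),
                new Id("PRIVATE", 3), new Id("Company", 9)));
        System.out.println(resp.id + " " + resp.privateId); // prints "7 3"
    }
}
```

A type that never occurs in the list simply leaves the corresponding field null, which matches the ifPresent behavior of the original two-stream code.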

梦归所梦 2025-01-26 18:34:10

You can group by type and check the resulting map.
I suppose the type of ids is IdType.

Map<String, List<IdType>> map = trainResponse.getIds()
                                .stream()
                                .collect(Collectors.groupingBy(
                                                     id -> id.getType().toLowerCase()));

Optional.ofNullable(map.get("company")).ifPresent(ids -> domainResp.setId(ids.get(0).getId()));
Optional.ofNullable(map.get("private")).ifPresent(ids -> domainResp.setPrivateId(ids.get(0).getId()));
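Since only the first element of each group is used, a possible variation is to give groupingBy a downstream collector that keeps just that first element instead of building a full list. This is a sketch under assumptions: the IdType record (Java 16+) and the names GroupFirstDemo, firstPerType and firstId are hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class GroupFirstDemo {
    // Hypothetical stand-in for the element type of getIds().
    record IdType(String type, String id) {}

    // Groups by lower-cased type and keeps only the first element of each group.
    static Map<String, Optional<IdType>> firstPerType(List<IdType> ids) {
        return ids.stream()
                .collect(Collectors.groupingBy(
                        id -> id.type().toLowerCase(Locale.ROOT),
                        Collectors.reducing((IdType first, IdType second) -> first)));
    }

    // Looks up the first id for a type; empty if that type never occurred.
    static Optional<String> firstId(List<IdType> ids, String type) {
        return firstPerType(ids)
                .getOrDefault(type.toLowerCase(Locale.ROOT), Optional.empty())
                .map(IdType::id);
    }

    public static void main(String[] args) {
        List<IdType> ids = List.of(new IdType("Company", "c1"),
                                   new IdType("private", "p1"),
                                   new IdType("COMPANY", "c2"));
        System.out.println(firstId(ids, "company").orElse("none")); // prints "c1"
        System.out.println(firstId(ids, "private").orElse("none")); // prints "p1"
    }
}
```

This still walks the whole list and creates an entry for every type, not just the two of interest, so it trades the short-circuiting of findFirst() for a single pass.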
唐婉 2025-01-26 18:34:10

I'd recommend a traditional for loop. In addition to being easy to extend, it prevents you from traversing the collection multiple times.
Your code looks like something that will be generalised in the future, hence my generic approach.

Here's some pseudocode (with errors, just for the sake of illustration):

Set<String> matches = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
for (Id id : trainResponse.getIds()) {

    if (! matches.add(id.getType())) {
        continue;
    }

    switch (id.getType().toLowerCase()) {

        case "company":
            domainResp.setId(id.getId());
            break;

        case "private":
            ...
    }
}
如梦亦如幻 2025-01-26 18:34:10

Something along these lines might work. It would go through the whole stream, though, and wouldn't stop at the first occurrence.
But assuming a small stream and only one Id for each type, why not?

Map<String, Consumer<String>> setters = new HashMap<>();
setters.put("Company", domainResp::setId);
setters.put("Private", domainResp::setPrivateId);

trainResponse.getIds().forEach(id -> {
    if (setters.containsKey(id.getType())) {
        setters.get(id.getType()).accept(id.getId());
    }
});
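If there can be several ids of the same type, the forEach above ends up applying the last one. One way to keep the first occurrence and stop early, sketched here under assumptions, is to remove each setter from the map once it has been used. The Id record, the DomainResp class and the name SetterMapDemo are hypothetical stand-ins.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

class SetterMapDemo {
    // Hypothetical stand-ins for the classes used in the question.
    record Id(String type, String id) {}

    static class DomainResp {
        String id;
        String privateId;
        void setId(String id) { this.id = id; }
        void setPrivateId(String privateId) { this.privateId = privateId; }
    }

    static DomainResp fill(List<Id> ids) {
        DomainResp resp = new DomainResp();
        // Case-insensitive keys mirror the equalsIgnoreCase checks in the question.
        Map<String, Consumer<String>> pending = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        pending.put("Company", resp::setId);
        pending.put("Private", resp::setPrivateId);

        for (Id id : ids) {
            // remove() returns the setter only the first time a type is seen
            Consumer<String> setter = pending.remove(id.type());
            if (setter != null) {
                setter.accept(id.id());
            }
            if (pending.isEmpty()) {
                break; // both values found, no need to look further
            }
        }
        return resp;
    }

    public static void main(String[] args) {
        DomainResp resp = fill(List.of(new Id("company", "a"), new Id("Company", "b"),
                                       new Id("Other", "x"), new Id("PRIVATE", "p")));
        System.out.println(resp.id + " " + resp.privateId); // prints "a p"
    }
}
```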
独﹏钓一江月 2025-01-26 18:34:10

From Java 9 onwards, we can use Collectors.filtering to collect values based on a condition.

For this scenario, I changed the code as follows:

final Map<String, String> results = trainResponse.getIds()
            .stream()
            .collect(Collectors.filtering(
                id -> id.getType().equals("Company") || id.getType().equals("Private"),
                Collectors.toMap(Id::getType, Id::getId, (first, second) -> first)));

Then get the ids from the results map.
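A minimal runnable sketch of this approach, including the lookup at the end, might look as follows. It requires Java 9+ for Collectors.filtering (and Java 16+ for the record); the Id record and the names FilteringDemo and firstIdByType are hypothetical stand-ins.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

class FilteringDemo {
    // Hypothetical stand-in for the Id class used in the question.
    record Id(String type, String id) {}

    // Keeps the first id for each wanted type; other types are filtered out.
    static Map<String, String> firstIdByType(List<Id> ids, Set<String> wantedTypes) {
        return ids.stream()
                .collect(Collectors.filtering(
                        (Id id) -> wantedTypes.contains(id.type()),
                        Collectors.toMap(Id::type, Id::id, (first, second) -> first)));
    }

    public static void main(String[] args) {
        Map<String, String> results = firstIdByType(
                List.of(new Id("Company", "c1"), new Id("Other", "x"),
                        new Id("Private", "p1"), new Id("Company", "c2")),
                Set.of("Company", "Private"));

        // A type that never occurred has no entry, so guard each lookup.
        Optional.ofNullable(results.get("Company"))
                .ifPresent(id -> System.out.println("company id: " + id));
        Optional.ofNullable(results.get("Private"))
                .ifPresent(id -> System.out.println("private id: " + id));
    }
}
```

The merge function (first, second) -> first is what keeps the first id when a type occurs more than once. Note that the comparison here is case-sensitive, unlike the equalsIgnoreCase checks in the question.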
