图算法 - 寻求提高可扩展性

发布于 2024-08-03 14:34:04 字数 1669 浏览 9 评论 0原文

我编写了一个算法来计算和存储 DAG 的所有路径，并且它在小图上运行得很好 - 但现在我正在寻求提高它在较大图上运行的效率。该算法的核心逻辑在下面的 createSF() 和 makePathList() 中，其他方法是帮助器 - 我可以看到追加是一个瓶颈。然而，我想最大的帮助是设计一个可以在字典中存储路径的数据结构，因为许多路径是由其他路径组成的，这是我问题的关键。

private Multiset<String> paths = new Multiset<String>();    

public Multiset<String> createSF(DAGNode n) {

    for (DAGNode succ : n.getSuccessors())
        createSF(succ);
    if (!n.isVisited())
        for (String s : makePathList(n)) 
            paths.put(s);

    n.setVisited(true);
    return paths;
}

private List<String> makePathList(DAGNode n) {
    List<String> list = new ArrayList<String>();

    list.add(n.getLabel());
    for (DAGNode node : n.getSuccessors())
        list.addAll(append(n.getLabel(), makePathList(node)));

return list;
}

private List<String> append(String s, List<String> src) {
    List<String> ls = new ArrayList<String>();
    for (String str : src) 
    ls.add(s + "/" + str);

    return ls;
}

编辑：

我现在使用路径对象来表示路径，并确定了这两种方法的瓶颈：

public List<Path> createPathList(Tree n) {
    List<Path> list = new ArrayList<Path>();
    list.add(new Path(n.getNodeName()));
    for (Tree node : n.getSuccessors()) {
        list.addAll(append(n.getNodeName(), createPathList(node)));
    }
    return list;
}

public List<Path> append(String s, List<Path> src) {
    List<Path> ls = new ArrayList<Path>();
    for (Path path : src) {
        ls.add(new Path(path, s));
    }
    return ls;
}

问题是当图形大小为 M 时，这些方法将被调用 M 次，这意味着有很多列表在这里创建...是否有更有效的方法来构建 createPathList() 的返回？

原文

I've written an algorithm which calculates and stores all paths of a DAG, and it works nicely on small graphs - but now i'm looking to improve it's efficiency to run over larger graphs. The core logic of the algorithm is in createSF() and makePathList() below, the other methods are helpers - I can see that append is a bottleneck. However, I guess the biggest help would be to devise a data structure that can store paths in a dictionary, since many of the paths are composed of other paths, this is the crux of my question.

private Multiset<String> paths = new Multiset<String>();    

public Multiset<String> createSF(DAGNode n) {

    for (DAGNode succ : n.getSuccessors())
        createSF(succ);
    if (!n.isVisited())
        for (String s : makePathList(n)) 
            paths.put(s);

    n.setVisited(true);
    return paths;
}

private List<String> makePathList(DAGNode n) {
    List<String> list = new ArrayList<String>();

    list.add(n.getLabel());
    for (DAGNode node : n.getSuccessors())
        list.addAll(append(n.getLabel(), makePathList(node)));

return list;
}

private List<String> append(String s, List<String> src) {
    List<String> ls = new ArrayList<String>();
    for (String str : src) 
    ls.add(s + "/" + str);

    return ls;
}

EDIT:

I'm now using a path object to represent paths and have pin-pointed the bottle neck to these two methods:

public List<Path> createPathList(Tree n) {
    List<Path> list = new ArrayList<Path>();
    list.add(new Path(n.getNodeName()));
    for (Tree node : n.getSuccessors()) {
        list.addAll(append(n.getNodeName(), createPathList(node)));
    }
    return list;
}

public List<Path> append(String s, List<Path> src) {
    List<Path> ls = new ArrayList<Path>();
    for (Path path : src) {
        ls.add(new Path(path, s));
    }
    return ls;
}

Trouble is when a graph is size M these methods will be called M times, this means there is a lot of lists being created here... is there a more efficient way to build up the return for createPathList()?

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

萌能量女王 2024-08-10 14:34:04

为了回答这个问题，有必要了解为什么需要路径列表。路径列表不会为您提供有关 DAG 表示形式的任何附加信息。

如果您想分别计算每个路径的内容，或者计算所有路径的总和/最小值/最大值之类的内容，也可以使用 DAG 本身来完成。

如果您确实坚持保存单独的路径，一种选择是将 DAG 转换为 Trie< 的变体/a>.另一种选择可能是使用 Lempel- 的某些变体Ziv 表示。这取决于您的 DAG 类型以及您希望如何处理路径信息。

回复收藏 0 原文

分开我的手 2024-08-10 14:34:04

看一下 Graphviz 的 DOT 源代码，它可能会给你一些想法。

回复收藏 0 原文

毁梦 2024-08-10 14:34:04

请允许我首先提出两条（希望有帮助的）评论：

我在理解您的代码时遇到了一些困难，因为某些方法名称误导了我。从名字上看，我期待的是别的东西。我可以建议一些重构吗：（

makePathList -> createPathList  // you actually create a List here
append -> createPathList // yes, same name as above because it creates the same
                         // type of list, just with different parameters

删除了在 Robert 编辑后变得过时的部分答案）

正如 Margus 所说，用 StringBuilder 附加链替换字符串连接并不会提高性能。无论如何，编译器可能会优化 String 串联到 StringBuilder 附加（我见过这样的字节代码）。

您可以尝试将 DAG 转换为树结构。引入一个不可见的根，将所有节点作为直接子节点。现在，为每个节点添加它的后继节点（子节点和/或同级节点）。现在叶子的数量应该等于路径的数量，并且从根到任何叶子的每个图都是 DAG 中的一条路径。

编辑

一个小改进 - 它是微优化，但至少会留下更少的垃圾：

private List<String> append(String node, List<String> allPathsStartingAfterNode) {
    List<String> allPathsStartingAtNode = new ArrayList<String>();
    String nodeWithSeparator = node + "/";

    for (String aPathStartingAfterNode : allPathsStartingAfterNode) {
        allPathsStartingAtNode.add(nodeWithSeparator + aPathStartingAfterNode);
    }

    return allPathsStartingAtNode;
}

Please allow me to put two (hopefully helpful) comments first:

I had some difficulties understanding your code, because some of the method names misled me. From looking at the names, I expected something else. May I suggest some refactoring:

makePathList -> createPathList  // you actually create a List here
append -> createPathList // yes, same name as above because it creates the same
                         // type of list, just with different parameters

(removed part of the answer that became obsolete after Robert's edit)

As Margus said, replacing the String concatenation with a StringBuilder append chain doesn't increase your performance. Compilers may optimize String concatenations to StringBuilder appends anyway (I've seen such byte code).

You could try to convert the DAG into a tree structure. Introduce an invisible root with all nodes as direct children. Now for each node add it's successors (child and/or sibling). The number of leaves now should be equal to the number of path and every graph from the root to any leaf is one path in the DAG.

Edit

A small improvement - it's micro-optimization but at least it will leave less garbage:

private List<String> append(String node, List<String> allPathsStartingAfterNode) {
    List<String> allPathsStartingAtNode = new ArrayList<String>();
    String nodeWithSeparator = node + "/";

    for (String aPathStartingAfterNode : allPathsStartingAfterNode) {
        allPathsStartingAtNode.add(nodeWithSeparator + aPathStartingAfterNode);
    }

    return allPathsStartingAtNode;
}

回复收藏 0 原文