Python 的层次结构遍历和比较模块?
我在日常开发中处理很多层次结构。 文件系统、Autodesk Maya 中的嵌套 DAG 节点等。
我想知道,Python 是否有专门设计用于遍历和比较对象层次结构的好模块?
特别令人感兴趣的是在两个几乎相同的层次结构之间进行“模糊”比较的方法。 这样做的部分原因是为了匹配 Maya 中来自两个不同角色的两个节点层次结构,以便将动画从一个角色传输到另一个角色。
根据我一直在阅读的内容,我可能需要带有名称阈值的东西(我可以自己构建)来比较两个节点名称彼此的接近程度。 然后,我需要一种方法来选择性地忽略子节点在层次结构中出现的顺序。 最后,我需要处理深度阈值,以防节点可能在层次结构中稍微向上或向下移动。
I deal with a lot of hierarchies in my day to day development. File systems, nested DAG nodes in Autodesk Maya, etc.
I'm wondering, are there any good modules for Python specifically designed to traverse and compare hierarchies of objects?
Of particular interest would be ways to do 'fuzzy' comparisons between two nearly identical hierarchies. Some of the reasons for doing this would be for matching two node hierarchies in Maya from two different characters in order to transfer animation from one to the other.
Based on what I've been reading, I'd probably need something with a name threshold (which I could build myself) for comparing how close two node names are to each other. I'd then need a way to optionally ignore the order that child nodes appear in the hierarchy. Lastly, I'd need to deal with a depth threshold, in cases where a node may have been slightly moved up or down the hierarchy.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我不确定我是否需要一个完整的模块——层次结构是一种设计模式,每个层次结构都有足够的独特功能,很难概括。
我发现这就是我所需要的。 我发现很难从中创建一个可重用的模块,因为(a)这里的模块太少了,(b)每个应用程序都添加或更改了太多的代码。
此外,我发现最常用的层次结构是文件系统,为此我有 os 模块。 第二个最常用的层次结构是 XML 消息,我有 ElementTree(通常通过 lxml)。 在这两个之后,我使用上述结构作为我的类的模板,而不是作为字面可重用模块。
I'm not sure I see the need for a complete module -- hierarchies are a design pattern, and each hierarchy has enough unique features that it's hard to generalize.
I find that this is all I need. And I've found that it's hard to make a reusable module out of this because (a) there's so little here and (b) each application adds or changes so much code.
Further, I find that the most commonly used hierarchy is the file system, for which I have the
os
module. The second most commonly used hierarchy is XML messages, for which I have ElementTree (usually via lxml). After those two, I use the above structures as templates for my classes, not as a literal reusable module.我建议深入研究 xmldifff http://www.logilab.org/859 并查看它们如何比较节点并处理并行树。 或者,尝试编写一个[递归]生成器,生成树中的每个[重要]节点,例如
f(t)
,然后使用itertools.izip(f(t1),f(t2) ))
将节点对收集在一起进行比较。我处理的大多数层次结构都有多个“轴”,就像 XML 中的元素和属性,并且某些节点比其他节点更重要。
对于更奇怪的解决方案,将两棵树序列化为文本文件,并注明行 #n 来自树中的节点 #x。 对两棵树执行此操作,将文件输入 diff,然后扫描结果以注意树的哪些部分已更改。 您可以映射文件 1 中的行 #n(因此第一棵树中的节点 #x)和文件 2 中的行 #m(因此第二棵树的节点 #y)意味着每棵树的某些部分是相同的或不同的。
对于任何解决方案,您都必须建立树的“规范形式”,该形式可能会从比较过程中删除所有可忽略的空白、显示属性、可选节点等。 它还可能意味着对树进行广度优先与深度优先的遍历。
I recommend digging around xmldifff http://www.logilab.org/859 and seeing how they compare nodes and handle parallel trees. Or, try writing a [recursive] generator that yields each [significant] node in a tree, say
f(t)
, then useitertools.izip(f(t1),f(t2))
to collect together pairs of nodes for comparison.Most of the hierarchical structures I deal with have more than one "axis", like elements and attributes in XML, and some nodes are more significant than others.
For a more bizarre solution, serialize the two trees to text files, make a referential note that line #n comes from node #x in a tree. Do that to both trees, feed the files into diff, and scan the results to notice which parts of the tree have changed. You can map that line #n from file 1 (and therefore node #x in the first tree) and line #m from file 2 (and therefore node #y of the second tree) mean that some part of each tree is the same or different.
For any solution your are going to have to establish a "canonical form" of your tree, one that might drop all ignorable whitespace, display attributes, optional nodes, etc., from the comparison process. It might also mean doing a breadth first vs. depth first traversal of the tree(s).
http://code.google.com/p/pytree/
这些可能有点矫枉过正完全适合您的需求:
http://networkx.lanl.gov/
http://www.osl.iu.edu/~dgregor/bgl-python/
http://code.google.com/p/pytree/
these maybe overkill or not suited at all for what you need:
http://networkx.lanl.gov/
http://www.osl.iu.edu/~dgregor/bgl-python/