Recursive diff of two dictionaries (keys and values)?
So I have a Python dictionary, call it d1, and a version of that dictionary at a later point in time, call it d2. I want to find all the changes between d1 and d2. In other words, everything that was added, removed or changed. The tricky bit is that the values can be ints, strings, lists, or dicts, so it needs to be recursive. This is what I have so far:
def dd(d1, d2, ctx=""):
    print "Changes in " + ctx
    for k in d1:
        if k not in d2:
            print k + " removed from d2"
    for k in d2:
        if k not in d1:
            print k + " added in d2"
            continue
        if d2[k] != d1[k]:
            if type(d2[k]) not in (dict, list):
                print k + " changed in d2 to " + str(d2[k])
            else:
                if type(d1[k]) != type(d2[k]):
                    print k + " changed to " + str(d2[k])
                    continue
                else:
                    if type(d2[k]) == dict:
                        dd(d1[k], d2[k], k)
                        continue
    print "Done with changes in " + ctx
    return
It works just fine unless the value is a list. I can't quite come up with an elegant way to deal with lists without repeating a huge, slightly changed version of this function after an if type(d2) == list check.
Any thoughts?
EDIT: This differs from this post because the keys can change.
Comments (10)
In case you want the difference recursively, I have written a package for Python:
https://github.com/seperman/deepdiff
Installation
Install from PyPI: pip install deepdiff
Example usage
Importing: from deepdiff import DeepDiff
Same object returns empty
Type of an item has changed
Value of an item has changed
Item added and/or removed
String difference
String difference 2
Type change
List difference
List difference 2:
List difference ignoring order or duplicates: (with the same dictionaries as above)
List that contains dictionary:
Sets:
Named Tuples:
Custom objects:
Object attribute added:
Here's an implementation inspired by Winston Ewert
will return:
One option would be to convert any list you run into to a dictionary with the index as the key. For example:
Here is the output with the sample dictionaries you gave in comments:
Note that this will compare index by index, so it will need some modification to work well for list items being added or removed.
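A minimal sketch of the list-to-dictionary idea follows. It is not the code from this answer: the helper name list_to_dict and the choice to collect messages in a list instead of printing them are assumptions made for illustration, and it targets Python 3.

```python
def list_to_dict(items):
    """Turn a list into a dict keyed by stringified index (hypothetical helper)."""
    return {str(i): item for i, item in enumerate(items)}


def dd(d1, d2, ctx=""):
    """Collect human-readable change messages between two dicts."""
    changes = []
    for k in d1:
        if k not in d2:
            changes.append("%s%s removed" % (ctx, k))
    for k in d2:
        path = "%s%s" % (ctx, k)
        if k not in d1:
            changes.append(path + " added")
        elif d1[k] != d2[k]:
            old, new = d1[k], d2[k]
            if isinstance(old, list) and isinstance(new, list):
                # Lists are compared index by index via their dict form.
                old, new = list_to_dict(old), list_to_dict(new)
            if isinstance(old, dict) and isinstance(new, dict):
                changes.extend(dd(old, new, path + "."))
            else:
                changes.append("%s changed to %r" % (path, d2[k]))
    return changes
```

Because the comparison is positional, inserting an item at the front of a list reports every later index as changed, which is the limitation noted above.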
Just a thought: You could try an object-oriented approach where you derive your own dictionary class that keeps track of any changes made to it (and reports them). Seems like this might have many advantages over trying to compare two dicts...one is noted at the end.
To show how that might be done, here's a reasonably complete and minimally tested sample implementation which should work with both Python 2 and 3:
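As a rough illustration of the idea (not the answerer's original class), a change-tracking dict could be sketched as below. The class name TrackingDict and the methods changes and clear_changes are hypothetical; only the _changelist attribute name comes from the text.

```python
class TrackingDict(dict):
    """A dict that records additions, changes, and removals (hypothetical name)."""

    def __init__(self, *args, **kwargs):
        super(TrackingDict, self).__init__(*args, **kwargs)
        self._changelist = []  # history of (action, key, value) tuples

    def __setitem__(self, key, value):
        action = "changed" if key in self else "added"
        # Re-assigning an identical value is not recorded as a change.
        if action == "added" or self[key] != value:
            self._changelist.append((action, key, value))
        super(TrackingDict, self).__setitem__(key, value)

    def __delitem__(self, key):
        super(TrackingDict, self).__delitem__(key)  # raises KeyError if absent
        self._changelist.append(("removed", key, None))

    def changes(self):
        """Return a copy of the recorded history."""
        return list(self._changelist)

    def clear_changes(self):
        """Forget the recorded history."""
        del self._changelist[:]
```

This sketch only intercepts item assignment and del. Methods such as update(), pop() and setdefault() bypass __setitem__ and __delitem__ on a dict subclass, so a fuller version would need to override them too.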
Note that unlike a simple comparison of the before and after state of a dictionary, this class will tell you about keys which were added and then deleted. In other words, it keeps a complete history until its _changelist is cleared. Output:
As suggested by Serge, I found this solution helpful for getting a quick boolean answer on whether two dictionaries match "all the way down":
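A minimal sketch of such a boolean check, assuming Python 3. For plain dicts and lists, a == b already compares recursively, so an explicit function like this is mainly useful when the leaf comparison needs customizing. The function name deep_equal and the rel_tol parameter (a float tolerance) are hypothetical and not part of the original answer.

```python
import math


def deep_equal(a, b, rel_tol=0.0):
    """Return True if a and b match all the way down (hypothetical helper)."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(
            deep_equal(a[k], b[k], rel_tol) for k in a
        )
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(
            deep_equal(x, y, rel_tol) for x, y in zip(a, b)
        )
    if isinstance(a, float) or isinstance(b, float):
        try:
            # With rel_tol=0.0 this is an exact comparison.
            return math.isclose(a, b, rel_tol=rel_tol)
        except TypeError:  # e.g. comparing a float with a string
            return False
    return a == b
```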
Your function should begin by checking the types of its arguments. Write it so that it can handle lists, dictionaries, ints, and strings. That way you don't have to duplicate anything; you just call it recursively.
Pseudocode:
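One way the type-dispatching approach could be turned into runnable Python 3 is sketched below. The function name compare, the path argument, and the (path, description) result format are illustrative assumptions, not the answerer's code.

```python
def compare(old, new, path="root"):
    """Return a list of (path, description) pairs describing differences."""
    if isinstance(old, dict) and isinstance(new, dict):
        changes = []
        for key in old:
            if key not in new:
                changes.append(("%s[%r]" % (path, key), "removed"))
        for key in new:
            child = "%s[%r]" % (path, key)
            if key not in old:
                changes.append((child, "added"))
            else:
                changes.extend(compare(old[key], new[key], child))
        return changes
    if isinstance(old, list) and isinstance(new, list):
        changes = []
        shared = min(len(old), len(new))
        for i in range(shared):
            changes.extend(compare(old[i], new[i], "%s[%d]" % (path, i)))
        for i in range(shared, len(old)):
            changes.append(("%s[%d]" % (path, i), "removed"))
        for i in range(shared, len(new)):
            changes.append(("%s[%d]" % (path, i), "added"))
        return changes
    # Scalars (ints, strings) or mismatched types: compare directly.
    if old != new:
        return [(path, "changed from %r to %r" % (old, new))]
    return []
```

The same function handles every level, so dictionaries inside lists and lists inside dictionaries need no special casing.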
Consider using hasattr(obj, '__iter__') as you recurse through the object. If an object implements the __iter__ method, you know you can iterate over it.
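A small sketch of how that check might drive a recursive walk, assuming Python 3. There, str and bytes also define __iter__, so they are excluded explicitly. The names is_container and walk are hypothetical.

```python
def is_container(obj):
    """True for iterables we want to recurse into, excluding text and bytes."""
    return hasattr(obj, "__iter__") and not isinstance(obj, (str, bytes))


def walk(obj, path=()):
    """Yield (path, leaf) pairs for every non-container value inside obj."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            for item in walk(value, path + (key,)):
                yield item
    elif is_container(obj):
        for index, value in enumerate(obj):
            for item in walk(value, path + (index,)):
                yield item
    else:
        yield path, obj
```

Flattening both dictionaries with dict(walk(d1)) and dict(walk(d2)) reduces the recursive diff to comparing two flat mappings of path to value. Empty containers produce no entries in this sketch, and sets are enumerated in arbitrary order.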
It is fun to build something yourself to practice and learn, yet I find that for non-trivial tasks, ready-made and maintained packages often work better.
Consider converting to JSON and using a decent "semantic" JSON comparator, such as https://www.npmjs.com/package/compare-json or the online tool at http://jsondiff.com. Numeric keys would need to be stringified.
If you really need to, you could try translating jsondiff to Python.
Conversion from JavaScript to Python code?
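For a quick look without leaving Python, a textual stand-in can be sketched with the standard library: serialize both dictionaries to canonical JSON and diff the lines. This is not one of the semantic comparators mentioned above, and the function names json_lines and json_diff are made up for this example.

```python
import difflib
import json


def json_lines(obj):
    """Serialize obj to canonical, pretty-printed JSON and split into lines."""
    # sort_keys makes key order irrelevant; non-string keys such as ints
    # are converted to strings by json.dumps.
    return json.dumps(obj, sort_keys=True, indent=2).splitlines()


def json_diff(old, new):
    """Return unified-diff lines between the JSON forms of old and new."""
    return list(
        difflib.unified_diff(
            json_lines(old), json_lines(new), "old", "new", lineterm=""
        )
    )
```

Note that sort_keys raises TypeError in Python 3 when one dictionary mixes key types that cannot be ordered against each other (for example ints and strings), so such keys would have to be stringified first.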
You can try the following simple implementation
Here's a sample, which can easily be extended to handle other Python data types too:
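A sketch of what an extensible version might look like, assuming Python 3. It returns a nested description instead of printing, and adding another data type means adding another isinstance branch. The function name diff and the result layout are assumptions made for illustration.

```python
def diff(old, new):
    """Return a nested description of how new differs from old, or None if equal."""
    if old == new:
        return None
    if isinstance(old, dict) and isinstance(new, dict):
        result = {}
        for key in set(old) | set(new):
            if key not in new:
                result[key] = ("removed", old[key])
            elif key not in old:
                result[key] = ("added", new[key])
            else:
                sub = diff(old[key], new[key])
                if sub is not None:
                    result[key] = sub
        return result
    if isinstance(old, (set, frozenset)) and isinstance(new, (set, frozenset)):
        return {"added": new - old, "removed": old - new}
    if (isinstance(old, (list, tuple)) and type(old) is type(new)
            and len(old) == len(new)):
        # Same-length sequences: report only the positions that differ.
        subs = {i: diff(a, b) for i, (a, b) in enumerate(zip(old, new))}
        return {i: sub for i, sub in subs.items() if sub is not None}
    # Scalars, mismatched types, or sequences of different length.
    return ("changed", old, new)
```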