Checking whether two massive Python dictionaries are equivalent

Posted 2024-12-07 13:33:10, 183 words, 2 views, 0 comments

I have a massive Python dictionary with over 90,000 entries. For reasons I won't get into, I need to store this dictionary in my database and then, at a later point, rebuild the dictionary from the database entries.

I am trying to set up a procedure to verify that my storage and rebuild were faithful and that my new dictionary is equivalent to the old one. What is the best methodology for testing this?

There are minor differences and I want to figure out what they are.

Comments (3)

错爱 2024-12-14 13:33:10

The most obvious approach is of course:

if oldDict != newDict:
    print("**Failure to rebuild, new dictionary is different from the old")

That ought to be about as fast as you can get, since it relies on Python's built-in dictionary comparison.

UPDATE: It seems you're not after "equal", but something weaker. I think you need to edit your question to make it clear what you consider "equivalent" to mean.
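
If "equivalent" turns out to mean "equal once storage artifacts are ignored", one way to pin that down is to normalize both dictionaries before comparing them. The following is only a sketch in Python 3: the helper name `equivalent` and the choice of `str` as the normalizer are assumptions made for illustration, and the idea that the database returns everything as text is a guess about the cause, not something stated in the question.

```python
def equivalent(old, new, normalize=str):
    """Compare two dicts after passing every key and value through `normalize`.

    `normalize` is a hypothetical hook: replace it with whatever
    canonical form matches your definition of "equivalent".
    """
    def canon(d):
        return {normalize(k): normalize(v) for k, v in d.items()}
    return canon(old) == canon(new)


original = {1: 10, 2: 20}
restored = {"1": "10", "2": "20"}  # assumed: values came back as text

print(original == restored)            # strict equality
print(equivalent(original, restored))  # equality after normalization
```

Note that a normalizer can merge keys that were distinct in the source (for example `1` and `"1"` in the same dictionary), so this only makes sense once you have decided such keys should count as the same.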

凉世弥音 2024-12-14 13:33:10
>>> d1 = {'a': 1, 'b': 2, 'c': 3}
>>> d2 = {'b': 2, 'x': 2, 'a': 5}
>>> sorted(set(d1.items()) - set(d2.items()))  # items in d1 not in d2
[('a', 1), ('c', 3)]
>>> sorted(set(d2.items()) - set(d1.items()))  # items in d2 not in d1
[('a', 5), ('x', 2)]

Edit
Don't vote for this answer. Go to the question "Fast comparison between two Python dictionary" and add an upvote there instead. It has a very complete solution.

星光不落少年眉 2024-12-14 13:33:10

You could start with something like this and tweak it to suit your needs:

>>> import random
>>> bigd = {x: random.randint(0, 1024) for x in range(90000)}
>>> bigd2 = {x: random.randint(0, 1024) for x in range(90000)}
>>> dif = set(bigd.items()) - set(bigd2.items())  # items in bigd not in bigd2
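
The set difference above needs every value to be hashable (a list or dict value raises `TypeError`), and it reports a changed value as two unrelated items. A key-by-key comparison avoids both issues. This is a minimal Python 3 sketch, assuming both objects are ordinary dicts; the function name `diff_dicts` is made up for this example.

```python
def diff_dicts(old, new):
    """Return (removed, added, changed) describing how `new` differs from `old`.

    removed: keys only in `old`, with their old values
    added:   keys only in `new`, with their new values
    changed: keys in both whose values differ, as (old_value, new_value)
    """
    old_keys = old.keys()
    new_keys = new.keys()
    removed = {k: old[k] for k in old_keys - new_keys}
    added = {k: new[k] for k in new_keys - old_keys}
    changed = {k: (old[k], new[k])
               for k in old_keys & new_keys
               if old[k] != new[k]}
    return removed, added, changed


d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 2, 'x': 2, 'a': 5}
removed, added, changed = diff_dicts(d1, d2)
print(removed)  # {'c': 3}
print(added)    # {'x': 2}
print(changed)  # {'a': (1, 5)}
```

For 90,000 entries this does one pass over the keys, so it should stay fast, and all three results being empty is the same as `old == new`.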