如何使用权重将字典合并在一起？

发布于 2024-11-24 03:35:58 字数 826 浏览 4 评论 0原文

d1 = {'weight':1, 'data': { 'apples': 8, 'oranges': 7 } }
d2 = {'weight':3, 'data': { 'apples': 4, 'bananas': 3 } }
all_dictionaries = [d1, d2, ... ]

def mergeDictionariesWithWeight(all_dictionaries)

如何将这些字典合并在一起（如果重叠，则多个值与权重）

该函数将返回：

{ 'apples': 4, 'oranges': 7, 'bananas': 3 }

Apples is 4 因为 8 * .25 + 4 * .75

编辑：我刚刚写了一篇取平均值的文章，类似这样。但当然，这与我想要做的确实不同，因为我将所有内容都放在列表中，然后除以长度。

result = {}
keymap = {}
for the_dict in dlist:
    for (k, v) in the_dict.items():
        if not keymap.has_key(k):
            keymap[k] = []
        keymap[k].append(v)
for (k, v) in keymap.items():
    average = sum(int(x) for x in keymap[k]) / float(len(keymap[k]))
    result[k] = float(average)
return result

原文

d1 = {'weight':1, 'data': { 'apples': 8, 'oranges': 7 } }
d2 = {'weight':3, 'data': { 'apples': 4, 'bananas': 3 } }
all_dictionaries = [d1, d2, ... ]

def mergeDictionariesWithWeight(all_dictionaries)

How do I merge these dictionaries together (if overlap, multiple value with the weight)

The function would return:

{ 'apples': 4, 'oranges': 7, 'bananas': 3 }

Apples is 4 because 8 * .25 + 4 * .75

Edit: I just wrote one that takes the average, something like this. But of course it's really different from what I want to do, because I stick everything in a list and just divide by the length.

result = {}
keymap = {}
for the_dict in dlist:
    for (k, v) in the_dict.items():
        if not keymap.has_key(k):
            keymap[k] = []
        keymap[k].append(v)
for (k, v) in keymap.items():
    average = sum(int(x) for x in keymap[k]) / float(len(keymap[k]))
    result[k] = float(average)
return result

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

我不会写诗 2024-12-01 03:35:58

>>> from collections import defaultdict
>>> d=defaultdict(lambda:(0,0))
>>> for D in all_dictionaries:
...   weight = D['weight']
...   for k,v in D['data'].items():
...     d[k]=d[k][0]+weight*v,d[k][1]+weight
... 
>>> dict((k,v[0]/v[1]) for k,v in d.items())
{'apples': 5, 'oranges': 7, 'bananas': 3}

如果需要浮点结果

>>> dict((k,1.*v[0]/v[1]) for k,v in d.items())
{'apples': 5.0, 'oranges': 7.0, 'bananas': 3.0}

关于defaultdict的注释

通常您会看到 defaultdict(int) 或 defaultdict(list) 甚至可能是 defaultdict(set)。 defaultdict 的参数必须可以不带参数地调用。每当发现密钥丢失时，都会使用调用此参数的结果。的默认值，

即 - 调用此函数会返回字典

>>> d=defaultdict(int)
>>> d[1]
0
>>> d['foo']
0

这通常用于计数，因为 int() 返回 0。如果您希望默认值为1 而不是 0，它更棘手，因为您无法将参数传递给 int，但您所需要的只是一个返回 1 的可调用函数。通过使用 lambda 函数。

>>> d=defaultdict(lambda:1)
>>> d[1]
1
>>> d['foo']
1

在这个答案中，我想跟踪加权总和以及权重总和。我可以通过使用二元组作为默认值来做到这一点。

>>> d=defaultdict(lambda:(0,0))
>>> d[1]
(0, 0)
>>> d['foo']
(0, 0)

>>> from collections import defaultdict
>>> d=defaultdict(lambda:(0,0))
>>> for D in all_dictionaries:
...   weight = D['weight']
...   for k,v in D['data'].items():
...     d[k]=d[k][0]+weight*v,d[k][1]+weight
... 
>>> dict((k,v[0]/v[1]) for k,v in d.items())
{'apples': 5, 'oranges': 7, 'bananas': 3}

If you need floating point result

>>> dict((k,1.*v[0]/v[1]) for k,v in d.items())
{'apples': 5.0, 'oranges': 7.0, 'bananas': 3.0}

Notes about defaultdict

Often you see defaultdict(int) or defaultdict(list) maybe even defaultdict(set). The argument to defaultdict must be callable with no parameters. The result of calling this parameter is used whenever a key is found to be missing. ie - calling this returns the default value for the dictionary

for example

>>> d=defaultdict(int)
>>> d[1]
0
>>> d['foo']
0

This is often used for counting things up because int() returns 0. If you want the default value to be 1 instead of 0, it's more tricky because you can't pass a parameter to int, but all you need is a callable that returns 1. This can be accomplished without too much fuss by using a lambda function.

>>> d=defaultdict(lambda:1)
>>> d[1]
1
>>> d['foo']
1

In this answer, I want to keep track of the weighted total, and the total of the weights. I can do this by using a 2-tuple as the default value.

>>> d=defaultdict(lambda:(0,0))
>>> d[1]
(0, 0)
>>> d['foo']
(0, 0)

回复收藏 0 原文

把回忆走一遍 2024-12-01 03:35:58

这是一个首先使用临时字典将项目收集到列表中的解决方案，然后计算最终的加权字典。可能不需要临时就可以完成，但这很容易理解。

from collections import defaultdict

def mergeDictionariesWithWeight(dlist):
    tmp = defaultdict(list)
    for d in dlist:
        weight = d['weight']
        for k, v in d['data'].items():
            tmp[k].append((weight, v))
    r = {}
    for k, v in tmp.items():
        # If there's just one item, ignore the weight
        if len(v) == 1:
            r[k] = v[0][1]
        else:
            total_weight = sum((x[0] for x in v), 0.0)
            r[k] = sum(x[1] * x[0]/total_weight for x in v)
    return r

返回：{'苹果'：5.0，'橙子'：7，'香蕉'：3}（因为 8 * .25 + 4 * .75 = 5.0）

Here's a solution that first uses gathers the items into a list using a temporary dict, and then computes the final weighted dict. It can probably be done without a temporary, but this is easy to understand.

from collections import defaultdict

def mergeDictionariesWithWeight(dlist):
    tmp = defaultdict(list)
    for d in dlist:
        weight = d['weight']
        for k, v in d['data'].items():
            tmp[k].append((weight, v))
    r = {}
    for k, v in tmp.items():
        # If there's just one item, ignore the weight
        if len(v) == 1:
            r[k] = v[0][1]
        else:
            total_weight = sum((x[0] for x in v), 0.0)
            r[k] = sum(x[1] * x[0]/total_weight for x in v)
    return r

Returns: {'apples': 5.0, 'oranges': 7, 'bananas': 3} (because 8 * .25 + 4 * .75 = 5.0)

回复收藏 0 原文

美男兮 2024-12-01 03:35:58

试试这个：

def mergeDictionariesWithWeight(all_dictionaries):
    weightSum = 0
    weightDictionary ={}    

    for dictionary in all_dictionaries: 

        weight = dictionary['weight']
        data = dictionary['data']

        #find the total weight of the elements in data
        for (k,v) in data.items(): 
            if k in weightDictionary:
                weightDictionary[k] += weight*v
        weightSum += weight 
        #normalize the results by deviding by the weight sum
        for (key, value) in weightDictionary:
            weightDictionary[key] = value / float(weightSum)
    return weightDictionary 

d1 = {'weight':1, 'data': { 'apples': 8, 'oranges': 7 } }
d2 = {'weight':3, 'data': { 'apples': 4, 'bananas': 3 } }
all_dictionaries = [d1, d2]

mergeDictionariesWithWeight(all_dictionaries)

try this:

def mergeDictionariesWithWeight(all_dictionaries):
    weightSum = 0
    weightDictionary ={}    

    for dictionary in all_dictionaries: 

        weight = dictionary['weight']
        data = dictionary['data']

        #find the total weight of the elements in data
        for (k,v) in data.items(): 
            if k in weightDictionary:
                weightDictionary[k] += weight*v
        weightSum += weight 
        #normalize the results by deviding by the weight sum
        for (key, value) in weightDictionary:
            weightDictionary[key] = value / float(weightSum)
    return weightDictionary 

d1 = {'weight':1, 'data': { 'apples': 8, 'oranges': 7 } }
d2 = {'weight':3, 'data': { 'apples': 4, 'bananas': 3 } }
all_dictionaries = [d1, d2]

mergeDictionariesWithWeight(all_dictionaries)

回复收藏 0 原文

娇纵 2024-12-01 03:35:58

from collections import defaultdict

def merge_dictionaries_with_weight(all_dictionaries):
    totals = defaultdict(int)
    result = defaultdict(int)
    for each in all_dictionaries:
        weight = float(each['weight'])
        for key, value in each['data'].items():
            totals[key] += weight
            result[key] += weight * value
    for key, total in totals.items():
       result[key] /= total
    return result

from collections import defaultdict

def merge_dictionaries_with_weight(all_dictionaries):
    totals = defaultdict(int)
    result = defaultdict(int)
    for each in all_dictionaries:
        weight = float(each['weight'])
        for key, value in each['data'].items():
            totals[key] += weight
            result[key] += weight * value
    for key, total in totals.items():
       result[key] /= total
    return result

回复收藏 0 原文

刘备忘录 2024-12-01 03:35:58

在算法上与gnibbler的无法区分，但不知何故生成器表达式让我很高兴。

>>> from collections import defaultdict
>>> weights, values = defaultdict(int), defaultdict(int)
>>> key_weight_value = ((key, d['weight'], value) 
                        for d in all_dictionaries 
                        for key, value in d['data'].iteritems())
>>> for k, w, v in key_weight_value:
...     weights[k], values[k] = weights[k] + w, values[k] + w * v
... 

>>> dict((k, values[k] * 1.0 / weights[k]) for k in weights)
{'apples': 5.0, 'oranges': 7.0, 'bananas': 3.0}

Algorithmically indistinguishable from gnibbler's, but somehow the generator expression pleases me.

>>> from collections import defaultdict
>>> weights, values = defaultdict(int), defaultdict(int)
>>> key_weight_value = ((key, d['weight'], value) 
                        for d in all_dictionaries 
                        for key, value in d['data'].iteritems())
>>> for k, w, v in key_weight_value:
...     weights[k], values[k] = weights[k] + w, values[k] + w * v
... 

>>> dict((k, values[k] * 1.0 / weights[k]) for k in weights)
{'apples': 5.0, 'oranges': 7.0, 'bananas': 3.0}

回复收藏 0 原文

~没有更多了~