Ranking the elements of multiple lists by count in Python

Posted 2024-08-13 07:19:33

I want to rank the elements of multiple lists by how often they appear across all of the lists. Example:

list1 = 1,2,3,4
list2 = 4,5,6,7
list3 = 4,1,8,9

result = 4,1,2,3,5,6,7,8,9 (4 is counted three times, 1 twice, and the rest once)

I've tried the following, but I need something smarter that works with any number of lists.


l = []
l.append([1, 2, 3, 4, 5])
l.append([1, 9, 3, 4, 5])
l.append([1, 10, 8, 4, 5])
l.append([1, 12, 13, 7, 5])
l.append([1, 14, 13, 13, 6])

# Elements common to each combination of four of the five lists
x1 = set(l[0]) & set(l[1]) & set(l[2]) & set(l[3])
x2 = set(l[0]) & set(l[1]) & set(l[2]) & set(l[4])
x3 = set(l[0]) & set(l[1]) & set(l[3]) & set(l[4])
x4 = set(l[0]) & set(l[2]) & set(l[3]) & set(l[4])
x5 = set(l[1]) & set(l[2]) & set(l[3]) & set(l[4])
set1 = x1 | x2 | x3 | x4 | x5

# Elements present in all five lists
a1 = list(set(l[0]) & set(l[1]) & set(l[2]) & set(l[3]) & set(l[4]))
# getDifference is a helper not shown in the question (presumably a list difference)
a2 = getDifference(list(set1), a1)
print(a1)
print(a2)

Now here is the problem: I can repeat this for a3, a4 and a5, but it gets too complex. I need a function for this, but I don't know how to write it. My math got stuck ;)

SOLVED: Thanks a lot for the discussion. As a newbie, I like this system: fast and informative. You all helped me out! Ty

ま柒月 2024-08-20 07:19:33

import collections

data = [
  [1, 2, 3, 4, 5],
  [1, 9, 3, 4, 5],
  [1, 10, 8, 4, 5],
  [1, 12, 13, 7, 5],
  [1, 14, 13, 13, 6],
]

def sorted_by_count(lists):
  counts = collections.defaultdict(int)
  for L in lists:
    for n in L:
      counts[n] += 1

  return [num for num, count in
          sorted(counts.items(),
                 key=lambda k_v: (k_v[1], k_v[0]),
                 reverse=True)]

print(sorted_by_count(data))

Now let's generalize it (to take any iterable and loosen the hashability requirement), allow key and reverse parameters (to match sorted), and rename it to freq_sorted:

def freq_sorted(iterable, key=None, reverse=False, include_freq=False):
  """Return a list of items from iterable sorted by frequency.

  If include_freq, (item, freq) is returned instead of item.

  key(item) must be hashable, but items need not be.

  *Higher* frequencies are returned first.  Within the same frequency group,
  items are ordered according to key(item).
  """
  if key is None:
    key = lambda x: x

  key_counts = collections.defaultdict(int)
  items = {}
  for n in iterable:
    k = key(n)
    key_counts[k] += 1
    items.setdefault(k, n)

  if include_freq:
    def get_item(k, c):
      return items[k], c
  else:
    def get_item(k, c):
      return items[k]

  return [get_item(k, c) for k, c in
          sorted(key_counts.items(),
                 key=lambda kc: (-kc[1], kc[0]),
                 reverse=reverse)]

Example:

>>> import itertools
>>> print(freq_sorted(itertools.chain.from_iterable(data)))
[1, 5, 4, 13, 3, 2, 6, 7, 8, 9, 10, 12, 14]
>>> print(freq_sorted(itertools.chain.from_iterable(data), include_freq=True))
# (slightly reformatted)
[(1, 5),
 (5, 4),
 (4, 3), (13, 3),
 (3, 2),
 (2, 1), (6, 1), (7, 1), (8, 1), (9, 1), (10, 1), (12, 1), (14, 1)]
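
The docstring's point that key(item) must be hashable while the items themselves need not be can be illustrated with a small sketch. The function below is a compact restatement of freq_sorted so the example runs on its own; the points data and the choice of key=tuple are assumptions made for illustration, not part of the original answer.

```python
import collections

def freq_sorted(iterable, key=None, reverse=False, include_freq=False):
    """Compact restatement of freq_sorted, for a self-contained demo."""
    if key is None:
        key = lambda x: x
    key_counts = collections.defaultdict(int)
    items = {}
    for n in iterable:
        k = key(n)
        key_counts[k] += 1
        items.setdefault(k, n)  # remember the first item seen for each key
    ordered = sorted(key_counts.items(),
                     key=lambda kc: (-kc[1], kc[0]),
                     reverse=reverse)
    if include_freq:
        return [(items[k], c) for k, c in ordered]
    return [items[k] for k, c in ordered]

# Lists are unhashable, so they cannot be used as dict keys directly;
# key=tuple supplies a hashable stand-in for each one (illustrative data).
points = [[1, 2], [3, 4], [1, 2], [5, 6], [3, 4], [1, 2]]
ranked = freq_sorted(points, key=tuple, include_freq=True)
print(ranked)  # [([1, 2], 3), ([3, 4], 2), ([5, 6], 1)]
```

The returned items are the original lists, not the tuples used for counting, because items maps each key back to the first item that produced it.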

如果没结果 2024-08-20 07:19:33

Combining a couple of ideas already posted:

from itertools import chain
from collections import defaultdict

def frequency(*lists):
    counter = defaultdict(int)
    for x in chain(*lists):
        counter[x] += 1
    return [key for (key, value) in 
        sorted(counter.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)]

Notes:

In Python 2.7 and later, you can use collections.Counter instead of defaultdict(int).
This version takes any number of lists as its arguments; the leading asterisk means they'll all be packed into a tuple. If you want to pass in a single list containing all of your lists, omit that leading asterisk.
This breaks if your lists contain elements of an unhashable type.
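
The first two notes can be sketched together. This is one possible Counter-based variant of the frequency function, not necessarily how the answer's author would write it; the lists a, b, c are taken from the question's example, and the names separate, nested and unpacked are illustrative.

```python
from collections import Counter
from itertools import chain

def frequency(*lists):
    # Counter tallies every element across all the lists in one pass.
    counter = Counter(chain(*lists))
    # Same ordering as the defaultdict version: highest count first,
    # ties broken by the larger value.
    return [key for key, count in
            sorted(counter.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)]

a = [1, 2, 3, 4]
b = [4, 5, 6, 7]
c = [4, 1, 8, 9]

separate = frequency(a, b, c)   # lists passed as separate arguments
nested = [a, b, c]
unpacked = frequency(*nested)   # a list of lists, unpacked at the call site
print(separate)  # [4, 1, 9, 8, 7, 6, 5, 3, 2]
```

Unpacking at the call site with an asterisk is an alternative to removing the asterisk from the function signature; both give the same result.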

究竟谁懂我的在乎 2024-08-20 07:19:33

def items_ordered_by_frequency(*lists):

    # get a flat list with all the values
    biglist = []
    for x in lists:
        biglist += x

    # sort it in reverse order by frequency
    return sorted(set(biglist), 
                  key=lambda x: biglist.count(x), 
                  reverse=True)

时光礼记 2024-08-20 07:19:33

Try this one:

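def rank(*lists):
    # Count how many times each element occurs across all lists.
    d = dict()
    for lst in lists:
        for e in lst:
            d[e] = d.get(e, 0) + 1
    # Sort (count, element) pairs in descending order and keep the elements.
    return [e for count, e in sorted(((d[e], e) for e in d), reverse=True)]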

Usage example:

a = [1,2,3,4]
b = [4,5,6,7]
c = [4,1,8,9]

print(rank(a, b, c))

You can use any number of lists as input.
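
Because rank sorts (count, element) pairs in reverse, elements with equal counts come out in descending order, so the usage example prints [4, 1, 9, 8, 7, 6, 5, 3, 2]. If ties should instead appear in ascending order, as in the result the question asks for, a variant along these lines might work. The name rank_ascending_ties is hypothetical and not part of the original answer.

```python
def rank_ascending_ties(*lists):
    # Count how many times each element occurs across all lists.
    counts = {}
    for lst in lists:
        for e in lst:
            counts[e] = counts.get(e, 0) + 1
    # Negating the count puts frequent elements first while leaving
    # equally frequent elements in ascending order.
    return sorted(counts, key=lambda e: (-counts[e], e))

a = [1, 2, 3, 4]
b = [4, 5, 6, 7]
c = [4, 1, 8, 9]

result = rank_ascending_ties(a, b, c)
print(result)  # [4, 1, 2, 3, 5, 6, 7, 8, 9]
```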

离线来电— 2024-08-20 07:19:33

You can count the number of appearances of each element (a histogram), then sort by it:

def histogram(enumerable):
  result = {}
  for x in enumerable:
    result.setdefault(x, 0)
    result[x] += 1
  return result

lists = [ [1,2,3,4], [4,5,6,7], ... ]

from itertools import chain

h = histogram(chain(*lists))
ranked = sorted(set(chain(*lists)), key = lambda x : h[x], reverse = True)

莫多说 2024-08-20 07:19:33

Try this code:

def elementFreq(myList):
    # myList is the list of lists
    from collections import Counter
    tmp = []
    for i in myList:
        tmp += i  # flatten into one list
    return Counter(tmp)

Note: the elements of your lists must be hashable.
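
The function above returns the counts rather than a ranked list. One way to get the ranking from a Counter is its most_common() method, sketched below. The name element_freq and the sample data are illustrative, and the order of equally frequent elements depends on how the Python version in use orders ties, so the sketch does not rely on it.

```python
from collections import Counter
from itertools import chain

def element_freq(list_of_lists):
    # chain.from_iterable flattens one level without building a temporary list.
    return Counter(chain.from_iterable(list_of_lists))

data = [[1, 2, 3, 4], [4, 5, 6, 7], [4, 1, 8, 9]]
counts = element_freq(data)

# most_common() yields (element, count) pairs, highest count first.
ranked = [element for element, count in counts.most_common()]
print(ranked[:2])  # [4, 1]
```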
