Sorting a counted list in Python
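(I am brand new to any kind of programming, so please be as specific as you can when you answer.)
Problem: I have written a program to solve pythonchallenge.com level 2. The program works, but the results are messy. I want to sort the results of the character count into a nice-looking list. When I try to sort the results of the character count using sorted(), it removes all the counts and just gives me a list of the characters that were in my string. I need to keep the ability to see how many of each character were in my file. Anyway, here is the code:
countstring = open('pagesource.txt').read()
charcount = {}
for x in countstring:
    charcount[x] = charcount.get(x, 0) + 1
print charcount
This is what I get in cmd:
>>> {'\n': 1219, '!': 6079, '#': 6115, '%': 6104, '$': 6046, '&': 6043, ')': 6186,
'(': 6154, '+': 6066, '*': 6034, '@': 6157, '[': 6108, ']': 6152, '_': 6112,
'^': 6030, 'a': 1, 'e': 1, 'i': 1, 'l': 1, 'q': 1, 'u': 1, 't': 1, 'y': 1,
'{': 6046, '}': 6105}
If I add a sorted() call to it, such as print sorted(charcount), I get this in cmd:
>>> ['\n', '!', '#', '$', '%', '&', '(', ')', '*', '+', '@', '[', ']', '^', '_', 'a',
'e', 'i', 'l', 'q', 't', 'u', 'y', '{', '}']
Thanks for your solutions, and if you can take the time to add comments to your code explaining what everything does, I would greatly appreciate it!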
Comments (6)
You should really use the Counter class instead of reinventing your own wheel.
charcount is a dictionary, and dictionaries have no implicit sort order. Therefore, we'll have to convert it to a list, which can be sorted. Each entry in that list will be a tuple of character and count: charcount.items() already gives us a list that looks like [('\n', 1219), ('!', 6079)]. Unfortunately, if we sorted this list as it is, it would sort by character first and then (if characters were ever equal) by count, instead of the other way round. Therefore, we need a key function to tell sorted to look at the count first, and then (if counts are equal) at the character. Fortunately, our key function is really simple; it just swaps the tuple around.
Alternatively, we could use a list comprehension to swap the values, to get something like [(1219, '\n'), (6079, '!')], then sort, and then swap the values again. charcount_list will now be the list of items sorted by count.
If you want the reverse order, simply specify the reverse=True argument to sorted.
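The code blocks that belonged to this answer did not survive on this page, so the following is only a minimal sketch of the approach it describes, not the answerer's own code. It assumes a short stand-in string instead of pagesource.txt, keeps the names charcount and charcount_list from the thread, and introduces hypothetical names swap, charcount_list_alt and most_frequent_first. It is written to run on Python 3, whereas the question uses Python 2 print statements.

```python
from collections import Counter

# Stand-in for open('pagesource.txt').read(); an assumption for this sketch.
countstring = "mississippi"

# Counter replaces the manual counting loop from the question.
charcount = Counter(countstring)

def swap(item):
    """Key function: turn (character, count) into (count, character)."""
    character, count = item
    return (count, character)

# Sort by count first, then by character when counts are equal.
charcount_list = sorted(charcount.items(), key=swap)
print(charcount_list)

# Alternative: swap with a list comprehension, sort, then swap back.
swapped = sorted([(count, character) for character, count in charcount.items()])
charcount_list_alt = [(character, count) for count, character in swapped]

# reverse=True puts the most frequent characters first.
most_frequent_first = sorted(charcount.items(), key=swap, reverse=True)
print(most_frequent_first)
```

Both variants produce the same list here; the key function avoids building the two intermediate lists.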
charcount is a dict (dictionary). Iterating over a dictionary iterates over its keys, which is why sorted() returns a sorted list of keys. You need to get the list of items and then sort it by the second value.
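Sorting the items by their second value could look like the sketch below. The answer's own code is missing from this page, so this is one possible way to do it rather than the original: it assumes operator.itemgetter as the key and uses a handful of counts copied from the question's output in place of the real file. The names items and sorted_items are hypothetical.

```python
from operator import itemgetter

# A few counts taken from the question's output, standing in for the full dict.
charcount = {'a': 1, '!': 6079, '\n': 1219, '#': 6115}

# items() yields (character, count) pairs; list() makes that explicit on Python 3.
items = list(charcount.items())

# itemgetter(1) picks the second value of each pair, i.e. the count.
sorted_items = sorted(items, key=itemgetter(1))

for character, count in sorted_items:
    print(repr(character), count)
```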
Dictionaries (what {} means) are unordered collections, which means you can't sort them in any kind of meaningful way. I suggest storing the information as a list of tuples [(), ...] and then sorting them based on that.
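The example that originally followed this paragraph is missing from this page. As a rough sketch of the idea, assuming foo (the list name this answer uses) holds (character, count) tuples and that a small named function is passed as the key; get_count is a hypothetical name, and the sample values are copied from the question's output:

```python
# foo holds (character, count) tuples; values are a sample from the question.
foo = [('!', 6079), ('a', 1), ('\n', 1219)]

def get_count(pair):
    """Return the count, the second element of a (character, count) tuple."""
    return pair[1]

# sorted calls get_count on every tuple and orders the tuples by the results.
sorted_list = sorted(foo, key=get_count)
print(sorted_list)
```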
As you can see, sorted takes an optional key argument. The purpose of that argument is to provide a function that tells sorted how to sort something. All you're doing is breaking down the information in each tuple in the list to provide a value that can be ordered, because by default tuples are compared element by element, so the list would be ordered by character rather than by count.
Make sense?
It can also be written like:
print sorted(foo, key=lambda (x,y): y)
lambda just means an inline function with no name, and it allows you to break down the tuple in a different way.
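Note that the tuple-unpacking form lambda (x,y): y only works on Python 2; tuple parameters were removed in Python 3. A sketch of an equivalent that runs on both, assuming foo is a list of (character, count) tuples (the sample values come from the question's output, and result is a hypothetical name):

```python
foo = [('!', 6079), ('a', 1), ('\n', 1219)]

# The lambda takes the whole tuple and indexes into it,
# instead of unpacking it in the parameter list.
result = sorted(foo, key=lambda pair: pair[1])
print(result)
```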
You can see how this works by doing
print [y for (x,y) in sorted_list]
You can even redefine the key function from before like this:
BTW, I only put in the parentheses before for clarity. If you're not defining a function then the comma is the tuple constructor.
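The redefined key function did not survive on this page. One possible version, offered as an assumption and not as the answerer's code, unpacks the tuple inside the function body so that it also runs on Python 3. The names by_count, counts and point are hypothetical; sorted_list and foo are the names used in this answer, filled with sample values from the question.

```python
sorted_list = [('a', 1), ('\n', 1219), ('!', 6079)]

# Unpacking inside a list comprehension: keep only the counts.
counts = [y for (x, y) in sorted_list]
print(counts)

def by_count(pair):
    # Unpack the tuple in the body; the parentheses around x, y are optional.
    x, y = pair
    return y

foo = [('!', 6079), ('a', 1), ('\n', 1219)]
print(sorted(foo, key=by_count))

# Outside a function definition, the comma alone builds a tuple.
point = 1, 2
print(point == (1, 2))
```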
A dictionary is iterated by key, so you get a sorted list of keys when you pass the dictionary to sorted. Sort the dictionary's item tuples by value to get a list of sorted tuples. If you're using Python 2.7+, then you can use the list of tuples to initialize an OrderedDict, which will maintain the sorted order of the item tuples.
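A minimal sketch of the OrderedDict suggestion, assuming a few counts from the question's output in place of the real file and operator.itemgetter as the key; sorted_items and ordered are hypothetical names:

```python
from collections import OrderedDict
from operator import itemgetter

# Sample counts from the question, standing in for the full dictionary.
charcount = {'!': 6079, 'a': 1, '\n': 1219}

# Sort the (character, count) tuples by count.
sorted_items = sorted(charcount.items(), key=itemgetter(1))

# OrderedDict remembers the order in which the tuples were inserted.
ordered = OrderedDict(sorted_items)

for character, count in ordered.items():
    print(repr(character), count)
```

Lookups such as ordered['a'] still work as with a plain dict. On Python 3.7 and later a plain dict also preserves insertion order, so dict(sorted_items) would iterate in the same order there.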