Finding dictionary values from csv.DictReader

Posted on 2024-10-18 05:42:41

I'm trying to take a csv file and turn it into a dictionary, via csv.DictReader. After doing this, I want to modify one of the columns of the dictionary, and then write the data into a tsv file. I'm dealing with words and word frequencies in a text.

I've tried using the dict.values() method to obtain the dictionary values, but I get an error message saying "AttributeError: DictReader instance has no attribute 'values'".

Below is my code:

#calculate frequencies of each word in Jane Austen's "Pride and Prejudice"
import csv

#open file with words and counts for the book, and turn into dictionary
fob = open("P&P.csv", "r")
words = csv.DictReader(fob)
dict = words

#open a file to write the words and frequencies to
fob = open("AustenWords.tsv", "w")

#set total word count
wordcount = 120697

for row in words:
    values = dict.values()
    print values

Basically, I have the total count of each word in the text (e.g. "a","1937"), and I want to find the percentage of the total word count that each word accounts for (thus, for "a", the percentage would be 1937/120697). Right now my code doesn't have the equation for doing this, but I'm hoping, once I obtain the values of each row, to write a row to the new file with the word and the calculated percentage. If anyone has a better way (or any way!) to do this, I would greatly appreciate any input.

Thanks

Comments (2)

南风几经秋 2024-10-25 05:42:41

To answer the basic question - "why am I getting this error" - csv.DictReader() returns an iterator (a reader object), not a dictionary, so the reader itself has no values() method.

Each row the iterator yields is a dictionary, which you can then use in your script:

for row in words:
    # row is a dictionary for one line of the csv file
    values = row.values()
    print(values)
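
The question also asks how to get from each row to a word/percentage line in a tsv file. Below is a minimal sketch of one way to do it, written for Python 3. The column names "word" and "count" are assumptions, since the question does not show the header of P&P.csv, and the helper name write_percentages is made up for illustration. An in-memory string stands in for the real files so the example runs on its own.

```python
import csv
import io

# Stand-in for P&P.csv. The header names "word" and "count" are assumed;
# the row "a,1937" comes from the question, the second row is made up.
sample_csv = io.StringIO("word,count\na,1937\nexample,12\n")
TOTAL_WORDS = 120697  # total word count given in the question


def write_percentages(src, dst, total):
    """Read word counts from src and write word/percentage rows to dst as TSV."""
    reader = csv.DictReader(src)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    writer.writerow(["word", "percentage"])
    for row in reader:
        # DictReader yields every value as a string, so convert before dividing.
        pct = 100.0 * int(row["count"]) / total
        writer.writerow([row["word"], "%.4f" % pct])


out = io.StringIO()
write_percentages(sample_csv, out, TOTAL_WORDS)
tsv_text = out.getvalue()
print(tsv_text)
```

With real files, src and dst would be the objects returned by open("P&P.csv", newline="") and open("AustenWords.tsv", "w", newline=""), ideally inside a with statement and bound to two different names so the output handle does not overwrite the input handle.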
一笑百媚生 2024-10-25 05:42:41

Thank goodness for Matt Dunnam's answer (I'd reply to it, but I don't see how to). Quite counter-intuitively, a csv.DictReader object is NOT a dictionary (although I think I am beginning to see why that is useful). As he says, a csv.DictReader object is an iterator (at my intro level of Python, I think of this as something like a list). Each entry that object yields is a dictionary; the object itself is not.

So, csv.DictReader returns something like a list of dictionaries, which is not the same as returning one dictionary object, despite the name.

What is nice, so far, is that csv.DictReader did take the keys from the first row of my file and placed them correctly in each of the many dictionary objects produced by the iterable object it actually returned (again, it does not return a dictionary object!).
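
A small sketch of that behaviour in Python 3, using made-up in-memory data with assumed header names "word" and "count". It also shows where the "like a list" picture stops holding: the reader can only be walked once.

```python
import csv
import io

# Made-up sample; the header names "word" and "count" are assumed.
data = io.StringIO("word,count\na,1937\n")
reader = csv.DictReader(data)

header = reader.fieldnames   # column names read from the first line
rows = list(reader)          # materialise every remaining line as a dict
rows_again = list(reader)    # the reader is exhausted after one pass

print(header)
print(rows[0]["word"], rows[0]["count"])
print(len(rows_again))
```

If the rows are needed more than once, keeping the result of list(reader) is one option. Note that the values come back as strings, so counts have to be converted with int() before doing arithmetic.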

I've wasted about an hour banging my head on this. The documentation is not clear enough, although now that I understand what type of object csv.DictReader returns, it reads a lot more clearly. I think the documentation says something about how it returns an iterable object, but if you think it returns a dictionary and you don't know whether dictionaries are iterable, then this is easy to read as "returns a dictionary object".

The documentation should say "This does not return a dictionary object, but instead returns an iterable object containing a dictionary object for each entry" or some such thing. As a Python newbie who hasn't coded in 20 years, I keep running into problems where the documentation is written by and for experts and is too dense for beginners.

I'm glad it's there and that people have given their time to it, but it could be made easier for beginners while not reducing its worth to expert pythonistas.
