Sample function in Python: counting words

Posted 2024-09-26 21:12:26

I'm a bit rusty in Python and am just looking for help implementing an example function to count words (this is just a sample target for a scons script that doesn't do anything "real"):

def countWords(target, source, env):
  if (len(target) == 1 and len(source) == 1):
    fin = open(str(source[0]), 'r')
    # do something with "fin.read()"
    fin.close()

    fout = open(str(target[0]), 'w')
    # fout.write(something)
    fout.close()
  return None

Could you help me fill in the details? The usual way to count words is to read each line, break up into words, and for each word in the line increment a counter in a dictionary; then for the output, sort the words by decreasing count.

edit: I'm using Python 2.6 (Python 2.6.5 to be exact)
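
For reference, the steps described above (split each line into words, bump a per-word counter in a dictionary, then sort by decreasing count) can be sketched independently of SCons roughly as follows. The helper names `count_words` and `by_decreasing_count` and the sample data are made up for illustration, and "word" here simply means a whitespace-separated token. The code sticks to features that Python 2.6 already has.

```python
def count_words(lines):
    """Return a dict mapping each whitespace-separated word to its count."""
    counts = {}
    for line in lines:
        for word in line.split():
            # get() supplies 0 the first time a word is seen
            counts[word] = counts.get(word, 0) + 1
    return counts


def by_decreasing_count(counts):
    """Return (word, count) pairs, highest count first; ties ordered by word."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample input; an open file object works the same way,
# because iterating over a file yields its lines.
sample = ["the cat and the dog", "the end"]
for word, count in by_decreasing_count(count_words(sample)):
    print("%s %d" % (word, count))
```

Inside the SCons action, `count_words(fin)` could be called on the open input file and each pair written with `fout.write`.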

Comments (5)

不甘平庸 2024-10-03 21:12:26
from collections import defaultdict

def countWords(target, source, env):
    words = defaultdict(int)
    if (len(target) == 1 and len(source) == 1):
        with open(str(source[0]), 'r') as fin:
            for line in fin:
                for word in line.split():
                    words[word] += 1

        with open(str(target[0]), 'w') as fout:
            for word in sorted(words, key=words.__getitem__, reverse=True):
                fout.write('%s\n' % word)
    return None
云裳 2024-10-03 21:12:26

Without knowing why env exists, I can only do the following:

def countWords(target, source, env):
    wordCount = {}
    if len(target) == 1 and len(source) == 1:
        # SCons passes nodes, so convert them to paths with str()
        with open(str(source[0]), 'r') as fin:
            for line in fin:
                for word in line.split():
                    if word in wordCount:
                        wordCount[word] += 1
                    else:
                        wordCount[word] = 1  # first occurrence counts as 1

        # invert the mapping: count -> list of words with that count
        rev = {}
        for w, v in wordCount.items():
            rev.setdefault(v, []).append(w)
        with open(str(target[0]), 'w') as f:
            # highest counts first
            for v in sorted(rev, reverse=True):
                f.write("%d: %s\n" % (v, " ".join(rev[v])))
一梦浮鱼 2024-10-03 21:12:26

There is a helpful example here. It works roughly as you describe and also counts sentences.

忘东忘西忘不掉你 2024-10-03 21:12:26

Not too efficient but it is concise!

# fname is the input path, dest the output path
with open(fname) as f:
    res = {}
    for word in f.read().split():
        res[word] = res.get(word, 0) + 1
with open(dest, 'w') as f:
    # writes the words only, most frequent first
    f.write("\n".join(sorted(res, key=lambda w: -res[w])))
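
As a side note, on Python 2.7 or later (so not on the asker's 2.6.5) the same counting can be handed to `collections.Counter`. The sketch below assumes that newer interpreter; the function name `word_frequencies` is hypothetical.

```python
from collections import Counter  # added in Python 2.7 / 3.1


def word_frequencies(text):
    """Return (word, count) pairs, most frequent first."""
    # Counter tallies the tokens; most_common() sorts by decreasing count
    return Counter(text.split()).most_common()


print(word_frequencies("a b a c a b"))
```

The order of words that share the same count is not something this sketch controls, so sort explicitly if ties matter.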
忘东忘西忘不掉你 2024-10-03 21:12:26

Here is my version:

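import string
import itertools as it
drop = string.punctuation + string.digits

# Note: target and source are treated as plain file paths here,
# not as the node lists that SCons passes to an action.
def countWords(target, source, env=''):
    with open(source) as infile:
        inputstring = infile.read()
    # lowercase, split on whitespace and '--', strip punctuation and digits
    stripped = (word.strip(drop)
                for word in inputstring.lower().replace('--', ' ').split())
    # groupby only merges adjacent equal items, so sort first; skip empty strings
    words = sorted(word for word in stripped if word)
    wordlist = sorted([(word, len(list(occurrences)))
                       for word, occurrences in it.groupby(words)],
                      key=lambda x: x[1],
                      reverse=True)
    with open(target, 'w') as results:
        results.write('\n'.join('%16s : %s' % entry for entry in wordlist))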