Code Golf 4th of July Edition: Counting Top Ten Occurring Words

Posted on 2024-07-26 18:00:32

Given the following list of presidents, do a top ten word count in the smallest program possible:

INPUT FILE

    Washington
    Washington
    Adams
    Jefferson
    Jefferson
    Madison
    Madison
    Monroe
    Monroe
    John Quincy Adams
    Jackson
    Jackson
    Van Buren
    Harrison 
    DIES
    Tyler
    Polk
    Taylor 
    DIES
    Fillmore
    Pierce
    Buchanan
    Lincoln
    Lincoln 
    DIES
    Johnson
    Grant
    Grant
    Hayes
    Garfield 
    DIES
    Arthur
    Cleveland
    Harrison
    Cleveland
    McKinley
    McKinley
    DIES
    Teddy Roosevelt
    Teddy Roosevelt
    Taft
    Wilson
    Wilson
    Harding
    Coolidge
    Hoover
    FDR
    FDR
    FDR
    FDR
    Dies
    Truman
    Truman
    Eisenhower
    Eisenhower
    Kennedy 
    DIES
    Johnson
    Johnson
    Nixon
    Nixon 
    ABDICATES
    Ford
    Carter
    Reagan
    Reagan
    Bush
    Clinton
    Clinton
    Bush
    Bush
    Obama

To start it off, in bash, 97 characters:

cat input.txt | tr " " "\n" | tr -d "\t " | sed 's/^$//g' | sort | uniq -c | sort -n | tail -n 10

Output:

      2 Nixon
      2 Reagan
      2 Roosevelt
      2 Truman
      2 Washington
      2 Wilson
      3 Bush
      3 Johnson
      4 FDR
      7 DIES

Break ties as you see fit! Happy fourth!

For those of you who care, more information on presidents can be found here.
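
For readers who want an ungolfed baseline to compare the answers against, here is a minimal Python sketch of the task. The function name `top_words` and the tie-breaking rule (higher count first, then alphabetical order) are assumptions made for illustration; the question explicitly leaves tie-breaking open.

```python
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent whitespace-separated words as (count, word) pairs."""
    counts = Counter(text.split())
    # Assumed tie-break: descending count, then alphabetical, so output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(count, word) for word, count in ranked[:n]]


if __name__ == "__main__":
    # Hypothetical sample; in practice the text would be read from input.txt.
    sample = "FDR FDR FDR FDR Dies Truman Truman Bush Bush Bush"
    for count, word in top_words(sample, 3):
        print(count, word)
```

Any other tie-breaking rule would be just as valid under the rules of the challenge; only the `key` function would change.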

半仙 2024-08-02 18:00:32

C#, 153:

Reads in the file at p and prints results to the console:

File.ReadLines(p)
    .SelectMany(s=>s.Split(' '))
    .GroupBy(w=>w)
    .OrderBy(g=>-g.Count())
    .Take(10)
    .ToList()
    .ForEach(g=>Console.WriteLine(g.Count()+"|"+g.Key));

If merely producing the list but not printing to the console, it's 93 characters.

6|DIES
4|FDR
3|Johnson
3|Bush
2|Washington
2|Adams
2|Jefferson
2|Madison
2|Monroe
2|Jackson

锦爱 2024-08-02 18:00:32

A shorter shell version:

xargs -n1 < input.txt | sort | uniq -c | sort -nr | head

If you want case-insensitive ranking, change uniq -c into uniq -ci.
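
The effect of case-insensitive counting can be sketched in Python. This is only an illustration of the idea, not a port of `uniq -ci`: the helper name `count_words` is hypothetical, and unlike `uniq`, which prints one of the original spellings, this sketch stores the case-folded form as the key.

```python
from collections import Counter


def count_words(text, ignore_case=False):
    """Count whitespace-separated words, optionally folding case first."""
    words = text.split()
    if ignore_case:
        # Fold case so that "DIES" and "Dies" land in the same bucket.
        words = [w.casefold() for w in words]
    return Counter(words)


sample = "Harrison DIES Taylor DIES FDR Dies"
print(count_words(sample)["DIES"])                    # case-sensitive count
print(count_words(sample, ignore_case=True)["dies"])  # case-folded count
```

On the real input this matters for exactly one word: the list contains both "DIES" and a single "Dies".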

Slightly shorter still, if you're happy about the rank being reversed and readability impaired by lack of spaces. This clocks in at 46 characters:

xargs -n1<input.txt|sort|uniq -c|sort -n|tail

(You could strip this down to 38 if you were allowed to rename the input file to simply "i" first.)

Observing that, in this special case, no word occurs more than 9 times, we can shave off 3 more characters by dropping the '-n' argument from the final sort:

xargs -n1<input.txt|sort|uniq -c|sort|tail

That takes this solution down to 43 characters without renaming the input file. (Or 35, if you do.)
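
Why the single-digit observation matters can be shown with a small Python sketch. It assumes unpadded "count word" lines, which is a simplification: the output in the question is right-aligned, and that padding may let a plain sort keep working for larger counts as well. The helper name `count_of` and the sample lines are hypothetical.

```python
def count_of(line):
    """Extract the leading integer count from a 'count word' line."""
    return int(line.split()[0])


# Hypothetical unpadded lines; the real input never reaches a count of 10.
lines = ["9 DIES", "10 FDR", "2 Bush"]

# Plain string sort compares character by character, so "10" lands before "2".
print(sorted(lines))

# Numeric sort on the leading count, which is what `sort -n` provides.
print(sorted(lines, key=count_of))
```

As long as every count is a single digit, the two orderings agree on the counts, which is what makes dropping `-n` safe here.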

Using xargs -n1 to split the file into one word per line is preferable to the tr \ \\n solution, which creates lots of blank lines. That makes the tr solution incorrect: it misses out Nixon and reports a blank string occurring 256 times, yet a blank string is not a "word".
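
The difference between the two splitting strategies has a close analogue in Python's `str.split`. The sketch below assumes that input lines carry a four-space indent and sometimes a trailing space, as the J transcript elsewhere in this thread suggests. It is an analogy only: `xargs` also treats quotes and backslashes specially, which the sketch ignores.

```python
line = "    Harrison "  # assumed input line: leading indent plus a trailing space

# Splitting on every single space (like `tr " " "\n"`) yields empty strings.
per_space = line.split(" ")
print(per_space)

# Splitting on runs of whitespace (like `xargs -n1`) yields only real words.
per_run = line.split()
print(per_run)
```

Each empty string in the first result corresponds to one of the blank lines that end up being counted as a "word".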

音盲 2024-08-02 18:00:32

vim 60

    :1,$!tr " " "\n"|tr -d "\t "|sort|uniq -c|sort -n|tail -n 10

岁月蹉跎了容颜 2024-08-02 18:00:32

Vim 36

:%s/\W/\r/g|%!sort|uniq -c|sort|tail

鹿港小镇 2024-08-02 18:00:32

Haskell, 102 characters (wow, so close to matching the original):

import List
(take 10.map snd.sort.map(\(x:y)->(-length y,x)).group.sort.words)`fmap`readFile"input.txt"

J, only 55 characters:

10{.\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'

(I've yet to figure out how to elegantly perform text manipulations in J... it's much better at array-structured data.)


   NB. read the file
   <1!:1<'input.txt'
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------...
|    Washington     Washington     Adams     Jefferson     Jefferson     Madison     Madison     Monroe     Monroe     John Quincy Adams     Jackson     Jackson     Van Buren     Harrison DIES     Tyler     Polk     Taylor DIES     Fillmore     Pierce     ...
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------...
   NB. split into lines
   <;._2[1!:1<'input.txt'
+--------------+--------------+---------+-------------+-------------+-----------+-----------+----------+----------+---------------------+-----------+-----------+-------------+-----------------+---------+--------+---------------+------------+----------+----...
|    Washington|    Washington|    Adams|    Jefferson|    Jefferson|    Madison|    Madison|    Monroe|    Monroe|    John Quincy Adams|    Jackson|    Jackson|    Van Buren|    Harrison DIES|    Tyler|    Polk|    Taylor DIES|    Fillmore|    Pierce|    ...
+--------------+--------------+---------+-------------+-------------+-----------+-----------+----------+----------+---------------------+-----------+-----------+-------------+-----------------+---------+--------+---------------+------------+----------+----...
   NB. split into words
   ;;:&.><;._2[1!:1<'input.txt'
+----------+----------+-----+---------+---------+-------+-------+------+------+----+------+-----+-------+-------+---+-----+--------+----+-----+----+------+----+--------+------+--------+-------+-------+----+-------+-----+-----+-----+--------+----+------+---...
|Washington|Washington|Adams|Jefferson|Jefferson|Madison|Madison|Monroe|Monroe|John|Quincy|Adams|Jackson|Jackson|Van|Buren|Harrison|DIES|Tyler|Polk|Taylor|DIES|Fillmore|Pierce|Buchanan|Lincoln|Lincoln|DIES|Johnson|Grant|Grant|Hayes|Garfield|DIES|Arthur|Cle...
+----------+----------+-----+---------+---------+-------+-------+------+------+----+------+-----+-------+-------+---+-----+--------+----+-----+----+------+----+--------+------+--------+-------+-------+----+-------+-----+-----+-----+--------+----+------+---...
   NB. count repetitions
   |:~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'
+----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------...
|2         |2    |2        |2      |2     |1   |1     |2      |1  |1    |2       |6   |1    |1   |1     |1       |1     |1       |2      |3      |2    |1    |1       |1     |2        |2       |2        |1   |2     |1      |1       |1     |4  |2     |2     ...
+----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------...
|Washington|Adams|Jefferson|Madison|Monroe|John|Quincy|Jackson|Van|Buren|Harrison|DIES|Tyler|Polk|Taylor|Fillmore|Pierce|Buchanan|Lincoln|Johnson|Grant|Hayes|Garfield|Arthur|Cleveland|McKinley|Roosevelt|Taft|Wilson|Harding|Coolidge|Hoover|FDR|Truman|Eisenh...
+----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------...
   NB. sort
   |:\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'
+----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----...
|6   |4  |3      |3   |2     |2         |2     |2        |2     |2    |2     |2       |2      |2      |2        |2      |2       |2    |2         |2      |2        |2    |1  |1    |1     |1   |1     |1   |1     |1    |1      |1   |1     |1    |1      |1   ...
+----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----...
|DIES|FDR|Johnson|Bush|Wilson|Washington|Truman|Roosevelt|Reagan|Nixon|Monroe|McKinley|Madison|Lincoln|Jefferson|Jackson|Harrison|Grant|Eisenhower|Clinton|Cleveland|Adams|Van|Tyler|Taylor|Taft|Quincy|Polk|Pierce|Obama|Kennedy|John|Hoover|Hayes|Harding|Garf...
+----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----...
   NB. take 10
   10{.\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'
+-+----------+
|6|DIES      |
+-+----------+
|4|FDR       |
+-+----------+
|3|Johnson   |
+-+----------+
|3|Bush      |
+-+----------+
|2|Wilson    |
+-+----------+
|2|Washington|
+-+----------+
|2|Truman    |
+-+----------+
|2|Roosevelt |
+-+----------+
|2|Reagan    |
+-+----------+
|2|Nixon     |
+-+----------+

云裳 2024-08-02 18:00:32

Perl: 90

Perl: 114 (Including perl, command-line switches, single quotes and filename)

perl -nle'$h{$_}++for split/ /;END{$i++<=10?print"$h{$_} $_":0for reverse sort{$h{$a}cmp$h{$b}}keys%h}' input.txt

空气里的味道 2024-08-02 18:00:32

The lack of AWK is disturbing.

xargs -n1<input.txt|awk '{c[$1]++}END{for(p in c)print c[p],p|"sort|tail"}'

75 characters.

If you want to get a bit more AWKy, you can forget xargs:

awk -v RS='[^a-zA-Z]' /./'{c[$1]++}END{for(p in c)print c[p],p|"sort|tail"}' input.txt
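
The second variant works by treating every non-letter character as a record separator and skipping empty records with the `/./` pattern. A rough Python counterpart of that splitting step, with a hypothetical helper name `letters_only_words` and a made-up three-line sample, might look like this:

```python
import re


def letters_only_words(text):
    """Return maximal runs of ASCII letters; everything else acts as a separator."""
    # Matching the runs directly means no empty records are produced,
    # which is the job the /./ pattern does in the awk one-liner.
    return re.findall(r"[a-zA-Z]+", text)


sample = "    Harrison \n    DIES\n    Van Buren\n"
print(letters_only_words(sample))
```

Note that this definition of a word also splits on digits and punctuation, which is harmless for this input but would matter for text containing names such as "O'Brien".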

没有你我更好 2024-08-02 18:00:32

My best try with ruby so far, 166 chars:

h = Hash.new
File.open('f.l').each_line{|l|l.split(/ /).each{|e|h[e]==nil ?h[e]=1:h[e]+=1}}
h.sort{|a,b|a[1]<=>b[1]}.last(10).each{|e|puts"#{e[1]} #{e[0]}"}

I am surprised that no one has posted a crazy J solution yet.

梦情居士 2024-08-02 18:00:32

Here's a compressed version of the shell script. For a reasonable interpretation of the input data (no leading or trailing blanks), the second 'tr' and the 'sed' command in the original do not change the data (verified by inserting 'tee out.N' at suitable points and checking the output file sizes, which were identical). The shell needs fewer spaces than humans do, and using cat instead of input I/O redirection wastes space.

tr \  \\n<input.txt|sort|uniq -c|sort -n|tail -10

This weighs in at 50 characters including newline at end of script.

With two more observations (pulled from other people's answers):

  1. tail on its own is equivalent to 'tail -10', and
  2. in this case, numeric and alpha sorting are equivalent,

this can be shrunk by a further 7 characters (to 43 including trailing newline):

tr \  \\n<input.txt|sort|uniq -c|sort|tail

Using 'xargs -n1' (with no command prefix given) instead of 'tr' is extremely clever; it deals with leading, trailing and multiple embedded spaces (which this solution does not).

vim 38 and works for all input

:%!xargs -n1|sort|uniq -c|sort -n|tail

神经大条 2024-08-02 18:00:32

Python 2.6, 104 chars:

l=open("input.txt").read().split()
for c,n in sorted(set((l.count(w),w) for w in l if w))[-10:]:print c,n

心不设防 2024-08-02 18:00:32

python 3.1 (88 chars)

import collections
collections.Counter(open('input.txt').read().split()).most_common(10)

妳是的陽光 2024-08-02 18:00:32

Perl 86 characters

94, if you count the input filename.

perl -anE'$_{$_}++for@F;END{say"$_{$_} $_"for@{[sort{$_{$b}<=>$_{$a}}keys%_]}[0..10]}' test.in

If you don't care how many results you get, then it's only 75, excluding the filename.

perl -anE'$_{$_}++for@F;END{say"$_{$_} $_"for sort{$_{$b}<=>$_{$a}}keys%_}' test.in

揽月 2024-08-02 18:00:32

Ruby 66B

puts (a=$<.read.split).uniq.map{|x|"#{a.count x} "+x}.sort.last 10

千仐 2024-08-02 18:00:32

Ruby

115 chars

w = File.read($*[0]).split
w.uniq.map{|x| [w.select{|y|x==y}.size,x]}.sort.last(10).each{|z| puts "#{z[1]} #{z[0]}"}

大姐,你呐 2024-08-02 18:00:32

Windows Batch File

This is obviously not the smallest solution, but I decided to post it anyway, just for fun. :) NB: the batch file uses a temporary file named $ for storing temporary results.

Original uncompressed version with comments:

@echo off
setlocal enableextensions enabledelayedexpansion

set infile=%1
set cnt=%2
set tmpfile=$
set knownwords=

rem Calculate word count
for /f "tokens=*" %%i in (%infile%) do (
  for %%w in (%%i) do (

    rem If the word hasn't already been processed, ...
    echo !knownwords! | findstr "\<%%w\>" > nul
    if errorlevel 1 (

      rem Count the number of the word's occurrences and save it to a temp file
      for /f %%n in ('findstr "\<%%w\>" %infile% ^| find /v "" /c') do (
        echo %%n^|%%w >> %tmpfile%
      )

      rem Then add the word to the known words list
      set knownwords=!knownwords! %%w
    )
  )
)

rem Print top 10 word count
for /f %%i in ('sort /r %tmpfile%') do (
  echo %%i
  set /a cnt-=1
  if !cnt!==0 goto end
)

:end
del %tmpfile%

Compressed & obfuscated version, 317 characters:

@echo off&setlocal enableextensions enabledelayedexpansion&set n=%2&set l=
for /f "tokens=*" %%i in (%1)do for %%w in (%%i)do echo !l!|findstr "\<%%w\>">nul||for /f %%n in ('findstr "\<%%w\>" %1^|find /v "" /c')do echo %%n^|%%w>>$&set l=!l! %%w
for /f %%i in ('sort /r $')do echo %%i&set /a n-=1&if !n!==0 del $&exit /b

This can be shortened to 258 characters if echo is already off and command extensions and delayed variable expansion are on:

set n=%2&set l=
for /f "tokens=*" %%i in (%1)do for %%w in (%%i)do echo !l!|findstr "\<%%w\>">nul||for /f %%n in ('findstr "\<%%w\>" %1^|find /v "" /c')do echo %%n^|%%w>>$&set l=!l! %%w
for /f %%i in ('sort /r $')do echo %%i&set /a n-=1&if !n!==0 del $&exit /b

Usage:

> filename.bat input.txt 10 & pause

Output:

6|DIES
4|FDR
3|Johnson
3|Bush
2|Wilson
2|Washington
2|Truman
2|Roosevelt
2|Reagan
2|Nixon
