Code Golf 4th of July Edition: Counting the Top Ten Occurring Words
Given the following list of presidents, do a top ten word count in the smallest program possible:
INPUT FILE
Washington Washington Adams Jefferson Jefferson Madison Madison Monroe Monroe John Quincy Adams Jackson Jackson Van Buren Harrison DIES Tyler Polk Taylor DIES Fillmore Pierce Buchanan Lincoln Lincoln DIES Johnson Grant Grant Hayes Garfield DIES Arthur Cleveland Harrison Cleveland McKinley McKinley DIES Teddy Roosevelt Teddy Roosevelt Taft Wilson Wilson Harding Coolidge Hoover FDR FDR FDR FDR Dies Truman Truman Eisenhower Eisenhower Kennedy DIES Johnson Johnson Nixon Nixon ABDICATES Ford Carter Reagan Reagan Bush Clinton Clinton Bush Bush Obama
To start it off, in bash, 97 characters:
cat input.txt | tr " " "\n" | tr -d "\t " | sed 's/^$//g' | sort | uniq -c | sort -n | tail -n 10
Output:
2 Nixon
2 Reagan
2 Roosevelt
2 Truman
2 Washington
2 Wilson
3 Bush
3 Johnson
4 FDR
7 DIES
Break ties as you see fit! Happy fourth!
For those of you who care, more information on presidents can be found here.
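For comparison with the golfed entries, here is an ungolfed sketch of the task in Python. The function name top_words, the sample string, and the tie-breaking rule (alphabetical within equal counts) are assumptions made for illustration, since the question leaves ties open.

```python
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent whitespace-separated words as (word, count) pairs."""
    counts = Counter(text.split())
    # Sort by descending count, then alphabetically, so ties break deterministically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# A short excerpt from the end of the input list, used only as sample data.
sample = "Bush Clinton Clinton Bush Bush Obama"
for word, count in top_words(sample):
    print(count, word)
```

On the excerpt this prints 3 Bush, 2 Clinton and 1 Obama, one pair per line. Reading the real input would be a matter of passing the contents of input.txt instead of the sample string.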
C#, 153:
Reads in the file at p and prints results to the console:
If merely producing the list but not printing to the console, it's 93 characters.
A shorter shell version:
If you want case insensitive ranking, change uniq -c into uniq -ci.
Slightly shorter still, if you're happy about the rank being reversed and readability impaired by lack of spaces. This clocks in at 46 characters:
(You could strip this down to 38 if you were allowed to rename the input file to simply "i" first.)
Observing that, in this special case, no word occurs more than 9 times, we can shave off 3 more characters by dropping the '-n' argument from the final sort:
That takes this solution down to 43 characters without renaming the input file. (Or 35, if you do.)
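The reason the '-n' can go is that comparing count-prefixed lines as plain text agrees with numeric order as long as every count has the same number of digits. A small Python sketch of that idea follows; the helper names by_text and by_number are made up for illustration, and the unpadded "count word" lines are a simplification rather than the exact output of uniq -c.

```python
def by_text(lines):
    """Order lines by plain string comparison (no numeric interpretation)."""
    return sorted(lines)


def by_number(lines):
    """Order lines by the integer in the first field."""
    return sorted(lines, key=lambda line: int(line.split()[0]))


# All counts are single digits: both orderings agree.
single_digit = ["3 Bush", "7 DIES", "4 FDR", "2 Nixon"]
print(by_text(single_digit) == by_number(single_digit))  # True

# One count reaches two digits: text order puts "10" before "9".
two_digit = ["9 Bush", "10 DIES"]
print(by_text(two_digit) == by_number(two_digit))  # False
```

So the shortcut is safe for this input, but it is a property of the data rather than of the pipeline.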
Using xargs -n1 to split the file into one word on each line is preferable to the tr \ \\n solution, as that creates lots of blank lines. This means that the solution is not correct, because it misses out Nixon and shows a blank string showing up 256 times. However, a blank string is not a "word".
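The blank-line problem can be reproduced outside the shell. The Python sketch below contrasts turning every single space into a line break with splitting on runs of whitespace, which is roughly the effect described for xargs -n1 (xargs also interprets quotes and backslashes, which this sketch ignores). The function names and the sample string are assumptions for illustration.

```python
def split_on_spaces(text):
    """Turn each single space into a line break, then split into lines."""
    return text.replace(" ", "\n").split("\n")


def split_on_whitespace(text):
    """Treat any run of whitespace as one separator."""
    return text.split()


# Double space, trailing space before a newline, and a leading space on the next line.
text = "Nixon  Nixon \n ABDICATES"

print(split_on_spaces(text))      # ['Nixon', '', 'Nixon', '', '', 'ABDICATES']
print(split_on_whitespace(text))  # ['Nixon', 'Nixon', 'ABDICATES']
```

Every empty string in the first result would be counted as a "word" by a later sort and uniq -c step, which is how a blank entry can end up at the top of the ranking.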
vim 60
Vim 36
Haskell, 102 characters (wow, so close to matching the original):
J, only 55 characters:
(I've yet to figure out how to elegantly perform text manipulations in J... it's much better at array-structured data.)
Perl: 90
Perl: 114 (Including perl, command-line switches, single quotes and filename)
The lack of AWK is disturbing.
75 characters.
If you want to get a bit more AWKy, you can forget xargs:
My best try with ruby so far, 166 chars:
I am surprised that no one has posted a crazy J solution yet.
Here's a compressed version of the shell script, observing that, for a reasonable interpretation of the input data (no leading or trailing blanks), the second 'tr' and the 'sed' command in the original do not change the data (verified by inserting 'tee out.N' at suitable points and checking the output file sizes: identical). The shell needs fewer spaces than humans do, and using cat instead of input I/O redirection wastes space.
This weighs in at 50 characters including newline at end of script.
With two more observations (pulled from other people's answers):
tail on its own is equivalent to 'tail -10', and
this can be shrunk by a further 7 characters (to 43 including trailing newline):
Using 'xargs -n1' (with no command prefix given) instead of 'tr' is extremely clever; it deals with leading, trailing and multiple embedded spaces (which this solution does not).
vim 38 and works for all input
Python 2.6, 104 chars:
python 3.1 (88 chars)
Perl
86 characters
94, if you count the input filename.
If you don't care how many results you get, then it's only 75, excluding the filename.
Ruby 66B
Ruby
115 chars
Windows Batch File
This is obviously not the smallest solution, but I decided to post it anyway, just for fun. :) NB: the batch file uses a temporary file named $ for storing temporary results.
Original uncompressed version with comments:
Compressed & obfuscated version, 317 characters:
Usage:
Output:
)do echo %%i&set /a n-=1&if !n!==0 del $&exit /b
This can be shortened to 258 characters if echo is already off and command extensions and delayed variable expansion are on:
Usage:
Output: