Sort lines by length using Automator?
I have a text document (1MB, TXT file) with a little more than 17,500 lines. What I'm hoping to be able to do is to sort those lines by character length and have it output to either the same file (which is then saved) or a new file entirely. Either one works fine as long as I know ahead of time.
Bonus points if I could do it through Automator in OS X in some way, as my coding/terminal abilities are... lacking.
Comments (2)
I converted the file to XML then used XSLT to order the entries based on string length. It was a really long way around but it worked.
awk '{printf "%7d %s\n", length($0), $0}' file | sort -n | sed -e 's/^....... //' > newfile
Print each line with its length in front of it, in an 8-character field (7 digits plus a space).
Sort that output numerically.
Strip the 8 characters off the front of each line.
This works if each line of your file has fewer than 10 million characters, since the 7-digit field holds lengths up to 9,999,999. Your whole file is less than 1MB, so that must be true.
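To see the three steps at work on something small, here is a minimal sketch that wraps the same pipeline in a shell function and feeds it three sample lines. The function name `sort_by_length` and the sample input are made up for illustration; they are not part of the original command.

```shell
# Hypothetical helper (name is an assumption): reads lines on stdin and
# writes them to stdout, shortest line first.
sort_by_length() {
  # Step 1: awk prefixes each line with its length (7 digits and a space).
  # Step 2: sort -n orders the lines by that numeric prefix.
  # Step 3: sed strips the 8-character prefix again.
  awk '{ printf "%7d %s\n", length($0), $0 }' |
    sort -n |
    sed -e 's/^....... //'
}

# Sample input, made up for this example.
printf 'ccc\na\nbb\n' | sort_by_length
# prints:
# a
# bb
# ccc
```

For a real file the call would look like `sort_by_length < file > newfile`. Since the question asks about Automator: the one-line pipeline should also be usable inside Automator's "Run Shell Script" action, though I have not tested that with this file.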