
Emacs: How do I generate a word list for a document?

Posted 2024-11-29 17:13:13

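I want to use RefTeX to generate an index for a LaTeX document, following the suggestion in the RefTeX manual:

"...you may want to start with a word list of the document and remove all words that should not be indexed." (see the manual section on collecting phrases for the index phrases file).

Now I am asking myself: how do I generate such a word list for my multi-file LaTeX document? I found no answer in the Emacs manual or on the web. But surely Emacs can do this, right?

Thanks for any hints.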


晌融 2024-12-06 17:13:13

A quick-and-dirty way (on the command line, not in Emacs):

sed -E 's/ +/\n/g' myDocument.txt | sort -f | uniq > wordListToEdit.txt
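
For a multi-file LaTeX project, the same idea (split into words, sort ignoring case, drop duplicates) can be sketched in Python. This is only an illustration, not part of the answer above: the function names `word_list` and `word_list_for_files` and the two regular expressions are assumptions, and the LaTeX filter is deliberately crude. Unlike the pipeline above, it also merges spellings that differ only in case.

```python
import re
from pathlib import Path

# Crude filter for LaTeX control sequences such as \section or \cite (assumption).
LATEX_COMMAND = re.compile(r"\\[A-Za-z]+\*?")
# A "word" is a run of letters, optionally joined by inner hyphens or apostrophes.
WORD = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def word_list(text):
    """Return the unique words in text, sorted case-insensitively."""
    text = LATEX_COMMAND.sub(" ", text)
    unique = {}
    for word in WORD.findall(text):
        # Treat "Emacs" and "emacs" as one entry; keep the first spelling seen.
        unique.setdefault(word.lower(), word)
    return sorted(unique.values(), key=str.lower)


def word_list_for_files(paths):
    """Build one combined word list from several files (for example, all .tex files)."""
    text = "\n".join(Path(p).read_text(encoding="utf-8") for p in paths)
    return word_list(text)
```

Something like `word_list_for_files(sorted(Path(".").glob("*.tex")))` would cover every .tex file in the current directory. Environment names and labels (the `document` in `\begin{document}`, for instance) still end up in the list, so it remains a starting point for manual pruning.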


风铃鹿 2024-12-06 17:13:13

I found a solution that is independent of Emacs, but it produces a file containing every token found in the document(s), not a deduplicated word list.
I marked all the .tex files of my LaTeX project in Emacs Dired and then used

! myshellscript

to run the following script on each of them.
You can find more information about nltk and Python here: http://www.nltk.org/

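#!/usr/bin/env bash
# Print the script name and the file being processed (debug output).
echo "$0"
echo "$1"

# Tokenize the file passed as the first argument and append the token list to ./tok.
# Needs Python 3 with nltk and its tokenizer data installed.
python3 -c "
import sys, nltk
with open(sys.argv[1]) as f:
    raw = f.read()
print(nltk.word_tokenize(raw))
" "$1" >> tok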

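The token list produced this way still contains punctuation, numbers, and fragments of LaTeX markup. A minimal sketch of how such tokens could be reduced to candidate index words follows; the function name `candidate_index_words` and the `min_length` parameter are hypothetical, and the function expects a list of tokens in memory (such as the return value of `nltk.word_tokenize`), not the `tok` file itself.

```python
from collections import Counter


def candidate_index_words(tokens, min_length=3):
    """Count alphabetic tokens case-insensitively.

    Returns (word, count) pairs, most frequent first; ties are ordered
    alphabetically. Tokens shorter than min_length are skipped.
    """
    counts = Counter(
        tok.lower()
        for tok in tokens
        if tok.isalpha() and len(tok) >= min_length
    )
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Sorting by frequency is only one possible design choice; an alphabetical list may be easier to prune by hand when preparing the index phrases file.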
