Extract text from tex, removing LaTeX tags

Posted on 2024-07-19 04:41:47

I have some .tex files from which I want to extract the plain text, without any LaTeX commands such as \section{...} or \newpage.
Does anybody have an idea how to achieve this?
I also have the .pdf file, but when I copy the text from there, some words get concatenated, which is really bad.
Is there a tool you know of?

Comments (3)

青春有你 2024-07-26 04:41:47

detex(1):

Please see the OpenDetex GitHub page for the latest version of OpenDetex. It is a more modern derivative of my original DeTeX.

My legacy DeTeX home page is available at https://www.cs.purdue.edu/homes/trinkle/detex/index-legacy.html.

If you just want the legacy detex-2.8.tar source, you can get it here.

七分※倦醒 2024-07-26 04:41:47

OpenDetex is available for both Windows and Linux.

Download OpenDetex from here:
http://opendetex.googlecode.com/files/opendetex-2.8.1.tar.bz2
http://code.google.com/p/opendetex/downloads/list

Usage:
http://code.google.com/p/opendetex/wiki/Usage

Extract it to any directory of your choice. Say you extract it to the Downloads directory.

Optionally, create another directory inside it (any name will do, but it keeps things tidy). Say the directory is named "my_paper". Put your paper in the "my_paper" directory. Say your paper is named project.tex.

Change into the OpenDetex directory:

cd ~/Downloads/opendetex

Run the command:

detex -n my_paper/project.tex > out.txt

The generic form is:

detex -n full_path_to_tex_file.tex > output_text_file.txt
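
For anyone who would rather script this than type the redirect by hand, here is a minimal Python sketch of the same call. It assumes a detex executable is already built and on your PATH. The function names detex_command and detex_to_file are made up for this illustration and are not part of OpenDetex. The -n option tells detex not to follow \input and \include commands.

```python
import shutil
import subprocess
from pathlib import Path


def detex_command(tex_file, follow_inputs=False, executable="detex"):
    """Build the argument list for one detex call; nothing is executed here."""
    args = [executable]
    if not follow_inputs:
        # -n: do not follow \input and \include, as in the command above
        args.append("-n")
    args.append(str(tex_file))
    return args


def detex_to_file(tex_file, out_file, executable="detex"):
    """Run detex on tex_file and write its standard output to out_file."""
    if shutil.which(executable) is None:
        # Assumption: the executable must be reachable through PATH
        raise FileNotFoundError(f"{executable} was not found on PATH")
    result = subprocess.run(
        detex_command(tex_file, executable=executable),
        capture_output=True,  # detex prints the stripped text on stdout
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    out_path = Path(out_file)
    out_path.write_text(result.stdout, encoding="utf-8")
    return out_path
```

Calling detex_to_file("my_paper/project.tex", "out.txt") should then be roughly equivalent to the redirected command shown above.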

凉城 2024-07-26 04:41:47

Maybe not 100% what the OP requested, but maybe it is of some help.

There is pdftotext in poppler-utils. This can convert a PDF file to a TXT file via

pdftotext yourPDF.pdf

Of course this incurs the overhead of installing this package, but I think it's negligible: if I remember correctly, it is the standard library for rendering PDF on Linux, so if you have a PDF viewer installed (think Evince or Okular), it will already be there.

More instructions can be found here.
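
If you want to run the conversion from a script, here is a minimal Python sketch wrapping the same pdftotext call. It assumes pdftotext from poppler-utils is on your PATH. The helper names below are invented for this illustration. The -layout option is a real pdftotext flag that tries to preserve the physical page layout; whether it helps with the concatenated words from the question is not something this sketch can promise.

```python
import shutil
import subprocess
from pathlib import Path


def pdftotext_command(pdf_file, txt_file=None, keep_layout=False,
                      executable="pdftotext"):
    """Build the argument list for one pdftotext call; nothing is executed."""
    args = [executable]
    if keep_layout:
        # -layout: keep the original physical layout as far as possible
        args.append("-layout")
    args.append(str(pdf_file))
    if txt_file is not None:
        args.append(str(txt_file))
    return args


def expected_output_path(pdf_file, txt_file=None):
    """With no output name given, pdftotext writes <name>.txt next to the PDF."""
    if txt_file is not None:
        return Path(txt_file)
    return Path(pdf_file).with_suffix(".txt")


def pdf_to_text(pdf_file, txt_file=None, keep_layout=False,
                executable="pdftotext"):
    """Convert one PDF and return the path of the text file it produced."""
    if shutil.which(executable) is None:
        # Assumption: poppler-utils is installed and reachable through PATH
        raise FileNotFoundError(f"{executable} was not found on PATH")
    subprocess.run(
        pdftotext_command(pdf_file, txt_file, keep_layout, executable),
        check=True,  # raise CalledProcessError on a non-zero exit
    )
    return expected_output_path(pdf_file, txt_file)
```

Calling pdf_to_text("yourPDF.pdf") runs the same command as above and returns the path yourPDF.txt.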
