Why doesn't TeX/LaTeX get faster on subsequent runs?

Posted 2024-08-30 12:06:04

I really wonder why even recent TeX/LaTeX systems do not use any caching to speed up later runs. Every time I fix a single comma*, calling LaTeX costs me about the same amount of time, because it needs to load and convert every single picture file.

(* I know that even changing a tiny comma could affect the whole structure, but a well-written cache format could surely detect the impact of that. Also, there might be situations where 100% correctness is not needed as long as it's fast.)

Is there something in the TeX language that makes this complicated or impossible to accomplish, or is it just that the original implementation of TeX had no need for this (because it would have been slow anyway on those large computers)?

On the other hand, why doesn't this annoy other people so much that they have started a fork with some sort of caching (or a transparent conversion of TeX files to a format that is faster to parse)?

Is there anything I can do to speed up subsequent runs of LaTeX, apart from putting everything into chapterXX.tex files and then commenting them out?

Comments (6)

国际总奸 2024-09-06 12:06:04

Let's try to understand how TeX works. What happens when you write the following?

tex.exe myfile.tex

TeX reads your file byte by byte. First of all, TeX converts each character into a pair <category code, character code>. The category code says whether the character is, for example, an opening brace ({), a switch into math mode ($), an active character that behaves like a macro (~, for example), or a letter (A-Z, a-z).

If TeX meets a character with category code 11 (letter) or 12 (other: digits, commas, periods) while it is between paragraphs, it starts a new paragraph. What you want is to cache all paragraphs.

Suppose you changed something in your document. How can TeX check that all paragraphs after your change are still the same? Maybe you changed the category code of some character. Maybe you changed the meaning of some macro. Or you removed a } somewhere and thus changed the current font.


To be sure that a paragraph is the same, you must be sure that all characters in the paragraph are the same, that all their category codes are the same, that the current font is the same, that all math fonts are the same, and that the values of many internal variables are the same (for example, \hsize, \vsize, \pretolerance, \tolerance, \hyphenpenalty, \exhyphenpenalty, \widowpenalty, \spaceskip, ...).
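To make the point concrete, here is a minimal sketch (assuming an ordinary pdfLaTeX setup; the macro \product and the sample sentences are made up for illustration). Each pair of lines has identical or nearly identical source text, yet typesets differently because of state set earlier:

```latex
\documentclass{article}
\begin{document}

% 1. Category code: ~ is normally active (13) and acts as a non-breaking space.
Costs 5~units.

{\catcode`\~=12 Costs 5~units.} % here ~ is an ordinary character (category 12)

% 2. Macro meaning: \product is a hypothetical macro used only for this demo.
\newcommand{\product}{Widget}
The \product{} ships today.

\renewcommand{\product}{Gadget}
The \product{} ships today.

% 3. Grouping: moving one closing brace changes the font of the later words.
{\itshape Note:} this is upright.

{\itshape Note: this is italic.}

% 4. Internal parameters: the same words are broken into lines differently.
{\hsize=5cm \tolerance=10000
The same sentence is broken into lines in a different way here.\par}

\end{document}
```

A cache keyed only on the source text of a paragraph would treat both lines of each pair as the same, which is why the surrounding state has to be part of the key.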

You can only be sure that the paragraphs before your change are the same. But to make use of that, you must save the complete state after each paragraph.

Your SuperCachedTeX system would be very complicated, wouldn't it?

弥枳 2024-09-06 12:06:04

If you're using pdftex, then you can use --draftmode on the command line for the preliminary runs. This instructs pdftex not to generate a PDF.
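As a rough sketch of the same idea from inside the document: pdfTeX also has a \pdfdraftmode primitive that, as far as I know, has the same effect as the command-line switch. This assumes pdfLaTeX (other engines name the primitive differently or lack it), and the section title and label are placeholders:

```latex
% Preliminary pass: auxiliary files are written, but no PDF is produced.
% Command-line equivalent from the answer above: pdflatex --draftmode main.tex
\pdfdraftmode=1   % pdfTeX primitive; set it before any page is shipped out
\documentclass{article}
\begin{document}
\tableofcontents
\section{Results}\label{sec:results}
See Section~\ref{sec:results} for details.
\end{document}
```

For the final pass, remove the \pdfdraftmode line (or drop the switch) so that the PDF is actually written.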

Of course lots of things could be cached (like graphics information, for instance), but the way TeX works makes it hard to do. There is a rather complex initialization of TeX when it starts up, and one TeX run always means exactly one PDF written out. In order to do caching, you need to keep the data in memory (to be efficient).

You could use IPC and talk to a daemon to get the cached information, but that would involve lots of programming. For normal purposes TeX is so blazingly fast that this would not gain much. On the other hand, this is a good question, as I have seen LaTeX runs (on current hardware) that took more than 10 hours and would have benefited from caching.

雨落□心尘 2024-09-06 12:06:04

Yet another answer, not strictly related:

You can use the LaTeX macro \include{...} together with \includeonly{...} to rerun your document for a subset only. But this is not caching, nor does it give you the complete document.
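A minimal sketch of that setup (the file names chapter01 to chapter03 are hypothetical):

```latex
\documentclass{book}
% Only the listed files are typeset on this run.
% Comment this line out to build the complete document again.
\includeonly{chapter02}
\begin{document}
\include{chapter01}  % skipped, but its existing chapter01.aux is still read
\include{chapter02}  % typeset
\include{chapter03}  % skipped
\end{document}
```

Because the .aux files of the skipped chapters are still read, page numbers and cross-references into them stay as they were after the last full run.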

你在我安 2024-09-06 12:06:04

There are solutions such as preview-latex, which pre-compile material into a dedicated format file for speed. You need to remember that TeX optimises pages on a local basis. There is no concept at the engine level of material being fixed on a particular page, so you can't just "re-TeX one page".

仅一夜美梦 2024-09-06 12:06:04

Actually, the correct answer is (IMO): LaTeX already caches information in its output files (.aux, plus additional files written by other packages). So if you add a comma, this information is reused, and the typesetting run is much faster than it would be without the .aux file.

无所谓啦 2024-09-06 12:06:04

TeX does have a caching facility, namely format files, and I think, pace Alexey's valuable summary of the problems of representing TeX's state, it should be possible to use them to allow resumption of editing after any page eject.

The major issue is that page breaks affect paragraphs and floats, and these may not occur at a particular point in the text, but may occur during the execution of macros whose invocation depended on the transient state passed to them at that moment.

So to make the idea of creating "breakpoints" work, one would need to hack TeX's internals to dump additional information, beyond what is normally dumped into format files, and package it up with the state of the auxiliary files. Given what Joseph says about TeX fragment previewers, why would anyone bother hacking TeX to do this?
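Format files can already be used in a much more limited way, to cache just the preamble. The following is a sketch of one such setup using the mylatexformat package, not the breakpoint scheme described above. The file name main.tex, the format name main and the loaded packages are assumptions, and the build command in the comments is quoted from memory of that package's documentation, so check it before relying on it:

```latex
%&main
% Hypothetical main.tex. The first line asks TeX to load main.fmt, which
% must have been built once beforehand, e.g. with something like:
%   pdftex -ini -jobname="main" "&pdflatex" mylatexformat.ltx main.tex
\documentclass{article}
\usepackage{graphicx}
\usepackage{amsmath}
\endofdump  % mylatexformat marker: everything above is stored in main.fmt
\begin{document}
Only this part is processed again on every run.
\end{document}
```

The format has to be rebuilt whenever anything above \endofdump changes, and the file will not compile without it as written, because \endofdump is otherwise undefined.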
