Multilingual LaTeX document with dozens of languages

Posted on 2025-02-05 16:45:07

I'm a technical writer trying to output a Python-Sphinx website as a .pdf via LaTeX. The manual has a safety regulations and environmental compliance section containing 40+ languages. These languages all appear as-is in the base file, and .rst files have the same Unicode support as .txt, so if Bulgarian renders correctly in Cyrillic in the base file, I'm assuming it's encoded correctly.

I already know to use either LuaLaTeX or XeLaTeX to render Unicode properly, and I've already found that TeX files generated from Sphinx/.rst render better under LuaLaTeX. Even so, under LuaLaTeX, the Greek and Cyrillic don't render at all (nor do accented letters, but for some reason Germanic eth/ð does render).

Everything I've seen on multi-language support involves one of several packages that require you to bracket each section with something like \begin{Russian}, and I would have to do that for all 40+ languages. Because the base file is in a different format and the .tex file is generated automatically, every time I update the manual the regenerated file would overwrite all the work I've done.

The best solution for me would be to put all the multi-language support in the header, and just say "hey dumb dumb... just render the Unicode text as-is". As it is, the auto-generated frontispiece and ToC are unsatisfactory, so I'm keeping a better header saved in a separate document and pasting it in. Front-loading multi-language support by defining everything in the header would be the ideal solution.

Any help would be good.

The following is the header provided by Python-Sphinx, with minor adjustments:

%% Generated by Sphinx.
\def\sphinxdocclass{report}
\documentclass[letterpaper,10pt,english]{sphinxmanual}

\ifdefined\pdfpxdimen
   \let\sphinxpxdimen\pdfpxdimen\else\newdimen\sphinxpxdimen
\fi \sphinxpxdimen=.75bp\relax
\ifdefined\pdfimageresolution
    \pdfimageresolution= \numexpr \dimexpr1in\relax/\sphinxpxdimen\relax
\fi
%% let collapsible pdf bookmarks panel have high depth per default
\PassOptionsToPackage{bookmarksdepth=5}{hyperref}

\PassOptionsToPackage{warn}{textcomp}
\usepackage[utf8]{inputenc}
\ifdefined\DeclareUnicodeCharacter
% support both utf8 and utf8x syntaxes
  \ifdefined\DeclareUnicodeCharacterAsOptional
    \def\sphinxDUC#1{\DeclareUnicodeCharacter{"#1}}
  \else
    \let\sphinxDUC\DeclareUnicodeCharacter
  \fi
  \sphinxDUC{00A0}{\nobreakspace}
  \sphinxDUC{2500}{\sphinxunichar{2500}}
  \sphinxDUC{2502}{\sphinxunichar{2502}}
  \sphinxDUC{2514}{\sphinxunichar{2514}}
  \sphinxDUC{251C}{\sphinxunichar{251C}}
  \sphinxDUC{2572}{\textbackslash}
\fi

\usepackage{cmap}
\usepackage[T1]{fontenc}
\usepackage{amsmath,amssymb,amstext}
\usepackage{babel}

\usepackage{tgtermes}
\usepackage{tgheros}
\renewcommand{\ttdefault}{txtt}

\usepackage[Bjarne]{fncychap}
\usepackage{sphinx}

\fvset{fontsize=auto}
\usepackage{geometry}

% Include hyperref last.
\usepackage{hyperref}

% Fix anchor placement for figures with captions.
\usepackage{hypcap}% it must be loaded after hyperref.

% Set up styles of URL: it should be placed after hyperref.
\urlstyle{same}

\usepackage{sphinxmessages}

\title{...}
\date{\today}
\release{...}
\author{...}

\makeindex
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}

The document is almost entirely in English except for one dang section near but not at the end:

- Това е българско
- Αυτό είναι ελληνικό
- Tohle je česky
- Bu türkçe
- Þetta er íslenskt

\end{document}

Comments (1)

箹锭⒈辈孓 2025-02-12 16:45:07

Caveat: This won't give correct hyphenation and other special language settings (e.g. French spacing for punctuation marks), but it will show the text. If you want these other features as well, you will have to deal with babel or polyglossia.
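If you do go the babel route, the following is a minimal sketch (not part of the Sphinx-generated header) of how babel under LuaLaTeX can switch locale by script without bracketing each passage. It assumes a reasonably recent babel and uses the `onchar=ids` option of `\babelprovide`; the choice of `bulgarian` and `greek` as the locales for Cyrillic and Greek text is an assumption made for this example. Because the switch is triggered by script, Latin-script languages such as Czech, Turkish, or Icelandic cannot be told apart from English this way and would still need explicit markup to get their own hyphenation.

```latex
% !TeX TS-program = lualatex
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Noto Serif}

\usepackage[english]{babel}
% onchar=ids: switch language (hyphenation patterns etc.) automatically
% when a character of that locale's script is encountered (LuaTeX only).
% Assumption: all Cyrillic text is treated as Bulgarian, all Greek as Greek.
\babelprovide[import, onchar=ids]{bulgarian}
\babelprovide[import, onchar=ids]{greek}

\begin{document}

English text, then Това е българско, then Αυτό είναι ελληνικό.

\end{document}
```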


The Unicode capabilities of XeLaTeX and LuaLaTeX only pay off if you also use a font with good glyph coverage.

For example, with the Noto Serif font:

% !TeX TS-program = lualatex
%% Generated by Sphinx.
\def\sphinxdocclass{report}
\documentclass[letterpaper,10pt,english]{sphinxmanual}

\ifdefined\pdfpxdimen
   \let\sphinxpxdimen\pdfpxdimen\else\newdimen\sphinxpxdimen
\fi \sphinxpxdimen=.75bp\relax
\ifdefined\pdfimageresolution
    \pdfimageresolution= \numexpr \dimexpr1in\relax/\sphinxpxdimen\relax
\fi
%% let collapsible pdf bookmarks panel have high depth per default
\PassOptionsToPackage{bookmarksdepth=5}{hyperref}

\PassOptionsToPackage{warn}{textcomp}
\usepackage[utf8]{inputenc}
\ifdefined\DeclareUnicodeCharacter
% support both utf8 and utf8x syntaxes
  \ifdefined\DeclareUnicodeCharacterAsOptional
    \def\sphinxDUC#1{\DeclareUnicodeCharacter{"#1}}
  \else
    \let\sphinxDUC\DeclareUnicodeCharacter
  \fi
  \sphinxDUC{00A0}{\nobreakspace}
  \sphinxDUC{2500}{\sphinxunichar{2500}}
  \sphinxDUC{2502}{\sphinxunichar{2502}}
  \sphinxDUC{2514}{\sphinxunichar{2514}}
  \sphinxDUC{251C}{\sphinxunichar{251C}}
  \sphinxDUC{2572}{\textbackslash}
\fi

\usepackage{cmap}
\usepackage[T1]{fontenc}
\usepackage{amsmath,amssymb,amstext}
\usepackage{babel}

\usepackage{tgtermes}
\usepackage{tgheros}
\renewcommand{\ttdefault}{txtt}

\usepackage[Bjarne]{fncychap}
\usepackage{sphinx}

\fvset{fontsize=auto}
\usepackage{geometry}

% Include hyperref last.
\usepackage{hyperref}

% Fix anchor placement for figures with captions.
\usepackage{hypcap}% it must be loaded after hyperref.

% Set up styles of URL: it should be placed after hyperref.
\urlstyle{same}

\usepackage{sphinxmessages}

\title{...}
\date{\today}
\release{...}
\author{...}

\makeindex
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\usepackage{fontspec}
\setmainfont{Noto Serif}

\begin{document}

The document is almost entirely in English except for one dang section near but not at the end:

- Това е българско

- Αυτό είναι ελληνικό

- Tohle je česky

- Bu türkçe

- Þetta er íslenskt


\end{document}


(To see which fonts on your computer support the characters you want to use, you can use the command-line tool albatross; see e.g. https://stackoverflow.com/a/69721465/2777074)
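If some of the languages use scripts that a single font does not cover, one possible approach under LuaLaTeX is a fallback chain, so that missing glyphs are taken from other fonts. The sketch below assumes a recent luaotfload that provides `luaotfload.add_fallback`; the chain name `manualfallback` and the two fallback fonts are hypothetical choices for illustration and must be replaced with fonts actually installed on your system.

```latex
% !TeX TS-program = lualatex
\documentclass{article}
\usepackage{fontspec}

% Register a named fallback chain. Glyphs missing from the main font
% are looked up in these fonts, in order.
% Assumption: both fonts are installed; swap in whatever you have.
\directlua{
  luaotfload.add_fallback("manualfallback", {
    "Noto Serif CJK SC:mode=node;",
    "Noto Serif Armenian:mode=node;"
  })
}

% Attach the chain to the main font.
\setmainfont{Noto Serif}[RawFeature={fallback=manualfallback}]

\begin{document}

Това е българско, Αυτό είναι ελληνικό, 这是中文, Սա հայերեն է.

\end{document}
```

Like the Noto Serif example above, this only makes the glyphs appear; it does not add language-specific hyphenation or line-breaking rules.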
