
How can I use non-ASCII characters in Matlab figures (for use in LaTeX documents)?

Posted 2024-10-17 02:40:21

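I am including figures plotted in Matlab in LaTeX documents. My usual workflow is as follows:

1. A Matlab script creates the figure.
2. I adjust whatever needs adjusting in the visual figure editor.
3. The figure is saved as .fig (for future modifications) and as .eps (for inclusion in LaTeX).
4. I convert the .eps file to .pdf.
5. The PDF file is referenced in the LaTeX source.

The point is: when I try to use non-ASCII characters (to be exact, Polish national characters such as "ą", "ę", "ś", "ć") in axis labels, legends, titles and so on, the Matlab figure editor works fine and the characters display correctly. After export to .eps they are all wrong (for example, "Głębokość" becomes "G³êbokoœæ").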
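As a side check, the garbled string is consistent with text stored as Windows-1250 bytes being displayed as Windows-1252. The short Python 3 sketch below reproduces it; the choice of these two code pages is an assumption inferred from the example, not something the Matlab export states.

```python
# Assumption: the label is stored as Windows-1250 (Central European) bytes
# but rendered as if the bytes were Windows-1252 (Western European).
original = "Głębokość"
garbled = original.encode("cp1250").decode("cp1252")
print(garbled)  # G³êbokoœæ
```

This suggests that the bytes in the exported file are intact and that only the encoding used to interpret them is wrong.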

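Is there a way to get this right, either by tweaking Matlab options or by changing my workflow?

Note: I have found that exporting to .png or other non-vector formats handles the character encoding correctly, but I would like to avoid that. I am asking for a way to "stay vector". Direct export to .pdf gives the same result as .eps, that is, the characters come out wrong.

P.S. Matlab is R2008a, the LaTeX source files are compiled with pdflatex, and the .eps files are converted with epstopdf from MiKTeX 2.9 (all on Windows 7).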


失与倦＂ 2024-10-24 02:40:21

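You could take a look at psfrag, which is what I usually use when I put Matlab figures into LaTeX. Basically, you put simple tags into the figure in Matlab and then have those tags replaced with LaTeX text. The big advantage is that this lets you use the same symbols in the text and in the figures.

Edit: While looking for the psfrag URL, I found a Matlab script that simplifies this: LaPrint.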


初雪 2024-10-24 02:40:21

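Another possible solution is matlab2tikz. It creates a tikz/pgfplots source file that can be included directly in your LaTeX source, which means the fonts are rendered by LaTeX itself. You can edit the generated file directly to adjust labels and so on. Unfortunately, it does not work for all MATLAB figures.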


空城缀染半城烟沙 2024-10-24 02:40:21

char(2048) will be shown by `print -depsc` as 'à',
char(5064) as 'á',
char(28808) as 'ç',
char(37000) as 'é',
char(32904) as 'è', ...

For other characters in the Latin-1 character set, look at the output of:

for j = 0:4*64
    clf; subplot(1,1,1); plot(eye(2)); leg = '';
    for i = 4*(j+1)-1:-1:max(1,4*j)   % four rows of 64 characters per figure
        str = ['     ', num2str(i*64)];
        leg(i,:) = [str(end-4:end), ':', char(64*i+(0:63))];
    end
    title(leg, 'interpreter', 'none');   % put the raw characters in the title
    print('-depsc', ['ascii', num2str(j), '.ps']);
end

I am using pdflatex, so psfrag is not an option, and pdfrack seems to be broken.



烂人 2024-10-24 02:40:21

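Exporting a Matlab figure with non-ASCII characters that are in ISO-8859-1 is no problem on Windows, but on Linux with a UTF-8 locale there is a Matlab bug, for which a workaround exists. The question here is about characters that are not in ISO-8859-1, which is trickier. Here is a solution that I posted on a related question.

If fewer than 256 distinct characters are needed (so that they fit an 8-bit format), ideally all from one standard encoding, then one solution is to:

1. Convert the octal codes into Unicode characters;
2. Save the file in the target encoding (an 8-bit format);
3. Add the encoding vector for the target encoding.

For example, if you want to export Polish text, you need to convert the file to ISO-8859-2. Here is an implementation in Python (multi-platform):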

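#!/usr/bin/python
# -*- coding: utf-8 -*-
# Python 2 only: relies on the 'string_escape' codec and str.decode().
import sys, codecs
src = sys.argv[1]
# The output is written as ISO-8859-2 (Latin-2) next to the input file.
fo = codecs.open(src[:-4] + '_latin2.eps', 'w', 'latin2')
# 'string_escape' turns octal escapes such as \305\202 into raw bytes.
with codecs.open(src, 'r', 'string_escape') as fi:
    data = fi.readlines()
with open('ISOLatin2Encoding.ps') as fenc:
    for line in data:
        fo.write(line.decode('utf-8').replace('ISOLatin1Encoding', 'MyEncoding'))
        # Insert the encoding vector right after the page setup section.
        if line.startswith('%%EndPageSetup'):
            fo.write(fenc.read())
fo.close()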

Save this as eps_lat2.py; running the command python eps_lat2.py file.eps, where file.eps is the EPS created by Matlab, then creates file_latin2.eps with Latin-2 encoding. The file ISOLatin2Encoding.ps contains the encoding vector:

/MyEncoding
% The first 144 entries are the same as the ISO Latin-1 encoding.
ISOLatin1Encoding 0 144 getinterval aload pop
% \22x
    /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
    /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef /.notdef
% \24x
    /nbspace /Aogonek /breve /Lslash /currency /Lcaron /Sacute /section
    /dieresis /Scaron /Scedilla /Tcaron /Zacute /hyphen /Zcaron /Zdotaccent
    /degree /aogonek /ogonek /lslash /acute /lcaron /sacute /caron
    /cedilla /scaron /scedilla /tcaron /zacute /hungarumlaut /zcaron /zdotaccent
% \30x
    /Racute /Aacute /Acircumflex /Abreve /Adieresis /Lacute /Cacute /Ccedilla
    /Ccaron /Eacute /Eogonek /Edieresis /Ecaron /Iacute /Icircumflex /Dcaron
    /Dcroat /Nacute /Ncaron /Oacute /Ocircumflex /Ohungarumlaut /Odieresis /multiply
    /Rcaron /Uring /Uacute /Uhungarumlaut /Udieresis /Yacute /Tcedilla /germandbls
% \34x
    /racute /aacute /acircumflex /abreve /adieresis /lacute /cacute /ccedilla
    /ccaron /eacute /eogonek /edieresis /ecaron /iacute /icircumflex /dcaron
    /dcroat /nacute /ncaron /oacute /ocircumflex /ohungarumlaut /odieresis /divide
    /rcaron /uring /uacute /uhungarumlaut /udieresis /yacute /tcedilla /dotaccent
256 packedarray def
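When adapting the vector above to another 8-bit encoding, it can help to list which character each byte position holds in the target code page. The following Python 3 sketch is not part of the original answer, and `code_page_table` is a hypothetical helper name. It prints Unicode character names, which still have to be mapped by hand to PostScript glyph names such as /aogonek.

```python
import unicodedata

def code_page_table(encoding="iso8859_2", start=0xA0, stop=0x100):
    """Return (byte value, character, Unicode name) for each defined byte."""
    rows = []
    for code in range(start, stop):
        try:
            char = bytes([code]).decode(encoding)
        except UnicodeDecodeError:
            continue  # this byte is undefined in the chosen code page
        rows.append((code, char, unicodedata.name(char, "?")))
    return rows

# Print the byte as a PostScript-style octal escape next to its character.
for code, char, name in code_page_table():
    print(f"\\{code:03o}  {char}  {name}")
```

Passing a different codec name, for example `code_page_table("iso8859_5")`, lists the layout of that code page instead.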

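Here is another implementation, for Linux with Bash:

#!/bin/bash
name=$(basename "$1" .eps)
# Decode the octal escapes written by Matlab into UTF-8 text.
ascii2uni -a K "$1" > /tmp/eps_uni.eps
# Re-encode the whole file as ISO-8859-2 (Latin-2).
iconv -t ISO-8859-2 /tmp/eps_uni.eps -o "$name"_latin2.eps
# Append the encoding vector after the page setup and switch the fonts to it.
sed -i -e '/%EndPageSetup/ r ISOLatin2Encoding.ps' -e 's/ISOLatin1Encoding/MyEncoding/' "$name"_latin2.eps

Save this as eps_lat2; running the command sh eps_lat2 file.eps then creates file_latin2.eps with Latin-2 encoding.

Both scripts can easily be adapted to other 8-bit encodings by changing the encoding vector and the iconv (or codecs.open) parameter.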
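The Python script above depends on the `string_escape` codec, which exists only in Python 2. As a rough Python 3 sketch of the same three steps (not the original author's code), something like the following may work. The function names `recode_escapes` and `convert_eps` are made up for illustration, and the sketch assumes that Matlab wrote the non-ASCII text as octal escapes of UTF-8 bytes, as it does under a Linux UTF-8 locale.

```python
import re

# Runs of PostScript octal escapes for bytes >= 0x80, e.g. \305\202.
OCTAL_RUN = re.compile(rb'(?:\\[23][0-7]{2})+')

def recode_escapes(data: bytes, target: str = "iso8859_2") -> bytes:
    """Replace octal-escaped UTF-8 text with raw bytes in the target encoding."""
    def convert(match):
        codes = re.findall(rb'\\([0-7]{3})', match.group(0))
        raw = bytes(int(code, 8) for code in codes)
        try:
            return raw.decode("utf-8").encode(target)
        except UnicodeError:
            # Not valid UTF-8, or not representable in the target: keep as is.
            return match.group(0)
    return OCTAL_RUN.sub(convert, data)

def convert_eps(eps: bytes, vector: bytes, target: str = "iso8859_2") -> bytes:
    """Recode the strings, rename the encoding, and insert the encoding vector."""
    if not vector.endswith(b"\n"):
        vector += b"\n"
    out = []
    for line in eps.splitlines(keepends=True):
        line = recode_escapes(line, target)
        out.append(line.replace(b"ISOLatin1Encoding", b"MyEncoding"))
        # The vector goes right after the page setup section, unmodified.
        if line.startswith(b"%%EndPageSetup"):
            out.append(vector)
    return b"".join(out)
```

To use it, read file.eps and ISOLatin2Encoding.ps in binary mode, pass their contents to `convert_eps`, and write the result to file_latin2.eps, also in binary mode. Working on bytes throughout avoids guessing the encoding of the rest of the EPS file.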

