Umlauts in filenames on OS X (perl)

Posted 2024-11-03 10:40:22

I'm having some trouble with umlauts (the ü character) in filenames on OS X. I'm creating the directory from a Perl script. Conceptually, what I'm doing is:

$NAME = "abcüabc";
$PATH = "/Applications/MyProgram/".$NAME."/";
system('ditto', '--rsrc', $FROMPATH, $PATH . $FILENAME);

This creates the folder with the name "/Applications/MyProgram/abs%9Fabc/".

Anyone know how I can fix this to create the directory with the correct characters?

装纯掩盖桑 2024-11-10 10:40:22

You have to say:

use utf8;

in your Perl source if you expect those string literals to be interpreted as characters instead of raw bytes. The source file itself must then be saved as UTF-8.

% uname -a
Darwin arwen 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 2010; root:xnu-1504.7.4~1/RELEASE_I386 i386

% cat /tmp/makeit 
use utf8;

$name = "abcüabc";
$path = "/tmp/$name";

mkdir($name,0777) || die "can't mkdir $path: $!";

% perl /tmp/makeit

% ls -dF /tmp/abc*
/tmp/abcüabc/

See? It works just fine if you do it that way.
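
To make the difference concrete, here is a minimal sketch of what "characters instead of bytes" means for this string. It is an illustration, not part of the original answer: the UTF-8 bytes are written as escapes so the file stays pure ASCII, and `utf8::decode` stands in for what `use utf8` does to literals in a UTF-8 source file.

```perl
use strict;
use warnings;

# The UTF-8 encoding of "abc" + U+00FC + "abc", spelled out as explicit bytes.
# This is how Perl sees the literal when the source is UTF-8 but
# `use utf8` is missing: one "character" per byte.
my $raw = "abc\xC3\xBCabc";
my $len_bytes = length $raw;

# Decode a copy in place; this approximates the effect of `use utf8`
# on the same literal.
my $chars = $raw;
utf8::decode($chars) or die "not valid UTF-8";
my $len_chars = length $chars;

printf "as bytes: %d, as characters: %d\n", $len_bytes, $len_chars;
printf "fourth character: U+%04X\n", ord substr($chars, 3, 1);
# prints:
#   as bytes: 8, as characters: 7
#   fourth character: U+00FC
```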


EDIT: You’re using MacRoman!

% macroman 0x9F
MacRoman 0x9F  ⇒  U+00FC  ‹ü›  \N{LATIN SMALL LETTER U WITH DIAERESIS}

And you cannot have the character U+00FC in a filename on this filesystem anyway, because HFS+ stores names in decomposed form: it becomes a "u" followed by "\N{COMBINING DIAERESIS}". Did you actually enter MacRoman characters in your Perl source code? However did you do THAT? Please convert to Unicode!! Perl has no idea that your source code is in legacy MacRoman, so it reads the byte 0x9F as the code point U+009F, which is a control code meaning "\N{APPLICATION PROGRAM COMMAND}".
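
The three readings of that one byte can be checked from Perl itself. The following is a small sketch using the core `Encode` and `Unicode::Normalize` modules; the variable names are made up for illustration, and Latin-1 is used here to model how Perl treats a byte whose encoding was never declared.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

my $byte = "\x9F";    # the byte behind the "%9F" in the folder name

# Read as MacRoman, the byte is the u-umlaut.
my $as_macroman = decode('MacRoman', $byte);

# Read as Latin-1 (what an undeclared byte amounts to), it is a C1 control code.
my $as_latin1 = decode('ISO-8859-1', $byte);

# Canonical decomposition, the form in which the name ends up on disk.
my $decomposed = NFD($as_macroman);

printf "MacRoman: U+%04X\n", ord $as_macroman;
printf "Latin-1:  U+%04X\n", ord $as_latin1;
printf "NFD:      %s\n",
    join ' ', map { sprintf('U+%04X', ord $_) } split //, $decomposed;
# prints:
#   MacRoman: U+00FC
#   Latin-1:  U+009F
#   NFD:      U+0075 U+0308
```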

Here, watch:

% cat /tmp/makeit
use utf8;

$name = "abcüabc";
$path = "/tmp/$name";

mkdir($name,0777) || die "can't mkdir $path: $!";

% uniquote /tmp/makeit
use utf8;

$name = "abc\N{U+FC}abc";
$path = "/tmp/$name";

mkdir($name,0777) || die "can't mkdir $path: $!";

% uniquote -v /tmp/makeit
use utf8;

$name = "abc\N{LATIN SMALL LETTER U WITH DIAERESIS}abc";
$path = "/tmp/$name";

mkdir($name,0777) || die "can't mkdir $path: $!";

% uniquote -b /tmp/makeit
use utf8;

$name = "abc\xC3\xBCabc";
$path = "/tmp/$name";

mkdir($name,0777) || die "can't mkdir $path: $!";

% perl /tmp/makeit

% ls -Fd /tmp/abc* | uniquote -v
/tmp/abcu\N{COMBINING DIAERESIS}abc/

You can grab the uniquote program from here. It will show you what is really in the file. You can also get the macroman script.

You seem to have somehow entered ugly old MacRoman in your Perl code. Please please convert to Unicode!

% iconv -f MacRoman -t UTF-8 < input > output
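
If iconv is not at hand, roughly the same conversion can be done from Perl with I/O encoding layers. This is a sketch only: `macroman_to_utf8` is a hypothetical helper name, and the script assumes the input really is MacRoman.

```perl
use strict;
use warnings;

# Hypothetical helper: copy a MacRoman-encoded text file to a UTF-8 one.
sub macroman_to_utf8 {
    my ($in_path, $out_path) = @_;

    # The layers decode on read and encode on write, so the loop
    # itself only ever handles characters.
    open my $in, '<:encoding(MacRoman)', $in_path
        or die "can't read $in_path: $!";
    open my $out, '>:encoding(UTF-8)', $out_path
        or die "can't write $out_path: $!";

    print {$out} $_ while <$in>;

    close $out or die "can't close $out_path: $!";
    close $in;
    return;
}

# Usage: perl convert.pl input output
macroman_to_utf8(@ARGV) if @ARGV == 2;
```

Writing to a separate output file, as the iconv command does, keeps the original around in case the input turns out not to be MacRoman after all.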