SVN error: Can't convert string from native encoding to 'UTF-8'

Posted on 2024-08-18 09:39:12

I've got a post-commit hook script that performs an SVN update of a working copy when commits are made to the repository.

When users commit to the repository from their Windows machines using TortoiseSVN they get the following error:

post-commit hook failed (exit code 1) with output:
svn: Error converting entry in directory '/home/websites/devel/website/guides/Images' to UTF-8
svn: Can't convert string from native encoding to 'UTF-8':
svn: Teneriffa-S?\195?\188d.jpg

The file in question above is Teneriffa-Süd.jpg; note the u with an umlaut. The site is German, so the files have German names.

When executing an update on the working copy at the Linux command line, no errors are encountered. The above error occurs only when the post-commit hook is executed via a commit by a Windows SVN client.

Questions:

  1. Why would SVN try to change the encoding of a file?
  2. Are filenames allowed to contain chars that are outside the Windows standard ASCII ones?

Update:

It turns out that the filename of the file in question displays correctly as Teneriffa-Süd.jpg when viewed from a Windows machine (via Samba), but when I view the filename on the Linux server where the file resides (using SSH and PuTTY) I get Teneriffa-SÃ¼d.jpg


Comments (11)

感情洁癖 2024-08-25 09:39:12

Yet another example:

$ svn update
svn: Error converting entry in directory '.' to UTF-8
svn: Can't convert string from native encoding to 'UTF-8':

$ export LC_CTYPE=en_US.UTF-8

$ svn update

(... and all is fine now)

勿忘初心 2024-08-25 09:39:12
  1. It does not change the encoding of the file. It changes the encoding of the filename (to something that every client can hopefully understand).
  2. Allowed by whom? NTFS uses 16-bit code units, and Windows can expose the file names in various encodings, depending on how you ask for them (it will try to convert them to the encoding you ask for). How you ask depends on the specific svn client you use. It sounds to me like a bug in TortoiseSVN.

Edit to add:

Ugh. I misunderstood the symptoms. The svn server stores everything in UTF-8 (and it seems that it did that successfully).

The post-commit hook is the part that fails to convert from UTF-8. If I understand you correctly, the post-commit hook on the server triggers an svn update to a shared drive (so the svn server starts an svn client against itself)? This means that the configuration that needs to be fixed is that of the client on the server.
Check LANG / LC_ALL in the environment that runs the svn server. As it happens, hooks are run in an empty environment (see Tip), so you should set the variable in the hook itself.

See also this page for info on how svn handles localisation
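
A minimal sketch of setting the locale inside the hook itself, as suggested above. The locale name en_US.UTF-8 is an assumption (any UTF-8 locale installed on the server should do), and the function names and the working copy path in the comments are hypothetical, not taken from the question.

```shell
#!/bin/sh
# Hypothetical post-commit hook sketch. Hooks start with an almost empty
# environment, so the locale is exported here rather than inherited.

# Export a UTF-8 locale for svn. en_US.UTF-8 is an assumed default and
# must be a locale that is actually installed on the server.
set_hook_locale() {
    LC_ALL="${1:-en_US.UTF-8}"
    export LC_ALL
}

# Update the working copy whose path is given as $1.
update_working_copy() {
    svn update --non-interactive "$1"
}

# In a real hook one would then call, for example:
#   set_hook_locale en_US.UTF-8
#   update_working_copy /path/to/working-copy
```

Keeping the export in the hook means the result does not depend on how the svn server process happened to be started.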

滥情哥ㄟ 2024-08-25 09:39:12

If the error is:

[abc@288832-web3 public_html]$ svn update
svn: Error converting entry in directory 'images' to UTF-8
svn: Valid UTF-8 data
(hex: 46 65 6e 65 72 62 61 68)
followed by invalid UTF-8 sequence
(hex: e7 65 2b 46)

then do this:

[abc@288832-web3 public_html]$ printf "\x46\x65\x6e\x65\x72\x62\x61\x68\n"
Fenerbah  

(This means that the system has some file name starting with "Fenerbah" in that folder.)

[abc@288832-web3 public_html]$ cd  images
[abc@288832-web3 images]$ rm -rf Fenerbahçe+Forma+2.jpg

So you can see that the name contains a special character (ç, stored here as the single byte e7 rather than as UTF-8), and SVN cannot convert it under the current locale.
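
The bytes from the error can also be checked directly. This is a minimal sketch, assuming the stray byte e7 came from a client that wrote the name in ISO-8859-1 (the thread does not confirm this); the function names are made up for illustration. It tests whether a name is valid UTF-8 and re-encodes it, so the file could be renamed rather than deleted.

```shell
#!/bin/sh
# Succeed only if the string given as $1 is valid UTF-8.
# iconv exits non-zero when it meets an invalid input sequence.
is_valid_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Re-encode a name to UTF-8, assuming (hypothetically) that it was written
# in ISO-8859-1. Adjust the source encoding to whatever the client used.
latin1_to_utf8() {
    printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8
}

# The name from the error: "Fenerbah" followed by the lone byte e7
# (octal 347), which is c-cedilla in ISO-8859-1 but not valid UTF-8.
bad_name=$(printf 'Fenerbah\347e+Forma+2.jpg')

if ! is_valid_utf8 "$bad_name"; then
    fixed_name=$(latin1_to_utf8 "$bad_name")
    # In a real working copy one could now rename instead of deleting:
    #   mv -- "$bad_name" "$fixed_name"
    printf '%s\n' "$fixed_name"
fi
```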

旧人 2024-08-25 09:39:12

Don't forget to generate those locales on your system (as root).

Example for Russian:

locale-gen ru_RU.CP1251
locale-gen ru_RU.UTF-8
dpkg-reconfigure locales
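
Before generating anything, it can help to check whether the locale is already present. A small sketch, assuming the list comes from `locale -a`; the function names are hypothetical. Names are normalised first because `locale -a` commonly prints `ru_RU.utf8` where a script says `ru_RU.UTF-8`.

```shell
#!/bin/sh
# Normalise a locale name so "en_US.UTF-8" and "en_US.utf8" compare equal:
# lower-case everything and drop hyphens.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-'
}

# Read a list of locale names on stdin (as printed by `locale -a`) and
# succeed if the locale named in $1 is among them.
locale_listed() {
    wanted=$(normalize_locale "$1")
    while IFS= read -r name; do
        [ "$(normalize_locale "$name")" = "$wanted" ] && return 0
    done
    return 1
}

# Typical use on the server:
#   locale -a | locale_listed ru_RU.UTF-8 || echo "locale not generated yet"
```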

岁月流歌 2024-08-25 09:39:12

Put this in your post-commit hook, replacing xxxxx with your locale:
export LANG=xxxxx

温柔一刀 2024-08-25 09:39:12

Just use the following line in your script before executing any svn command.
Use the appropriate language code; in the following example I used Japanese:

export LC_ALL=ja_JP.UTF8

猫七 2024-08-25 09:39:12

It seems that all LC_ variables need .UTF-8 at the end. For example, I happened to have LC_ALL, LC_TIME, and LC_CTYPE defined. After setting LC_CTYPE the problem was not solved, so I needed to set LC_ALL as well, and then it worked:

LC_ALL=en_US.UTF-8
LC_TIME=en_DK.UTF-8
LC_CTYPE=en_US.UTF-8

To avoid the problem in future, I copied the file to a different name, removed the old one from svn, added the new one to svn, and sent a message to a collaborator not to do this.
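
The behaviour described above matches the usual precedence of the locale variables: for the character encoding, LC_ALL overrides LC_CTYPE, which overrides LANG. That is why changing LC_CTYPE alone has no effect while LC_ALL is set. A minimal sketch of that rule, with a hypothetical function name:

```shell
#!/bin/sh
# Print the value that decides the character encoding (the LC_CTYPE
# category): LC_ALL wins if non-empty, then LC_CTYPE, then LANG, else "C".
effective_ctype() {
    # $1 = LC_ALL, $2 = LC_CTYPE, $3 = LANG
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"
    else
        printf 'C\n'
    fi
}

# Show what the current shell would hand to svn.
effective_ctype "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"
```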

噩梦成真你也成魔 2024-08-25 09:39:12
  1. It changes the encoding of the filename to a locale-neutral encoding (UTF-8) in case someone using a different encoding checks it out.

  2. Of course. But it's not "Windows" ASCII (Windows actually uses legacy code pages such as CP1251, depending on the system locale).

The best way to fix this is to make sure that your system uses UTF-8 whenever possible (check $LANG).
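
A quick way to check $LANG is to look at the character set suffix of the locale name. The sketch below only inspects the string and does not verify that the locale is installed; the function name is made up for illustration.

```shell
#!/bin/sh
# Succeed if a locale name such as "de_DE.UTF-8" or "ja_JP.utf8" names a
# UTF-8 character set. This only inspects the string, not the system.
is_utf8_locale_name() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *.utf-8|*.utf8|*.utf-8@*|*.utf8@*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_locale_name "${LANG:-}"; then
    echo "LANG looks like a UTF-8 locale"
else
    echo "LANG is not a UTF-8 locale: ${LANG:-unset}"
fi
```

On a live system, `locale charmap` reports the character set actually in effect, which also accounts for LC_ALL and LC_CTYPE.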

无声无音无过去 2024-08-25 09:39:12

I got a similar problem when running "svn add" on a directory, but the solution was different. I couldn't see the "hex" digits using printf (actually no hex output was shown by svn), but this command allowed me to see the results, and fix it:

LC_ALL=C svn add probealign

I think, in general, sticking LC_ALL=C before your command allows you to see the offending files... and is a lot easier than pasting in a lot of \x72 stuff (which apparently may not be available).

兮子 2024-08-25 09:39:12

For information, I got this error ("Can't convert string from native encoding to 'UTF-8'") on commit with the TortoiseSVN Windows client when my repository URL was:

http://x.x.x.x/svn/myrepos

I changed my repository URL to:

svn://x.x.x.x/myrepos

and now everything works perfectly.

I think this information will be useful to some.

厌味 2024-08-25 09:39:12

In my case, I had the setting in ~/.subversion/config as below

log-encoding = ...

Commenting it out worked.
