Unicode (hexadecimal) character literals in MySQL

Posted 2024-10-04 12:03:42

Is there a way to specify Unicode character literals in MySQL?

I want to replace a Unicode character with an ASCII character, something like the following:

Update MyTbl Set MyFld = Replace(MyFld, "ẏ", "y")

But I'm using even more obscure characters which are not available in most fonts, so I want to be able to use Unicode character literals, something like

Update MyTbl Set MyFld = Replace(MyFld, "\u1e8f", "y")

This SQL statement is being invoked from a PHP script - the first form is not only unreadable, but it doesn't actually work!

Comments (5)

岛徒 2024-10-11 12:03:42

You can specify hexadecimal literals (or even binary literals) using 0x, x'', or X'':

select 0xC2A2;
select x'C2A2';
select X'C2A2';

But be aware that the result is a binary string, so every byte counts as one character. You can verify this with char_length:

select char_length(0xC2A2)

2

If you want UTF-8 strings instead, you need to use convert:

select convert(0xC2A2 using utf8mb4)

And we can see that C2 A2 is considered 1 character in UTF-8:

select char_length(convert(0xC2A2 using utf8mb4))

1


On the server this was tested on, convert also discards invalid bytes instead of failing (other server versions may behave differently):

select char_length(convert(0xC1A2 using utf8mb4))

0

The output is 0 because C1 A2 is not a valid UTF-8 byte sequence.
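
Applied to the question, the same technique might look like the sketch below. It assumes the column is stored as utf8mb4 and reuses the table and column names from the question; E1 BA 8F is the UTF-8 encoding of U+1E8F (ẏ). If the column uses a different character set or collation, the statement may need adjusting.

```sql
-- Sanity checks: the three bytes E1 BA 8F decode to exactly one character.
SELECT HEX(CONVERT(0xE1BA8F USING utf8mb4));          -- E1BA8F
SELECT CHAR_LENGTH(CONVERT(0xE1BA8F USING utf8mb4));  -- 1

-- Replace U+1E8F with a plain ASCII 'y'.
-- MyTbl and MyFld are the names used in the question; utf8mb4 is an assumption.
UPDATE MyTbl
SET MyFld = REPLACE(MyFld, CONVERT(0xE1BA8F USING utf8mb4), 'y');
```

Because the statement contains only ASCII characters, it no longer depends on how the PHP source file or the editor's font handles the character.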

遗失的美好 2024-10-11 12:03:42

Thanks for your suggestions, but I think the problem was further back in the system.

There are a lot of layers to unpick, but as far as I can tell (on this server at least), the command

set names utf8

makes the utf-8 handling work correctly, whereas

set character set utf8

doesn't.

In my environment, these are being called from PHP using PDO, in case that makes a difference.

Thanks anyway!
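
One way to see why the two statements behave differently is to inspect the session's character set variables after each one. According to the MySQL manual, SET NAMES sets character_set_client, character_set_connection, and character_set_results together, whereas SET CHARACTER SET sets the client and results character sets but sets the connection character set to that of the default database. The sketch below uses utf8mb4 rather than utf8, which is an assumption about what the server supports.

```sql
-- After SET NAMES, client, connection, and results should all show utf8mb4.
SET NAMES utf8mb4;
SHOW SESSION VARIABLES LIKE 'character_set_%';

-- After SET CHARACTER SET, character_set_connection follows the default
-- database's character set instead, which may not be a Unicode one.
SET CHARACTER SET utf8mb4;
SHOW SESSION VARIABLES LIKE 'character_set_%';
```

If character_set_connection differs from the other two after the second statement, that would be consistent with the behaviour described above.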

红ご颜醉 2024-10-11 12:03:42

You can use the hex and unhex functions, e.g.:

update mytable set myfield = unhex(replace(hex(myfield),'C383','C3'))
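
One caveat worth checking before running this on real data: replace works on the hex text, not on whole bytes, so a match can straddle a byte boundary. The value below is made up purely to illustrate the effect.

```sql
-- The bytes 1C 38 30 contain no C3 83 pair, but their hex text
-- '1C3830' does contain the substring 'C383' starting at an odd offset.
SELECT HEX(0x1C3830);                         -- 1C3830
SELECT REPLACE(HEX(0x1C3830), 'C383', 'C3');  -- 1C30 (three bytes became two)
```

Previewing the affected rows with a select before issuing the update is a cheap way to spot this kind of accidental match.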

春夜浅 2024-10-11 12:03:42

The MySQL string literal syntax, as specified in the MySQL manual, makes no provision for numeric escape sequences.

However, as you are embedding the SQL in PHP, you can compute the right bytes in PHP. Make sure the bytes you put into the SQL actually match your client character set.
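
If computing the bytes in PHP is inconvenient, a character set introducer is another way to say explicitly which character set a hex literal is in, so the literal does not depend on the client character set. A minimal sketch, assuming utf8mb4 is the character set you want (E1 BA 8F is the UTF-8 encoding of U+1E8F, and 0x61 is the letter a):

```sql
-- The introducer tells the server to read the bytes as utf8mb4 text.
SELECT _utf8mb4 0xE1BA8F;
SELECT CHAR_LENGTH(_utf8mb4 0xE1BA8F);  -- 1

-- Try the replacement on a two-character test string ('a' followed by U+1E8F)
-- before touching any table.
SELECT REPLACE(_utf8mb4 0x61E1BA8F, _utf8mb4 0xE1BA8F, 'y');
```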

极度宠爱 2024-10-11 12:03:42

There is also the char function, which does what you wanted: you give it byte values and a character set name, and it returns the corresponding character.
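
A minimal sketch of that approach, assuming a utf8mb4 column and the table and column names from the question; the byte values 0xE1, 0xBA, 0x8F are the UTF-8 encoding of U+1E8F. Without the USING clause, char returns a binary string, so the clause matters here.

```sql
-- Build the character from its UTF-8 bytes.
SELECT CHAR(0xE1, 0xBA, 0x8F USING utf8mb4);
SELECT HEX(CHAR(0xE1, 0xBA, 0x8F USING utf8mb4));  -- E1BA8F

-- Preview the replacement without modifying any rows.
-- MyTbl and MyFld are the names used in the question.
SELECT MyFld, REPLACE(MyFld, CHAR(0xE1, 0xBA, 0x8F USING utf8mb4), 'y') AS replaced
FROM MyTbl;
```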
