MySQL 复制中的字符集

发布于 2024-09-05 16:40:09 字数 2162 浏览 5 评论 0原文

我该如何确保复制使用 latin1 而不是 utf-8？

我正在 Linux 系统上的 MySQL 5.1.22 服务器（主服务器）和 FreeBSD 系统上的 MySQL 5.1.42 服务器（从服务器）之间迁移。我的复制效果很好，但是当我的 varchar 中包含非 ASCII 字符时，它们会变得“奇怪”。 Linux/MySQL-5.1.22 显示以下字符集变量：

character_set_client=latin1
character_set_connection=latin1
character_set_database=latin1
character_set_filesystem=binary
character_set_results=latin1
character_set_server=latin1
character_set_system=utf8
character_sets_dir=/usr/share/mysql/charsets/
collation_connection=latin1_swedish_ci
collation_database=latin1_swedish_ci
collation_server=latin1_swedish_ci

而 FreeBSD 显示

character_set_client=utf8
character_set_connection=utf8
character_set_database=utf8
character_set_filesystem=binary
character_set_results=utf8
character_set_server=utf8
character_set_system=utf8
character_sets_dir=/usr/local/share/mysql/charsets/
collation_connection=utf8_general_ci
collation_database=utf8_general_ci
collation_server=utf8_general_ci

从 MySQL CLI 设置任何这些变量都无效，在 my.cnf 或命令行中设置它们会使服务器无法启动。

当然，两台服务器都以相同的方式创建有问题的表，在本例中使用 DEFAULT CHARSET=latin1。让我给你举个例子：

CREATE TABLE `test` (
  `test` varchar(5) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1

当我在主服务器上，在 Latin1 终端上执行“INSERT INTO test VALUES ('æøå')”时，当我从基于 Latin1 的终端上选择它时，这会在从服务器上发生

+--------+
| test   |
+--------+
| Ã¦Ã¸Ã¥    |
+--------+

。在复制从站上基于 8.8 的终端，测试包含：

+--------+
| test   |
+--------+
| æøå    |
+--------+

所以我的结论是它被转换为 utf8，即使表定义是 latin1。这是一个正确的结论吗？

当然，在master上，在latin1终端中，它仍然说：

+------+
| test |
+------+
| æøå  | 
+------+

由于两个系统字符集都是utf-8，如果我将两个终端都设置为utf-8并再次执行“INSERT INTO test VALUES ('æøå')”带有utf-8终端的主站，在带有utf-8的从站上我得到：

+------------+
| test       |
+------------+
| Ã¦Ã¸Ã      |
+------------+

如果我的结论是正确的，那么我所有复制的数据都将转换为utf8（如果是utf8，则将其视为latin1并转换为utf8），而表中的所有旧数据，如 CREATE TABLE 所示，都是 latin1。如果不是因为遗留应用程序依赖于 latin1，我很想将其全部转换为 utf-8，所以我需要在它们仍然存在时将其保留为 latin1。

我该如何确保复制读取 latin1、将其视为 latin1 并将其作为 latin1 写入从属设备？

干杯

尼克

原文

What can I do to ensure that replication will use latin1 instead of utf-8?

I'm migrating between an MySQL 5.1.22 server (master) on a Linux system and a MySQL 5.1.42 server (slave) on a FreeBSD system. My replication works well, but when non-ascii characters are in my varchars, they turn "weird". The Linux/MySQL-5.1.22 shows the following character set variables:

character_set_client=latin1
character_set_connection=latin1
character_set_database=latin1
character_set_filesystem=binary
character_set_results=latin1
character_set_server=latin1
character_set_system=utf8
character_sets_dir=/usr/share/mysql/charsets/
collation_connection=latin1_swedish_ci
collation_database=latin1_swedish_ci
collation_server=latin1_swedish_ci

While the FreeBSD shows

character_set_client=utf8
character_set_connection=utf8
character_set_database=utf8
character_set_filesystem=binary
character_set_results=utf8
character_set_server=utf8
character_set_system=utf8
character_sets_dir=/usr/local/share/mysql/charsets/
collation_connection=utf8_general_ci
collation_database=utf8_general_ci
collation_server=utf8_general_ci

Setting any of these variables from the MySQL CLI has no effect, and setting them in my.cnf or at the command line makes the server not start.

Of course, both servers have the tables in question created the same way, in this case with DEFAULT CHARSET=latin1. Let me give you an example:

CREATE TABLE `test` (
  `test` varchar(5) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1

When I on the master do, in a Latin1 terminal, "INSERT INTO test VALUES ('æøå')", this becomes on the slave, when I select it from a Latin1 based terminal

+--------+
| test   |
+--------+
| Ã¦Ã¸Ã¥    |
+--------+

On a UTF-8 based terminal on the replication slave, test contains:

+--------+
| test   |
+--------+
| æøå    |
+--------+

So my conclusion is that it is converted to utf8, even though the table definition is latin1. Is this a correct conclusion?

Of course, on the master, in a latin1 terminal, it still says:

+------+
| test |
+------+
| æøå  | 
+------+

Since both system character sets are utf-8, if I set both terminals to utf-8 and do again "INSERT INTO test VALUES ('æøå')" on the master with a utf-8 terminal, on the slave with utf-8 I get:

+------------+
| test       |
+------------+
| Ã¦Ã¸Ã      |
+------------+

If my conclusion is correct, all my replicated data is converted to utf8 (if it is utf8, it is treated as latin1 and converted to utf8), while all the old data in the table is, as the CREATE TABLE suggests, latin1. I'd love to convert it all to utf-8 if it weren't for the fact that legacy applications rely on it being latin1, so I need to keep it in latin1 while they still exist.

What can I do to ensure that the replication reads latin1, treats it as latin1 and writes it on the slave as latin1?

Cheers

Nik

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

花桑 2024-09-12 16:40:09

不支持全局character_set_% 和 collation% 参数不同的服务器之间的复制。

http://dev.mysql.com/doc/ refman/5.6/en/replication-features-charset.html

-- on both servers check the output of...
SHOW VARIABLES LIKE 'char%';
SHOW VARIABLES LIKE 'collat%';

如果设置 & ，不仅复制会失败。排序规则不同，但它可能会导致不同的排序顺序以及转换集/排序规则期间的字符丢失。如果使用基于语句的复制，排序顺序可能会影响插入/更新等操作。

您最好将新服务器配置为使用与旧服务器相同的集和排序规则。这将确保复制正常进行。您还需要确保数据库、表和列在主从服务器之间都具有相同的排序规则。一旦迁移到新服务器，您可以修改设置和设置。与 5.6 在线模式更改或 percona 工具包中的 pt-online-schema-change 等工具进行排序。

我还建议运行 percona 的 pt-table-checksum 以确保您的表在复制或初始导出/导入期间没有分歧。

有关差异影响的更多信息，请参阅此处：

对于任何使用 Amazon RDS 的人，请记住默认 mysql 5.6 设置使用混合 utf8(mb3) 和 latin1（用于服务器和数据库）。如果从非 RDS 到 RDS 或从 RDS 进行复制（匹配源/目标服务器），您应该使用自定义参数组覆盖这些参数。

replication between servers where global character_set_% and collation% parameters are different isn't supported.

http://dev.mysql.com/doc/refman/5.6/en/replication-features-charset.html

-- on both servers check the output of...
SHOW VARIABLES LIKE 'char%';
SHOW VARIABLES LIKE 'collat%';

not only can replication fail if sets & collations are different, but it can result in different sort orders and character loss during conversion sets/collations. sort order can impact things like insert / update if using statement based replication.

you're best off configuring the new server to use the same sets and collations as the old server. this will ensure replication works properly. you'll also want to make sure that database, tables and columns all have the same collations between master and slave. once you migrate to the new server you can modify set & collation with tools like 5.6 online schema change or pt-online-schema-change from percona toolkit.

i also recommend running percona's pt-table-checksum to make sure your tables haven't diverged during replication or initial export/import.

see here for more information about impact of differences:

to anyone who is using Amazon RDS, keep in mind the default mysql 5.6 settings use mixed utf8(mb3) and latin1 (for server and database). you should override those with a custom parameter group if replication from non-RDS to/from RDS (matching source/destination servers).

回复收藏 0 原文