"Incorrect string value" when trying to insert UTF-8 into MySQL from Rails

Posted 2025-01-10 03:54:32

While debugging my Rails App I found the following messages in the log file:

(0.1ms)  ROLLBACK
Completed 500 Internal Server Error in 25ms (ActiveRecord: 4.2ms)
ActiveRecord::StatementInvalid (Mysql2::Error: Incorrect string value: '\xF0\x9F\x98\x89 u...' for column 'description' at row 1: INSERT INTO `course` (`title`, `description`) VALUES ('sometitle', '<p>Description containing 😉 and stuff</p>')

This seems to stem from my database being MySQL with not-quite-utf-8:

CREATE TABLE `course` (
  `id` int NOT NULL AUTO_INCREMENT,
  `title` varchar(250) DEFAULT NULL,
  `description` text,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2080 DEFAULT CHARSET=utf8;

According to the answers to this question, CHARSET=utf8 can only store characters that take up to 3 bytes in UTF-8, not 4-byte characters.

The emoticon 😉 needs four bytes; see \xF0\x9F\x98\x89 in the log file.
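As a quick check of the byte count, here is a minimal plain-Ruby sketch (no Rails needed). It assumes the character in question is U+1F609, which is what the bytes in the log decode to; the euro sign is only there as a 3-byte comparison.

```ruby
# U+1F609 WINKING FACE, written as an escape so the snippet stays ASCII-only
wink = "\u{1F609}"

# Format each UTF-8 byte the way the MySQL error message prints it
hex_bytes = wink.bytes.map { |b| format('\x%02X', b) }.join

puts hex_bytes               # \xF0\x9F\x98\x89
puts wink.bytesize           # 4

# For comparison: the euro sign (U+20AC) fits into three bytes
puts "\u{20AC}".bytesize     # 3
```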

I am wary of converting the whole database. I would rather forbid emoticons and other 4-byte characters, since they are really not necessary on my site.

What is the best way to do this in Rails?

Comments (2)

甜扑 2025-01-17 03:54:32

Building on the regular expressions from these answers, I implemented a validator:

# file /lib/three_byte_validator.rb
# Forbid characters that need more than three bytes in UTF-8
# (code points above U+FFFF, which includes most emoji).
class ThreeByteValidator < ActiveModel::EachValidator
  def validate_each(record, attribute, value)
    # to_s guards against nil and non-string values
    if value.to_s.match?(/[\u{10000}-\u{10FFFF}]/)
      # Default message is German: "No emoticons, no 4-byte UTF-8 characters"
      record.errors.add attribute, (options[:message] || 'Keine Emoticons, keine UTF-8 Zeichen mit 4 Byte')
    end
  end
end
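To see what the regular expression in the validator accepts and rejects, here is a small plain-Ruby sketch that can be run outside Rails. The constant and method names (`FOUR_BYTE_CHARS`, `four_byte_chars?`) are made up for this illustration and are not part of the answer's code.

```ruby
# Same character class as in the validator: every code point above U+FFFF,
# which is exactly the range that UTF-8 encodes with four bytes.
FOUR_BYTE_CHARS = /[\u{10000}-\u{10FFFF}]/

# Hypothetical helper for illustration; to_s makes nil behave like "".
def four_byte_chars?(text)
  text.to_s.match?(FOUR_BYTE_CHARS)
end

puts four_byte_chars?("plain ASCII")          # false
puts four_byte_chars?("caf\u{E9} \u{20AC}")   # false (2- and 3-byte characters)
puts four_byte_chars?("wink \u{1F609}")       # true
puts four_byte_chars?(nil)                    # false
```

Accented letters and most non-Latin scripts still pass, because they sit at or below U+FFFF; only supplementary-plane characters such as emoji are rejected.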

Now I can use this validator on the model that first had the problem:

class Course < ApplicationRecord
  validates :title, length: { in: 3..100 }, three_byte: true
  validates :description, length: { minimum: 50 }, three_byte: true
end

and also on other models:

class Person < ApplicationRecord
  validates :firstname, :lastname, :country, :city, :address, three_byte: true
end

樱娆 2025-01-17 03:54:32

In MySQL, the Winking Face (and most other Emoji) needs utf8mb4 instead of utf8.
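If converting does become an option, MySQL can convert one table at a time with `ALTER TABLE ... CONVERT TO CHARACTER SET`. The sketch below only builds that statement as a string. The helper name and the `utf8mb4_unicode_ci` collation are assumptions for illustration. In a Rails migration the string could presumably be passed to `execute`, and the `encoding` setting in `config/database.yml` would also have to be `utf8mb4` for the connection to send 4-byte characters.

```ruby
# Hypothetical helper: builds the MySQL statement that converts one table
# (its default charset and its character columns) to utf8mb4.
# The collation is an assumption; pick whatever suits the application.
def utf8mb4_conversion_sql(table, collation: "utf8mb4_unicode_ci")
  "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET utf8mb4 COLLATE #{collation}"
end

puts utf8mb4_conversion_sql("course")
# ALTER TABLE `course` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
```

It is worth trying this on a copy of the data first, since the conversion rewrites the table.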
