Decoding a Cyrillic string in Perl
I've got a problem with the following string:
$str="this is \321\213\321\213\321\213\321\213\321\213 \321\201\320\277\320\260\321\200\321\202\320\260\321\200";
This string is located in an ASCII text file and I want to store it in a MySQL database (utf8). \321\231 ... are Cyrillic symbols.
What can I do to make \321\213 appear as Cyrillic characters in the MySQL database?
This should be described in RFC 2047, and it looks like a UTF-7 to UTF-8 conversion, but I don't know exactly.
It's a "unicode escape".
Working variant:
use Encode::Escape;
$var1='\321\213';
print decode 'unicode-escape', $var1;
# displays correctly in phpMyAdmin
$dbh = DBI->connect('DBI:mysql:database=test', 'testuser', 'testpass', { mysql_enable_utf8 => 1});
Comments (3)
This is not quoted-printable at all. This is the Perl quoted-string representation, also known as PERLQQ, of a series of octets. The numbers are octal. These bytes encode UTF-8 for the most part, but the data contain two errors. It looks like one half of a character fell off in each case. I have marked them with arrows just below.
This is invalid in UTF-8, but it can be repaired. We put in the Unicode replacement character.
This character string can now simply be inserted into the database as usual. The DSN in the connect call for DBI or DBIx::Class must include the attribute mysql_enable_utf8.
You need to convert the codes to characters explicitly. For that you need to know what the input encoding is. I suppose it's ISO-8859-5, but it could be Windows-1252 or something else.
I've just seen that your source string is indeed QP, so you need to convert from QP to bytes; that's easy, simply use MIME::QuotedPrint:
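As a rough sketch of the quoted-printable route, assuming the input really were QP (the `=D1=8B` and `=EB` inputs here are hypothetical examples, not the question's data) and assuming the encoding of the decoded bytes is known:

```perl
use strict;
use warnings;
use MIME::QuotedPrint qw(decode_qp);
use Encode qw(decode);

# Hypothetical QP input: "=D1=8B" is how QP would write the octets D1 8B.
my $qp_utf8 = 'this is =D1=8B=D1=8B';

# Step 1: QP text -> raw bytes.
my $bytes = decode_qp($qp_utf8);

# Step 2: raw bytes -> characters, naming the encoding explicitly.
my $text = decode('UTF-8', $bytes);

# The same letter arriving as a single ISO-8859-5 byte (hypothetical input).
my $text_8859_5 = decode('iso-8859-5', decode_qp('=EB'));
```

The second step is where the guess about the input encoding matters: the same bytes decoded under a different encoding name give different characters.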
The problem is that Perl does not know the string is UTF-8, so you must turn the flag on explicitly.
Encode::_utf8_on($str);
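A small sketch of what that call does, assuming `$bytes` already holds real UTF-8 octets (double-quoted here so Perl interprets the octal escapes). Encode documents `_utf8_on` as an internal function that performs no validity check, so the sketch also shows `Encode::decode` as a checked alternative that gives the same result for well-formed input:

```perl
use strict;
use warnings;
use Encode ();

# Two octets, D1 8B: the UTF-8 encoding of one Cyrillic letter.
my $bytes = "\321\213";

# Flip the internal flag on a copy; the octets are not validated.
my $flagged = $bytes;
Encode::_utf8_on($flagged);

# Decode a second copy; malformed input would be substituted, not trusted.
my $decoded = Encode::decode('UTF-8', $bytes);

printf "bytes: %d, flagged: %d, decoded: %d\n",
    length($bytes), length($flagged), length($decoded);
```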