CakePHP encoding problem: uppercase S with a caron (Š) saves fine in the database but causes errors when processed by Cake
So I am working on a site that stores information about cuneiform tablets. We use Semitic characters for transliteration.
In my script, I create a term list from the transliteration of a tablet.
My problem is that with the Š, my script creates two different terms: because of the way Cake handles the special character, it thinks there is a space in the word.
Example:
Partial contents of a tablet:
- utu-DIŠ-nu-il2
Terms from the tablet when processed by my script:
utu-DIŠ, -nu-il2
It should be:
utu-DIŠ-nu-il2
When I print the contents of my array while the text is being processed, I see this:
- utu-DI� -nu-il2
So the incorrect parsing of the text creates a space, and my script then interprets the result as two words instead of one.
In the database, the text is fine...
I also get these errors:
Warning (512): SQL Error: 1366: Incorrect string value: '\xC5' for column 'term' at row 1 [CORE\cake\libs\model\datasources\dbo_source.php, line 684]
Query: INSERT INTO terms (term, lft, rght) VALUES ('utu-DI�', 449, 450)
Query: INSERT INTO terms (term, lft, rght) VALUES ('A�', 449, 450)
Query: INSERT INTO terms (term, lft, rght) VALUES ('xDI�', 449, 450)
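The '\xC5' in the warning is the first byte of Š, which UTF-8 encodes as the two bytes C5 A0, so the rejected terms appear to end in half a character. A minimal sketch of a check that could run before each INSERT, assuming the mbstring extension is loaded; the helper name is_valid_utf8_term is made up for illustration and is not part of CakePHP:

```php
<?php
// Hypothetical helper: true only if $term is well-formed UTF-8.
function is_valid_utf8_term($term) {
    return mb_check_encoding($term, 'UTF-8');
}

$intact = "utu-DI\xC5\xA0"; // "utu-DIŠ": Š is the byte pair C5 A0
$broken = "utu-DI\xC5";     // second byte lost, as in the failing INSERT

var_dump(is_valid_utf8_term($intact)); // bool(true)
var_dump(is_valid_utf8_term($broken)); // bool(false)
```

A term that fails this check was damaged before it reached the database, which points at the string processing rather than at the table or the connection.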
Does anybody know what I could do to make this work?
Thanks!
Added info:
$terms = $this->data['Tablet']['translit'];
$terms = str_replace(array('\r\n', '\r', '\n', '\n\r', '\t'), ' ', $terms);
$terms = trim($terms, chr(173));
print_r($terms);
$terms = preg_replace('/\s+/', ' ', $terms);
$terms = explode(" ", $terms);
$terms = array_map('trim', $terms);
$anti_terms = array('@tablet','1.','2.','3.','4.','5.','6.','7.','7.','9.','10.','11.','12.','13.','14.','15.','16.','17.','18.','19.','20.','Rev.',
    'Obv.','@tablet','@obverse','@reverse','C1','C2','C3','C4','C5','C6','C7','C8','C9', '\r', '\n','\r\n', '\t',''. ' ', null, chr(173), 'x', '[x]','[...]' );
foreach ($terms as $key => $term) {
    if (in_array($term, $anti_terms) || is_numeric($term)) {
        unset($terms[$key]);
    }
}
If I put my print_r before the preg_replace() call, the Š characters are fine; if I put it after, they display as the black lozenge. So I guess the preg function is the problem!
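One plausible reading, not confirmed in this thread: without the u modifier, preg_replace() works byte by byte, and the second byte of Š (A0) is the no-break space in ISO-8859-1, which a byte-oriented \s may count as whitespace depending on the platform's locale tables. The sketch below does not rely on that behaviour; it imitates it with str_replace() so that the result is the same everywhere:

```php
<?php
// "utu-DIŠ-nu-il2" spelled out byte by byte: Š (U+0160) is C5 A0 in UTF-8.
$word = "utu-DI\xC5\xA0-nu-il2";
echo bin2hex("\xC5\xA0"), "\n"; // c5a0

// Assumption: imitate a byte-oriented \s that counts A0 as whitespace
// by replacing that single byte with an ordinary space.
$damaged = str_replace("\xA0", ' ', $word);

// The word now splits in two, and the first piece ends in an orphaned C5.
$pieces = explode(' ', $damaged);
echo count($pieces), "\n";      // 2
echo bin2hex($pieces[0]), "\n"; // 7574752d4449c5
echo $pieces[1], "\n";          // -nu-il2
```

This reproduces both symptoms: the term is cut into "utu-DI" plus a stray byte and "-nu-il2", and the stray byte is the \xC5 that MySQL rejects.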
I just found this:
http://www.php.net/manual/fr/function.preg-replace.php#84385
But it seems that mb_ereg_replace() causes the same problem as preg_replace()...
Solution:
mb_internal_encoding("UTF-8");
mb_regex_encoding("UTF-8");
$terms = mb_ereg_replace('\s+', ' ', $terms);
and the error is gone!
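A possible alternative to the mb_ereg_replace() fix is to stay with the preg functions and add the u pattern modifier, which makes PCRE treat the pattern and the subject as UTF-8. The sketch below assumes the transliteration is valid UTF-8; extract_terms and the shortened anti-term list are made up for illustration:

```php
<?php
// Hypothetical helper, not part of CakePHP: split a transliteration into terms.
function extract_terms($translit, array $antiTerms) {
    // u = treat pattern and subject as UTF-8;
    // PREG_SPLIT_NO_EMPTY = drop empty pieces at the edges.
    $terms = preg_split('/\s+/u', $translit, -1, PREG_SPLIT_NO_EMPTY);
    if ($terms === false) {
        return array(); // preg_split failed, e.g. the input was not valid UTF-8
    }
    $kept = array();
    foreach ($terms as $term) {
        // Skip markers, line numbers and placeholders.
        if (!in_array($term, $antiTerms, true) && !is_numeric($term)) {
            $kept[] = $term;
        }
    }
    return $kept;
}

$antiTerms = array('@tablet', '@obverse', '1.', '2.', 'x', '[x]', '[...]');
$terms = extract_terms("@tablet\r\n1. utu-DI\xC5\xA0-nu-il2\tx\n2. lugal", $antiTerms);
print_r($terms); // two terms: utu-DIŠ-nu-il2 and lugal
```

Splitting directly on whitespace also replaces the separate str_replace(), explode() and trim() steps. Note that the sample input is written in double quotes, so that "\r\n" and "\t" are real control characters.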