Drupal encoding and node inserts

Posted on 2024-10-18 08:40:50

I have a CCK type for storing mentions (social media search mentions). I believe some of the mentions are ASCII (my knowledge of this stuff is limited).

I retrieve data from APIs, which I then save to Drupal using node_save.

My question is, what should I use to safely convert whatever I am getting into a format Drupal and MySQL are happy with?

The particular db_query error I get is the unhelpful "Warning in test1\includes\common.inc on line 3538". Nice. I have traced it to encoding, as I used the following code to make the input safe, but it does not work with all input.

$node->title = htmlentities($item['title'], ENT_COMPAT, 'UTF-8');

It worked well for some ASCII characters, like those square ones [] etc., but not for this: "行けなくてもずっとユーミンは聴きつづけます".

I'm really stuck. :(

UPDATE: The EXACT error I get from PHP is "Warning in D:\sites\test1\includes\common.inc on line 3538", and the line reads "if (db_query($query, $values)) {".

UPDATE 2: I've confirmed that the encoding of the data I am receiving is UTF-8. This really doesn't make sense now, and I've confirmed that the collation in the db is utf8_general_ci.

UPDATE 3: One of the titles is: How Much Does A Facebook Fan Cost?� $1.07

The output of:

var_export(array_map('ord', str_split($node->title)));

gave me the character 160 for the funny question mark (which is a square like [] in eclipse).

UPDATE 4: MySQL version is 5.1.41, and the collation on the columns is utf8_general_ci.

UPDATE 5: I managed to get Drupal to print the query with db_queryd. The funny thing is that I now get the exact error message rather than "Warning in", but Drupal still doesn't record this error in the log! WTF. So the exact SQL is:

INSERT INTO node (vid, type, language, title, uid, status, created, changed, comment, promote, moderate, sticky, tnid, translate) VALUES (0, 'sm_mention', '', 'How Much Does A Facebook Fan Cost?� $1.07 (Geoffrey A. Fowler/Digits)', 1, 1, 1298395302, 1298395302, 0, 0, 0, 0, 0, 0)

And the error given is: Incorrect string value: '\xA0 $1.0...' for column 'title' at row 1

This honestly sounds like something doesn't like extended ASCII characters.

UPDATE 6:

SHOW CREATE TABLE node:

CREATE TABLE `node` (
  `nid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `vid` int(10) unsigned NOT NULL DEFAULT '0',
  `type` varchar(32) NOT NULL DEFAULT '',
  `language` varchar(12) NOT NULL DEFAULT '',
  `title` varchar(255) NOT NULL DEFAULT '',
  `uid` int(11) NOT NULL DEFAULT '0',
  `status` int(11) NOT NULL DEFAULT '1',
  `created` int(11) NOT NULL DEFAULT '0',
  `changed` int(11) NOT NULL DEFAULT '0',
  `comment` int(11) NOT NULL DEFAULT '0',
  `promote` int(11) NOT NULL DEFAULT '0',
  `moderate` int(11) NOT NULL DEFAULT '0',
  `sticky` int(11) NOT NULL DEFAULT '0',
  `tnid` int(10) unsigned NOT NULL DEFAULT '0',
  `translate` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`nid`),
  UNIQUE KEY `vid` (`vid`),
  KEY `node_changed` (`changed`),
  KEY `node_created` (`created`),
  KEY `node_moderate` (`moderate`),
  KEY `node_promote_status` (`promote`,`status`),
  KEY `node_status_type` (`status`,`type`,`nid`),
  KEY `node_title_type` (`title`,`type`(4)),
  KEY `node_type` (`type`(4)),
  KEY `uid` (`uid`),
  KEY `tnid` (`tnid`),
  KEY `translate` (`translate`)
) ENGINE=InnoDB AUTO_INCREMENT=1700 DEFAULT CHARSET=utf8


Comments (1)

蓝礼 2024-10-25 08:40:50

\xA0 is not a valid start of a UTF-8 sequence (on its own it is a continuation byte).

The character known as NO-BREAK SPACE, Unicode code point U+00A0, should be encoded as the two bytes 0xC2 0xA0 in UTF-8.

So your input string is broken: it is not valid UTF-8.
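
One way to act on this is to check each string before handing it to node_save and re-encode only the ones that fail. The sketch below is a minimal illustration, not a definitive fix. The function name sm_mention_to_utf8 is made up for this example, and the fallback encoding ISO-8859-1 is an assumption about what the API is really sending (a lone 0xA0 byte is a no-break space in that encoding). It uses only preg_match with the /u modifier, which fails on invalid UTF-8, and iconv. Drupal's own drupal_validate_utf8() performs a similar validity check, if I recall correctly.

```php
<?php
/**
 * Return $text as valid UTF-8.
 *
 * Hypothetical helper for illustration. Assumption: any input that is not
 * already valid UTF-8 is ISO-8859-1, which may not hold for every feed.
 */
function sm_mention_to_utf8($text, $fallback = 'ISO-8859-1') {
  // With the /u modifier, preg_match() returns 1 for valid UTF-8
  // and FALSE when the subject contains an invalid byte sequence.
  if (preg_match('//u', $text) === 1) {
    return $text;
  }
  // Not valid UTF-8: reinterpret the bytes in the assumed source encoding.
  return iconv($fallback, 'UTF-8', $text);
}

// A lone 0xA0 byte, as in the failing title, becomes the two-byte
// UTF-8 form of NO-BREAK SPACE.
echo bin2hex(sm_mention_to_utf8("\xA0")), "\n"; // prints "c2a0"
```

In the import code this would be called as $node->title = sm_mention_to_utf8($item['title']); before node_save. Text that is already valid UTF-8, such as the Japanese title, passes through unchanged.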
