Special characters (e.g. curly apostrophes) in a MySQL database are breaking my XML
I have a MySQL database of newspaper articles. There's a volume table, an issue table, and an article table. I have a PHP file that generates a property list that is then pulled in and read by an iPhone app. The plist holds each article as a dictionary inside each issue, and each issue as a dictionary inside each volume. The plist doesn't actually hold the whole article -- just a title and URL.
Some article titles contain special characters, like curly apostrophes. Looking at the generated XML plist, whenever it hits a special character, it unpredictably gobbles up a whole bunch of text, leaving the XML mangled and unreadable.
(...in Chrome, anyway, and I'm guessing on the iPhone. Firefox actually handles it pretty well, showing a white ? in a black diamond in place of any special characters and not gobbling anything.)
Example well-formed plist snippet:
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Rows</key>
<array>
<dict>
<key>Title</key>
<string>Vol. 133 (2003-2004)</string>
<key>Children</key>
<array>
<dict>
<key>Title</key>
<string>No. 18 (Apr 2, 2004)</string>
<key>Children</key>
<array>
<dict>
<key>Title</key>
<string>Basketball concludes historic season</string>
<key>URL</key>
<string>http://orient.bowdoin.edu/orient/article_iphone.php?date=2004-04-02&section=1&id=1</string>
</dict>
<!-- ... -->
</array>
</dict>
</array>
</dict>
</array>
</dict>
</plist>
Example of what happens when it hits a curly apostrophe (this is from Chrome). This time it ate 5,998 characters, by MS Word's count, skipping ahead to midway through the opening <string> tag of a pizza story's title; if I reload, it behaves differently and eats some other amount. The proper title is: Singer-songwriter Farrell ’05 finds success beyond the bubble
<dict>
<key>Title</key>
<string>Singer-songwriter Farrell ing>Students embrace free pizza, College objects to solicitation</string>
<key>URL</key>
<string>http://orient.bowdoin.edu/orient/article_iphone.php?date=2009-09-18&section=1&id=9</string>
</dict>
In MySQL that title is stored as (in binary):
53 69 6E 67 |65 72 2D 73 |6F 6E 67 77 |72 69 74 65
72 20 46 61 |72 72 65 6C |6C 20 C2 92 |30 35 20 66
69 6E 64 73 |20 73 75 63 |63 65 73 73 |20 62 65 79
6F 6E 64 20 |74 68 65 20 |62 75 62 62 |6C
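A note on those bytes: C2 92 is the UTF-8 encoding of U+0092, a C1 control character, whereas a curly apostrophe (U+2019) would be E2 80 99 in UTF-8. That pattern typically shows up when Windows-1252 text (where the single byte 0x92 is the curly apostrophe) was treated as Latin-1 and then converted to UTF-8. The following is a minimal sketch, assuming that is what happened here; `repairCp1252Punctuation` is a hypothetical helper name and its map covers only a handful of common punctuation marks.

```php
<?php
// Hex dump copied from the question (it is truncated after "bubbl").
$hex = '53 69 6E 67 65 72 2D 73 6F 6E 67 77 72 69 74 65 '
     . '72 20 46 61 72 72 65 6C 6C 20 C2 92 30 35 20 66 '
     . '69 6E 64 73 20 73 75 63 63 65 73 73 20 62 65 79 '
     . '6F 6E 64 20 74 68 65 20 62 75 62 62 6C';
$stored = hex2bin(str_replace(' ', '', $hex));

/**
 * Hypothetical helper: replaces UTF-8 encoded C1 control characters
 * (U+0091..U+0097) with the punctuation that the same byte values
 * represent in Windows-1252. Assumes the input is otherwise valid UTF-8.
 */
function repairCp1252Punctuation(string $utf8): string
{
    static $map = [
        "\xC2\x91" => "\xE2\x80\x98", // U+2018 left single quotation mark
        "\xC2\x92" => "\xE2\x80\x99", // U+2019 right single quotation mark
        "\xC2\x93" => "\xE2\x80\x9C", // U+201C left double quotation mark
        "\xC2\x94" => "\xE2\x80\x9D", // U+201D right double quotation mark
        "\xC2\x96" => "\xE2\x80\x93", // U+2013 en dash
        "\xC2\x97" => "\xE2\x80\x94", // U+2014 em dash
    ];
    return strtr($utf8, $map);
}

$fixed = repairCp1252Punctuation($stored);
echo bin2hex(substr($stored, 26, 2)), "\n"; // bytes before the repair
echo bin2hex(substr($fixed, 26, 3)), "\n";  // bytes after the repair
```

A repair like this only addresses the stored data. The PHP script's database connection character set and the encoding declared for the generated plist would still need to agree (for example, both UTF-8) for the apostrophe to survive the trip to the client.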
Any ideas how I can encode/decode things properly? If not, any idea how I can get around the problem some other way?
I don't have a clue what I'm talking about, haha; let me know if there's any way I can help you help me. :) And many thanks!
Comments (1)
Here are a few options:
Use htmlentities() to encode special characters when inserting into the table.
Try using CDATA around the titles, i.e.:
<string><![CDATA[ BLAH BLAH BLAH ]]></string>
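One caveat worth checking with the first option: htmlentities() emits HTML named entities such as &rsquo;, and XML predefines only &amp;, &lt;, &gt;, &quot; and &apos;, so a plist parser may reject the others. The sketch below uses htmlspecialchars() with ENT_XML1 instead, alongside a CDATA variant. Both function names are hypothetical, and both assume the title is already valid UTF-8.

```php
<?php
/**
 * Hypothetical helper: wraps text in a plist <string> element,
 * escaping only the characters XML treats as markup (&, <, >).
 * ENT_SUBSTITUTE swaps invalid byte sequences for U+FFFD instead of
 * returning an empty string.
 */
function plistString(string $text): string
{
    $flags = ENT_XML1 | ENT_NOQUOTES | ENT_SUBSTITUTE;
    return '<string>' . htmlspecialchars($text, $flags, 'UTF-8') . '</string>';
}

/**
 * Hypothetical helper: the CDATA variant from the answer. The sequence
 * "]]>" cannot appear inside a CDATA section, so it is split across two.
 */
function plistCdataString(string $text): string
{
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $text);
    return '<string><![CDATA[' . $safe . ']]></string>';
}

echo plistString("Singer-songwriter Farrell \u{2019}05 finds success beyond the bubble"), "\n";
echo plistString('http://orient.bowdoin.edu/orient/article_iphone.php?date=2009-09-18&section=1&id=9'), "\n";
echo plistCdataString('BLAH BLAH BLAH'), "\n";
```

Neither approach repairs bytes that are invalid in the document's declared encoding: CDATA only stops the parser from reading the content as markup. Escaping also takes care of the raw & characters in the article URLs, which would otherwise need to be written as &amp; for the plist to be well-formed.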