简单的RSS编码问题
考虑以下 PHP 代码,用于在我正在开发的网站上获取 RSS 新闻:
<?php
$url = "http://dariknews.bg/rss.php";
$xml = simplexml_load_file($url);
$feed_title = $xml->channel->title;
$feed_description = $xml->channel->description;
$feed_link = $xml->channel->link;
$item = $xml->channel->item;
function getTheData($item){
for ($i = 0; $i < 4; $i++) {
$article_title = $item[$i]->title;
$article_description = $item[$i]->description;
$article_link = $item[$i]->link;
echo "<p><h3><a href=".$article_link.">". $article_title. "</a></h3></p><small>".$article_description."</small><p>";
}
}
?>
该函数积累的数据应以以下 HTML 格式呈现:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251"/>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Новини от Дарик</title>
</head>
<body>
<?php getTheData($item);?>
</body>
</html>
如您所见,我添加了 windows-1251(西里尔文)和 utf-8 编码,但如果我不将浏览器编码更改为 utf-8,则 RSS 提要无法读取。在我的例子中,默认编码是西里尔语,但我得到的提要不可读。任何使此 RSS 能够以西里尔文(来自保加利亚)可读的帮助将不胜感激。
Consider the following PHP code for getting RSS news on a site I'm developing:
<?php
$url = "http://dariknews.bg/rss.php";
$xml = simplexml_load_file($url);
$feed_title = $xml->channel->title;
$feed_description = $xml->channel->description;
$feed_link = $xml->channel->link;
$item = $xml->channel->item;
function getTheData($item){
for ($i = 0; $i < 4; $i++) {
$article_title = $item[$i]->title;
$article_description = $item[$i]->description;
$article_link = $item[$i]->link;
echo "<p><h3><a href=".$article_link.">". $article_title. "</a></h3></p><small>".$article_description."</small><p>";
}
}
?>
The data accumulated by this function should be presented in the following HTML format:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251"/>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Новини от Дарик</title>
</head>
<body>
<?php getTheData($item);?>
</body>
</html>
As you see I added windows-1251(cyrillic) and utf-8 encoding but the RSS feed is unreadable if I don't change the browser encoding to utf-8. The default encoding in my case is cyrilic but I get unreadable feed. Any help making this RSS readable in cyrilic(it's from Bulgaria) will be greatly appreciated.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我刚刚测试了您的代码,当我删除
charset=windows-1251
元标记并只保留 UTF-8 标记时,保加利亚语字符显示正常。想尝试一下看看是否有效?另外,您可能需要更改
标记以反映您的页面采用保加利亚语的事实,如下所示:
强制 Web 服务器以 UTF-8 形式发送内容:
或者您可能需要通过发送 Content-Type 标头来 请务必在之前包含此内容任何其他内容(甚至空格)都会发送到浏览器。如果不这样做,您将收到 PHP“标头已发送”错误。
I've just tested your code and the Bulgarian characters displayed fine when I removed the
charset=windows-1251
meta tag and just left the UTF-8 one. Want to try that and see if it works?Also, you might want to change your
<html>
tag to reflect the fact that your page is in Bulgarian like this:<html xmlns="http://www.w3.org/1999/xhtml" lang="bg" xml:lang="bg">
Or maybe you need to force the web server to send the content as UTF-8 by sending a Content-Type header:
Just be sure to include this before ANY other content (even whitespace) is sent to the browser. If you don't you'll get the PHP "headers already sent" error.
也许你应该看看 htmlentities。
这样可以将一些字符转换为html。
Maybe you should take a look at htmlentities.
This can convert to html some characters.