Question mark characters are displayed within text. Why is this?

Posted 2024-07-07 14:43:32

I have a backup server that automatically backs up my live site, both files and database.

On the live site, the text looks fine, but when you view the mirrored version of it, it displays '?' within some of the text. This text is stored within the news database table.

Here is a screenshot of it being on the live server and of it on the mirrored server.

What could happen within the process of backing it up to the mirrored server?

[Screenshot: the same text as shown on the live server and on the mirrored server]

The live server is Solaris, and the mirrored server is Red Hat Linux 5.

Comments (9)

疑心病 2024-07-14 14:43:32

The following articles will be useful:

10.3 Specifying Character Sets and Collations

10.4 Connection Character Sets and Collations

After you connect to the database, issue the following command:

SET NAMES 'utf8';
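
As a rough illustration of why the connection character set matters, here is a small Python sketch (not MySQL itself). The sample string and the choice of Latin-1 as the mismatched character set are assumptions made for the example. When text is converted to a character set that lacks some of its characters, the converter typically substitutes a literal question mark:

```python
# Hypothetical news text containing curly quotes and an en dash,
# none of which exist in ISO-8859-1 (Latin-1).
text = "\u201cBreaking news\u201d \u2013 caf\u00e9"


def to_charset(s: str, charset: str) -> bytes:
    """Convert text to the target charset, substituting '?' for
    characters the charset cannot represent."""
    return s.encode(charset, errors="replace")


latin1_bytes = to_charset(text, "latin-1")
utf8_bytes = to_charset(text, "utf-8")

print(latin1_bytes.decode("latin-1"))  # curly quotes and dash become '?'
print(utf8_bytes.decode("utf-8"))      # round-trips unchanged
```

The accented letter survives in both cases because Latin-1 contains it; only the characters missing from the target character set are lost, which matches question marks appearing in just "some of the text".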

Ensure that your web page also uses the UTF-8 encoding:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

PHP also offers several functions that will be useful for conversions:

有深☉意 2024-07-14 14:43:32

Edit your Apache configuration file on the "mirror" server (the server with the problem), and comment-out the following line:

AddDefaultCharset UTF-8

Then restart Apache:

service httpd restart

The problem is that the "AddDefaultCharset UTF-8" line overrides the Content-Type specified in the .html files; e.g.:

<meta http-equiv=Content-Type content="text/html; charset=windows-1252">

The most common symptom is that character codes above 127 display as black diamonds with question marks on them (in Chrome, Safari or Firefox), or as little boxes (in Internet Explorer and Opera).
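
The symptom can be reproduced outside the browser. The following Python sketch is only an approximation of what a browser does, and the sample text is made up. It takes bytes written as Windows-1252 and decodes them as UTF-8, which is what happens when the server's header says UTF-8 but the file is really Windows-1252:

```python
# "Smart" quotes as Windows-1252 bytes: 0x93 and 0x94 are both above 127.
page_bytes = "\u201cHello\u201d".encode("cp1252")

# Decoding with the charset the file was actually written in works.
as_cp1252 = page_bytes.decode("cp1252")

# Decoding as UTF-8 fails for those bytes; with errors="replace" each
# invalid byte becomes U+FFFD, the replacement character that browsers
# usually draw as a black diamond with a question mark.
as_utf8 = page_bytes.decode("utf-8", errors="replace")

print(as_cp1252)
print(as_utf8)
```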

HTML files generated by Microsoft Word usually have many such characters, the most common one being character code 160 = 0xA0, which is equivalent to "&nbsp;" (a non-breaking space) in the Windows-1252 encoding, and is often found between span tags, like this:

<span style="mso-spacerun: yes">ááá </span>

少女的英雄梦 2024-07-14 14:43:32

I got here looking for a solution for JavaScript text displayed in the browser. Although this is not directly related to a database, here is what I found.

In my case I copied and pasted some text I found on the Internet into a JavaScript file and saved it with Windows Notepad.

When the page that uses that JavaScript file output the strings, there were question marks (like the ones shown in the question) instead of special characters such as accented letters.

I opened the file using Notepad++. Right after opening it, I saw that the character encoding was set to ANSI, as you can see (mouse cursor on footer) in the following screenshot:

[Screenshot: Notepad++ status bar showing the file encoding as ANSI]

To solve the issue, click the Encoding menu in Notepad++ and select Encode in UTF-8. You should be good to go. :)
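
If there are many files to fix, the same conversion can be scripted. This is a minimal Python sketch, assuming that "ANSI" on the machine in question means Windows-1252 (that is common on Western-language Windows installs but not guaranteed). The helper name `reencode_to_utf8` and the file name `strings.js` are made up for the example:

```python
from pathlib import Path
import tempfile


def reencode_to_utf8(path: Path, source_encoding: str = "cp1252") -> None:
    """Rewrite the file at `path` as UTF-8, assuming it is currently
    stored in `source_encoding`."""
    text = path.read_text(encoding=source_encoding)
    path.write_text(text, encoding="utf-8")


# Demonstration on a temporary file so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "strings.js"
    script.write_bytes('var msg = "caf\u00e9";'.encode("cp1252"))
    reencode_to_utf8(script)
    result = script.read_bytes()

print(result)
```

Run it on a copy first: if the file is not actually in the assumed source encoding, the rewrite will bake the wrong characters into the file.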

新一帅帅 2024-07-14 14:43:32

This is going to be something to do with character encodings.

Are you sure the mirrored site has the same properties with regard to character encodings as your main server?

Depending on what sort of server you have, this may be a property of the server process itself, or it could be an environment variable.

For example, if this is a UNIX environment, perhaps try comparing LANG or LC_ALL?
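
One way to make that comparison is to print the relevant settings on each machine and diff the output. This is a small Python sketch; the function name `locale_report` is hypothetical. It only reflects the environment of the process that runs it, so it would need to run under the same user and startup environment as the web or database server to be meaningful:

```python
import locale
import os


def locale_report() -> dict:
    """Collect the locale-related settings visible to this process."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        # False: report the current setting without changing the locale.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


for name, value in locale_report().items():
    print(f"{name}={value}")
```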

旧人 2024-07-14 14:43:32

Unicode or other character set characters falling through?

I have often seen similar "strange" characters show up on sites I have worked on when the text is copied from an email or some other document format (e.g. Word) into a text editor. The editor can display the non-ASCII characters but the browser can't. For the website, I would suggest looking up the HTML entity code for the character and inserting that instead ... or switching to more standard characters.

不奢求什么 2024-07-14 14:43:32

Your browser hasn't interpreted the encoding of the page correctly (either because you've forced it to a particular setting, or the page is set incorrectly), and thus cannot display some of the characters.

愿与i 2024-07-14 14:43:32

Check the character set being emitted by your mirrored server. It appears to differ from that of the main server -- the live site appears to be outputting Unicode, whereas the mirror is not. Also, it's usually a good idea to scrub Unicode characters in your incoming content and replace them with their appropriate HTML entities.

Your specific issue involves "smart quotes," "em dashes" and "en dashes." I know you can replace em dashes with &mdash; and en dashes with &ndash; (which should be done on the input side of your database); I don't know what the correct replacement for the smart quotes would be. (I usually just replace all curly single quotes with ' and all curly double quotes with " ... Typography geeks may feel free to shoot me on sight.)

I should note that some browsers are more forgiving than others with this issue -- Internet Explorer on Windows tends to auto-magically detect and "fix" this; Firefox and most other browsers display the question marks.
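
For the entity replacement suggested above, here is a minimal Python sketch. It emits numeric character references rather than named entities such as the ones for dashes, and the function name `to_html_entities` and the sample string are made up for the example:

```python
def to_html_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric character
    reference, leaving ASCII untouched."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Curly double quotes, an em dash and an en dash, written as escapes.
sample = "\u201cQuoted\u201d \u2014 and \u2013"
print(to_html_entities(sample))
```

Numeric references render the same regardless of the charset the server announces, which is why this sidesteps the mismatch; the trade-off is that the stored text is no longer plain Unicode.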

无悔心 2024-07-14 14:43:32

I had this issue so I just took all my content, copy/pasted it into Notepad, made a new PHP file, pasted back in, re-saved and overwrote, and.. that worked!

It really was some relic of Microsoft Word editing...

口干舌燥 2024-07-14 14:43:32

I usually curse MS Word and then run the following Windows Script Host script.

// Replace with the path to a file that needs cleaning
var PATH = "test.html";

// Read the whole input file and create "clean-<PATH>" for the output
var go = WScript.CreateObject("Scripting.FileSystemObject");
var content = go.GetFile(PATH).OpenAsTextStream().ReadAll();
var out = go.CreateTextFile("clean-" + PATH, true);

// Symbols: plain ASCII for quotes and dashes, HTML entities for the rest
content = content.replace(/“/g, '"');
content = content.replace(/”/g, '"');
content = content.replace(/’/g, "'");
content = content.replace(/–/g, "-");
content = content.replace(/©/g, "&copy;");
content = content.replace(/®/g, "&reg;");
content = content.replace(/°/g, "&deg;");
content = content.replace(/¶/g, "<p>");
content = content.replace(/¿/g, "&iquest;");
content = content.replace(/¡/g, "&iexcl;");
content = content.replace(/¢/g, "&cent;");
content = content.replace(/£/g, "&pound;");
content = content.replace(/¥/g, "&yen;");

out.Write(content);
out.Close();