Fixing bad characters caused by an encoding problem
Recently we had an encoding problem in our system:
If we had the string "æ" in our db, it became "Ã¦" on our web pages.
That problem is now solved, but we are left with a lot of "Ã¦" in our database: users didn't notice these characters in pre-filled forms and validated them as they were.
I found that if you read the bytes C3 A6 as UTF-8 you get "æ", and if you read them as ascii you get "Ã¦".
It's strange, because if I execute
"select convert(varbinary(40),N'æ'),convert(varbinary(40),'æ')"
I don't get the same result...
Do you have any idea how I can fix my database (i.e. change all "Ã¦" to "æ")?
thx
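The byte-level behavior described in the question can be reproduced outside the database. The following is a minimal Python sketch, not the poster's code. It assumes that the web layer decoded UTF-8 bytes as windows-1252 (plain ASCII has no character for C3 or A6) and that the varchar literal uses a code page 1252 collation; the question confirms neither.

```python
# Illustration only: reproduces the garbling and the varbinary difference.
original = "æ"  # U+00E6

# UTF-8 stores "æ" as the two bytes C3 A6.
utf8_bytes = original.encode("utf-8")

# Decoding those bytes as windows-1252 (assumed) yields two characters.
garbled = utf8_bytes.decode("cp1252")

# nvarchar (N'æ') holds UTF-16LE; varchar ('æ') is assumed to hold cp1252.
nvarchar_bytes = original.encode("utf-16-le")
varchar_bytes = original.encode("cp1252")

print(utf8_bytes.hex().upper())      # C3A6
print(len(garbled))                  # 2
print(nvarchar_bytes.hex().upper())  # E600
print(varchar_bytes.hex().upper())   # E6
```

Under these assumptions the two convert(varbinary(40), ...) calls would return different bytes simply because nvarchar and varchar store the same character differently, and neither matches the UTF-8 bytes C3 A6.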
Comments (2)
As far as I know, the only way to fix this is to use Replace:
In this case, I'm assuming that the column is now Unicode (i.e. nvarchar or nchar).
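A Replace-based cleanup could look roughly like the sketch below. It uses Python's sqlite3 module with an in-memory database as a stand-in for the real server, and the table name users and column name name are hypothetical. SQL Server also has a REPLACE function, but the exact statement for the poster's schema is not given here.

```python
import sqlite3

# Hypothetical schema; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("LÃ¦rke",), ("Lærke",), ("Anna",)],
)

bad, good = "Ã¦", "æ"

# Rewrite only the rows that actually contain the garbled sequence.
conn.execute(
    "UPDATE users SET name = REPLACE(name, ?, ?) WHERE instr(name, ?) > 0",
    (bad, good, bad),
)
conn.commit()

names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]
```

Each garbled sequence needs its own replacement pair, so this approach suits a small, known set of damaged characters.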
ASCII only assigns characters to the bytes 00-7F. There are, however, several "extended ASCII" encodings in which C3 A6 represents "Ã¦", including the popular Western European encodings ISO-8859-1 and windows-1252, and the Turkish encodings ISO-8859-9 and windows-1254.
To fix your encoding problem, simply:
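One plausible way to carry out such a fix, not necessarily the steps this answer had in mind, is to reverse the wrong decoding: encode the garbled text back to bytes with the encoding that was wrongly applied, then decode those bytes as UTF-8. The sketch below assumes windows-1252 was the wrong encoding; fix_mojibake is a hypothetical helper name.

```python
def fix_mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Undo UTF-8 text that was mistakenly decoded as `wrong_encoding`.

    Strings that do not round-trip cleanly are returned unchanged, so
    values that were never damaged are left alone.
    """
    try:
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not the kind of damage this helper targets: keep the input.
        return text


repaired = fix_mojibake("LÃ¦rke")   # garbled value is repaired
untouched = fix_mojibake("Lærke")   # already-correct value is kept
```

The try/except matters when good and bad rows are mixed in the same column: a correct "æ" encodes to the single byte E6, which is not valid UTF-8 on its own, so the helper falls back to the original string instead of corrupting it.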