Having trouble storing UTF-8 in NVarChar in SQL Server 2008

Posted 2024-11-02 08:51:00

I'm pulling data from a web site using System.Net.WebClient, and when the data comes back everything parses and looks good except letters with accents. For example, when it returns an é, SQL Server 2008 saves it as garbled text (Ã©).

Just need to figure out how to convert these UTF-8 characters into something SQL Server can read. I'm storing it in an NVARCHAR(MAX) datatype.

I'm using Linq-to-SQL to insert into the database if you were curious.

Any thoughts on what I could do to convert it to the proper format?

Comments (2)

毁我热情 2024-11-09 08:51:00

Figured it out! When using the WebClient class, I was downloading the data as a string.

My Original Configuration...

System.Net.WebClient wc = new WebClient();
string htmlData = wc.DownloadString(myUri);

I tried to convert this data to UTF-16 from its current string, but since .NET strings are already UTF-16, DownloadString had handled the conversion on its own, so by that point the accented characters were already garbled.

Instead, I switched my approach to reading the actual byte[] array from the data like so...

using System.Net;
using System.Text;

WebClient wc = new WebClient();
string htmlData = UTFConvert(wc.DownloadData(myUri));

// Decode the raw response bytes as UTF-8; the result is an ordinary .NET (UTF-16) string.
// Equivalent to Encoding.Convert(UTF8 -> Unicode) followed by Encoding.Unicode.GetString.
private string UTFConvert(byte[] utfBytes)
{
    return Encoding.UTF8.GetString(utfBytes);
}

This fixed the problem, and SQL Server now correctly sees the accents in everything. Yippee.
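For anyone wondering why reading the bytes made the difference, here is a minimal sketch of the effect. It assumes the page was served as UTF-8 and that the bad string came from decoding those bytes with a single-byte encoding; ISO-8859-1 is used here only as a stand-in for whatever encoding was actually picked. The class and method names (MojibakeDemo, DecodeAsUtf8, DecodeAsLatin1) are hypothetical, made up for the illustration.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Decodes the raw bytes as UTF-8, which is what the page is assumed to use.
    public static string DecodeAsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Decodes the same bytes as ISO-8859-1 (assumption: a stand-in for the
    // wrong single-byte encoding), so each byte becomes its own character.
    public static string DecodeAsLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(bytes);
    }

    public static void Main()
    {
        byte[] raw = { 0xC3, 0xA9 }; // the two UTF-8 bytes that encode "é"

        string good = DecodeAsUtf8(raw);   // one character, U+00E9
        string bad = DecodeAsLatin1(raw);  // two characters, U+00C3 U+00A9

        Console.WriteLine(good.Length); // 1
        Console.WriteLine(bad.Length);  // 2
    }
}
```

Once the wrong decoding has happened, the string really does contain two characters, and NVARCHAR stores them faithfully. Setting wc.Encoding = Encoding.UTF8 before calling DownloadString may be another way to get the same result, though I have not tried it against this site.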

Cheers all, and thanks for your help!

给不了的爱 2024-11-09 08:51:00

See Description of storing UTF-8 data in SQL Server. There is also a discussion of this topic in International Features in Microsoft SQL Server 2005. The gist of it is: SQL Server has no support for UTF-8. Feel free to upvote the request to Add support for storing UTF-8 natively in SQL Server.

As a note, though: since you store a Unicode string via LINQ, this suggests that the problem occurs before writing to SQL Server, namely in your web pull. Does it convert the data it reads using a UTF-8 reader? That is, do you read WebResponse.GetResponseStream() via a StreamReader constructed with the appropriate UTF8Encoding? That should create the proper Unicode string, and then the NVARCHAR storage in the DB (which is UCS-2) should be fine.
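A rough sketch of the StreamReader approach, in case it helps. To keep it runnable without a network call, a MemoryStream stands in for the stream returned by WebResponse.GetResponseStream(); the ResponseReader class and ReadAllAsUtf8 method are hypothetical names, and the sketch assumes the response body really is UTF-8.

```csharp
using System;
using System.IO;
using System.Text;

public static class ResponseReader
{
    // Reads the whole stream as text, decoding its bytes as UTF-8.
    public static string ReadAllAsUtf8(Stream responseStream)
    {
        using (var reader = new StreamReader(responseStream, new UTF8Encoding()))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Stand-in for WebResponse.GetResponseStream(): the UTF-8 bytes of "café".
        byte[] body = { 0x63, 0x61, 0x66, 0xC3, 0xA9 };

        using (var stream = new MemoryStream(body))
        {
            string text = ReadAllAsUtf8(stream);
            Console.WriteLine(text.Length); // 4: the two bytes C3 A9 became one character
        }
    }
}
```

The resulting string is ordinary .NET Unicode text, so it can be handed to LINQ-to-SQL and stored in the NVARCHAR column without any further conversion.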
