SQL Server: using the N (nvarchar) string prefix with varchar fields

Posted on 2024-12-08 15:47:31

I am using a SQL Server 2005/2008 Express database. Are there any problems with using the N string prefix (used for nvarchar fields) for varchar fields?

e.g. if I have a database field:

CREATE TABLE [dbo].[posts](
    post_title varchar(30)
)

And then I insert just ASCII data, but with an N prefix:

INSERT INTO [dbo].[posts] ([post_title]) VALUES (N'My Title');

The problem arises because I want to save UTF-8 characters from a PHP application and I can't currently differentiate whether the field it is being saved to is varchar or nvarchar. So I just want to assume that all are nvarchar given that I will only ever try to save ASCII characters to varchar fields.

Comments (2)

感悟人生的甜 2024-12-15 15:47:31

The accepted answer is misleading, but that is due, in part, to the question itself being ambiguous (though probably not intentionally).

Yes, any Unicode string (i.e. a literal prefixed with N, or a variable of type XML or of an N-prefixed type such as NCHAR / NVARCHAR / NTEXT) will be implicitly converted to an 8-bit encoding when stored into a CHAR / VARCHAR / TEXT (don't use this one!) field. BUT, and this can be a rather important distinction in many cases, only Unicode code points in the range of U+0000 to U+007F (i.e. ASCII values 0 - 127) are guaranteed to convert correctly. Everything from U+0080 (i.e. value 128) on up may or may not convert, depending on the Code Page implied by the Collation of the field being inserted into. If the Code Page of that Collation does not have a mapping for that character, then you get a ? instead.
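
As a rough illustration of the point above, the following sketch stores the same Unicode strings into VARCHAR columns with two different collations and then checks whether the value survives a round trip. The temp table, its column names, the sample strings, and the choice of Latin1_General_100_CI_AS and Greek_CI_AS are assumptions made only for this demonstration; what comes back for the non-ASCII rows depends on the code page of each collation, so run it and inspect the results rather than relying on any particular outcome.

```sql
-- Hypothetical scratch table; names and collations are illustrative only.
CREATE TABLE #ConversionDemo
(
  Original   NVARCHAR(30) NOT NULL,
  Latin1Copy VARCHAR(30) COLLATE Latin1_General_100_CI_AS NULL,
  GreekCopy  VARCHAR(30) COLLATE Greek_CI_AS NULL
);

-- The same NVARCHAR value is implicitly converted once per VARCHAR column,
-- using the code page implied by that column's collation.
INSERT INTO #ConversionDemo (Original, Latin1Copy, GreekCopy)
SELECT v.s, v.s, v.s
FROM (
  SELECT N'My Title' AS s            -- 7-bit ASCII only
  UNION ALL SELECT N'café'           -- contains U+00E9
  UNION ALL SELECT N'Ωμέγα'          -- Greek letters
) AS v;

-- A binary comparison shows which stored values still match the original.
SELECT Original,
       Latin1Copy,
       GreekCopy,
       CASE WHEN CONVERT(NVARCHAR(30), Latin1Copy) = Original COLLATE Latin1_General_BIN
            THEN 1 ELSE 0 END AS Latin1RoundTrips,
       CASE WHEN CONVERT(NVARCHAR(30), GreekCopy) = Original COLLATE Latin1_General_BIN
            THEN 1 ELSE 0 END AS GreekRoundTrips
FROM #ConversionDemo;

DROP TABLE #ConversionDemo;
```

The row containing only 7-bit ASCII is the one case that should round-trip under both collations; the other rows are where the column's code page decides what gets stored.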

To find out what the Code Page is exactly, first find the Collation of the field via either of the following two queries:

SELECT * FROM sys.columns WHERE [object_id] = OBJECT_ID(N'table_name');

-- OR:

EXEC sp_help N'table_name';

Then you can find the Code Page from the Collation, using:

SELECT COLLATIONPROPERTY('collation_name', 'CodePage');
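
The two lookups can also be combined into a single query. This is a minimal sketch that assumes the dbo.posts table from the question; substitute your own table name. The column aliases are made up for readability.

```sql
-- List every character column of the table together with its collation
-- and the code page implied by that collation.
SELECT c.[name]                                        AS ColumnName,
       c.collation_name                                AS CollationName,
       COLLATIONPROPERTY(c.collation_name, 'CodePage') AS CodePage
FROM sys.columns AS c
WHERE c.[object_id] = OBJECT_ID(N'dbo.posts')
  AND c.collation_name IS NOT NULL;  -- non-character columns have no collation
```

For NCHAR / NVARCHAR columns the reported code page only matters when their data is later converted to a non-Unicode type.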

And then you can find a chart on any one of several sites, based on that code page number, that will show you what is mapped.

And collations are not per-row, they are per-field. So whatever the Collation is for a field determines the character set for non-Unicode fields (i.e. CHAR / VARCHAR / TEXT).

So the question is: what is meant by the term "ASCII" in the Question? It technically refers to just the 7-bit values (the first 128; values 0 - 127), but people often use it to mean anything that can fit into a single byte, which also includes the Extended ASCII values (the second 128; values 128 - 255), which are dependent on the Code Page.


Regarding the potential issue(s) surrounding comparing a VARCHAR column to NVARCHAR variables and literals: indexes will not be ignored, but there is some negative impact, and that impact varies based on the Collation of the VARCHAR column.

If the column Collation is a SQL Server Collation (i.e. one that starts with SQL_, such as SQL_Latin1_General_CP1_CI_AS), then you can get an Index Scan, but not a Seek.

But, if the column Collation is a Windows Collation (i.e. one that does not start with SQL_, such as Latin1_General_100_CI_AS), then you can get an Index Seek.

The following test shows this behavior:

-- DROP TABLE dbo.VarcharColumnIndex;
CREATE TABLE dbo.VarcharColumnIndex
(
  ID INT IDENTITY(1, 1) NOT NULL CONSTRAINT [PK_VarcharColumnIndex] PRIMARY KEY CLUSTERED,
  SqlServerCollation VARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AS,
  WindowsCollation VARCHAR(50) COLLATE Latin1_General_100_CI_AS
);

CREATE NONCLUSTERED INDEX [IX_VarcharColumnIndex_SqlServerCollation]
  ON dbo.VarcharColumnIndex ([SqlServerCollation]);
CREATE NONCLUSTERED INDEX [IX_VarcharColumnIndex_WindowsCollation]
  ON dbo.VarcharColumnIndex ([WindowsCollation]);

INSERT INTO dbo.VarcharColumnIndex ([SqlServerCollation], [WindowsCollation])
  VALUES ('a', 'b');

DECLARE @a NVARCHAR(50) = N'a';
SELECT [SqlServerCollation] FROM dbo.VarcharColumnIndex WHERE [SqlServerCollation] = @a;
-- Index Scan

DECLARE @b NVARCHAR(50) = N'b';
SELECT [WindowsCollation] FROM dbo.VarcharColumnIndex WHERE [WindowsCollation] = @b;
-- Index Seek
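
If the application cannot avoid passing NVARCHAR values, one possible workaround (not part of the test above, so treat it as a sketch) is to convert the value explicitly to the column's type before comparing, so that both sides of the predicate are VARCHAR and the column itself does not need converting. It reuses the dbo.VarcharColumnIndex table from the test; the variable name is arbitrary. Check the actual execution plan in your own environment, since the optimizer's choice also depends on the data.

```sql
-- Separate DECLARE and SET so the batch also runs on SQL Server 2005.
DECLARE @search NVARCHAR(50);
SET @search = N'a';

-- Converting the variable (not the column) keeps the comparison VARCHAR-to-VARCHAR.
SELECT [SqlServerCollation]
FROM dbo.VarcharColumnIndex
WHERE [SqlServerCollation] = CONVERT(VARCHAR(50), @search);
```

The explicit conversion is subject to the same code page limits described earlier, so it is only safe when the value is known to contain characters that the column's code page can represent, such as the 7-bit ASCII data in the question.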

絕版丫頭 2024-12-15 15:47:31

If you write strings with the N prefix into a varchar field, they will be implicitly converted. There is no other overhead, and you can safely assume "everything is nvarchar".

There may be a problem comparing nvarchar variables to varchar columns because of data type precedence. The varchar column will be converted and no indexes will be used.
