SQL Server silently truncates varchars in stored procedures
According to this forum discussion, SQL Server (I'm using 2005, but I gather this also applies to 2000 and 2008) silently truncates any varchars you specify as stored procedure parameters to the declared length of the varchar, even if inserting that string directly using an INSERT would actually cause an error. For example, if I create this table:
CREATE TABLE testTable(
[testStringField] [nvarchar](5) NOT NULL
)
then when I execute the following:
INSERT INTO testTable(testStringField) VALUES(N'string which is too long')
I get an error:
String or binary data would be truncated.
The statement has been terminated.
Great. Data integrity preserved, and the caller knows about it. Now let's define a stored procedure to insert that:
CREATE PROCEDURE spTestTableInsert
@testStringField [nvarchar](5)
AS
INSERT INTO testTable(testStringField) VALUES(@testStringField)
GO
and execute it:
EXEC spTestTableInsert @testStringField = N'string which is too long'
No errors, 1 row affected. A row is inserted into the table, with testStringField as 'strin'. SQL Server silently truncated the stored procedure's varchar parameter.
Now, this behaviour might be convenient at times but I gather there is NO WAY to turn it off. This is extremely annoying, as I want the thing to error if I pass too long a string to the stored procedure. There seem to be 2 ways to deal with this.
First, declare the stored proc's @testStringField parameter as size 6, and check whether its length is over 5. This seems like a bit of a hack and involves irritating amounts of boilerplate code.
Second, just declare ALL stored procedure varchar parameters to be varchar(max), and then let the INSERT statement within the stored procedure fail.
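The first option can be sketched as follows, assuming the testTable from above; the procedure name spTestTableInsertChecked and the error text are hypothetical. The parameter is declared one character wider than the column, so an over-long argument survives the silent truncation as a 6-character value that the procedure can detect:

```sql
-- Hypothetical variant of spTestTableInsert; the column is nvarchar(5).
CREATE PROCEDURE spTestTableInsertChecked
    @testStringField nvarchar(6)   -- one wider than the column on purpose
AS
BEGIN
    -- Anything longer than 6 characters has already been cut to 6 here.
    -- DATALENGTH counts bytes (2 per nvarchar character here) and, unlike
    -- LEN, includes trailing spaces, so N'abcde ' is rejected as well.
    IF DATALENGTH(@testStringField) > 5 * 2
    BEGIN
        RAISERROR(N'@testStringField exceeds 5 characters.', 16, 1);
        RETURN 1;   -- RAISERROR at severity 16 does not stop the procedure
    END

    INSERT INTO testTable(testStringField) VALUES(@testStringField);
    RETURN 0;
END
GO
```

This is the boilerplate the question complains about: the limit now appears in three places (column, parameter size, and check) and all three have to change together.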
The latter seems to work fine, so my question is: is it a good idea to ALWAYS use varchar(max) for strings in SQL Server stored procedures, if I actually want the stored proc to fail when too long a string is passed? Could it even be best practice? The silent truncation that can't be disabled seems stupid to me.
Comments (7)
It just is.
I've never noticed a problem though because one of my checks would be to ensure my parameters match my table column lengths. In the client code too. Personally, I'd expect SQL to never see data that is too long. If I did see truncated data, it'd be bleeding obvious what caused it.
If you do feel the need for varchar(max), beware a massive performance issue because of datatype precedence. varchar(max) has higher precedence than varchar(n) (longest is highest). So in a query that compares a varchar(100) column with a varchar(max) value, you'll get a scan not a seek, and every varchar(100) value is CAST to varchar(max).
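A sketch of the kind of comparison this warning is about, assuming a hypothetical table precedenceDemo with an indexed varchar(100) column. Comparing the actual execution plans of the two SELECTs is one way to check how the optimizer treats each form on a given version:

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE precedenceDemo(
    id int IDENTITY(1,1) PRIMARY KEY,
    code varchar(100) NOT NULL
);
CREATE INDEX IX_precedenceDemo_code ON precedenceDemo(code);
GO

DECLARE @narrow varchar(100);
DECLARE @wide varchar(max);
SET @narrow = 'abc';
SET @wide = 'abc';

-- Same type on both sides: the column needs no conversion.
SELECT id FROM precedenceDemo WHERE code = @narrow;

-- varchar(max) outranks varchar(100), so the column side is implicitly
-- converted; this is the case the answer warns about.
SELECT id FROM precedenceDemo WHERE code = @wide;
```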
Edit:
There is an open Microsoft Connect item regarding this issue.
And it's probably worthy of inclusion in Erland Sommarskog's Strict settings (and matching Connect item).
Edit 2, after Martin's comment:
Thanks, as always, to StackOverflow for eliciting this kind of in-depth discussion. I have recently been scouring through my Stored Procedures to make them more robust using a standard approach to transactions and try/catch blocks. I disagree with Joe Stefanelli that "My suggestion would be to make the application side responsible", and fully agree with Jez: "Having SQL Server verify the string length would be much preferable". The whole point for me of using stored procedures is that they are written in a language native to the database and should act as a last line of defence. On the application side, the difference between 255 and 256 is just a meaningless number, but within the database environment, a field with a maximum size of 255 will simply not accept 256 characters. The application validation mechanisms should reflect the backend db as best they can, but maintenance is hard, so I want the database to give me good feedback if the application mistakenly allows unsuitable data. That's why I'm using a database instead of a bunch of text files with CSV or JSON or whatever.
I was puzzled why one of my SPs threw the 8152 error and another silently truncated. I finally twigged: The SP which threw the 8152 error had a parameter which allowed one character more than the related table column. The table column was set to nvarchar(255) but the parameter was nvarchar(256). So, wouldn't my "mistake" address gbn's concern: "massive performance issue"? Instead of using max, perhaps we could consistently set the table column size to, say, 255 and the SP parameter to just one character longer, say 256. This solves the silent truncation problem and doesn't incur any performance penalty.
Presumably there is some other disadvantage that I haven't thought of, but it seems a good compromise to me.
Update:
I'm afraid this technique is not consistent. Further testing reveals that I can sometimes trigger the 8152 error and sometimes the data is silently truncated. I would be very grateful if someone could help me find a more reliable way of dealing with this.
Update 2:
Please see Pyitoechito's answer on this page.
The same behavior can be seen here:
My suggestion would be to make the application side responsible for validating the input before calling the stored procedure.
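The truncation is not specific to procedure parameters. A minimal sketch, assuming the same nvarchar(5) size as the question, shows a local variable behaving the same way on assignment:

```sql
DECLARE @testStringField nvarchar(5);
SET @testStringField = N'string which is too long';  -- no error, no warning
SELECT @testStringField AS truncatedValue;           -- returns 'strin'
```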
This is probably occurring because the 256th character in the string is white-space.
VARCHARs will truncate trailing white-space on insertion and just generate a warning. So your stored procedure is silently truncating your strings to 256 characters, and your insertion is truncating the trailing white-space (with a warning). It will produce an error when said character is not white-space.
Perhaps a solution would be to make the stored procedure's VARCHAR a suitable length to catch a non-white-space character. VARCHAR(512) would probably be safe enough.
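The white-space explanation can be checked with a small sketch, scaled down to the question's nvarchar(5) column and assuming the default ANSI_WARNINGS ON setting. The procedure name spTestTableInsertPlusOne is hypothetical, and its parameter is one character wider than the column:

```sql
CREATE PROCEDURE spTestTableInsertPlusOne
    @testStringField nvarchar(6)   -- column is nvarchar(5)
AS
INSERT INTO testTable(testStringField) VALUES(@testStringField)
GO

-- Parameter becomes N'abcdef'; the 6th character is not a space,
-- so the INSERT should fail with the truncation error.
EXEC spTestTableInsertPlusOne @testStringField = N'abcdefgh'

-- Parameter becomes N'abcde '; the 6th character is a space, which the
-- INSERT drops as a trailing blank, so N'abcde' is stored without an
-- error even though N'fgh' was lost.
EXEC spTestTableInsertPlusOne @testStringField = N'abcde fgh'
```

A wider parameter only makes the second case less likely; any input with enough white-space after the column limit can still slip through.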
One solution would be to:
varchar(max)
This way your modifications to the existing code are going to be very minimal like in the sample below.
This is the original code:
This is the new code:
Note that if an incoming parameter is longer than the limit, SQL Server will throw an error instead of silently chopping off the string.
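A minimal sketch of one way to get that error, assuming the question's testTable. The procedure name spTestTableInsertStrict and the table-variable approach are illustrative assumptions, not the sample this answer refers to. The parameter is widened to nvarchar(max) and copied into a table variable whose column carries the real limit:

```sql
CREATE PROCEDURE spTestTableInsertStrict
    @testStringField nvarchar(max)          -- accepts any length
AS
BEGIN
    -- Column sized to the real limit; inserting a longer value raises
    -- "String or binary data would be truncated."
    DECLARE @checked TABLE (testStringField nvarchar(5) NOT NULL);
    INSERT INTO @checked(testStringField) VALUES(@testStringField);

    -- The truncation error ends only the failing statement, so this still
    -- runs after a failure, but @checked is then empty and no row is written.
    INSERT INTO testTable(testStringField)
    SELECT testStringField FROM @checked;
END
GO
```

The caller sees the error in either case; wrapping the body in TRY/CATCH or setting XACT_ABORT ON would be one way to stop the procedure at the first failure.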
You could always add an IF statement to your stored procedures that checks the length of each parameter and throws an error if it is greater than the specified length. This is rather time-consuming, though, and would be a pain to maintain if you change the data size.
This isn't the Answer that'll solve your problem today, but it includes a Feature Suggestion for MSSQL to consider adding, that would resolve this issue.
It is important to call this out as a shortcoming of MSSQL, so we may help them resolve it by raising awareness of it.
Here's the formal Suggestion if you'd like to vote on it:
https://feedback.azure.com/forums/908035-sql-server/suggestions/38394241-request-for-new-rule-string-truncation-error-for
I share your frustration.
The whole point of setting Character-Size on Parameters is so other Developers will instantly know
what the Size Limits are (via Intellisense) when passing in Data.
This is like having your documentation baked right into the Sproc's Signature.
Look, I get it, Implicit-Conversion during Variable Assignments is the culprit.
Still, there is no good reason to expend this amount of energy battling scenarios
where you are forced to work around this feature.
If you ask me, Sprocs and Functions should have the same engine-rules in place,
for Assigning Parameters, that are used when Populating Tables. Is this really too much to ask?
All these suggestions to use Larger Character-Limits
and then adding Validation for EACH Parameter in EVERY Sproc is ridiculous.
I know it's the only way to ensure Truncation is avoided, but really MSSQL?
I don't care if it's ANSI/ISO Standard or whatever, it's dumb!
When Values are too long - I want my code to break - every time.
It should be: Do not pass go, and fix your code.
You could have multiple truncation bugs festering for years and never catch them.
What happened to ensuring your Data-Integrity?
It's dangerous to assume your SQL Code will only ever be called after all Parameters are Validated.
I try to add the same Validation to both my Website and in the Sproc it calls,
and I still catch Errors in my Sproc that slipped past the website. It's a great sanity-check!
What if you want to re-use your Sproc for a WebSite/WebService and also have it called from other
Sprocs/Jobs/Deployment/Ad-Hoc Scripts (where there is no front-end to Validate Parameters)?
MSSQL Needs a "NO_TRUNC" Option to Enforce this on any Non-Max String Variable (even those used as Parameters for Sprocs and Functions).
It could be Connection/Session-Scoped:
(like how the "TRANSACTION ISOLATION LEVEL READ UNCOMMITTED" Option affects all Queries)
Or focused on a Single Variable:
(like how "NOLOCK" is a Table Hint for just 1 Table).
Or a Trace-Flag or Database Property you turn on to apply this to All Sproc/Function Parameters in the Database.
I'm not asking to upend decades of Legacy Code.
Just asking MS for the option to better manage our Databases.