t-sql“喜欢”和模式匹配
我发现了一个小烦恼,我想知道如何解决...
在一个简化的示例中,假设我需要返回“TEST B-19”和“TEST B-20”,
我有一个 where 子句,如下所示:
where [Name] LIKE 'TEST B-[12][90]'
它有效...除非有我不想要的“TEST B-10”或“TEST-B29”值。
我宁愿不采取这两种情况,因为在更复杂的情况下,这会变得令人望而却步。
我尝试过:
where [Name] LIKE 'TEST B-[19-20]'
但是这当然行不通,因为它正在寻找单个字符......
有什么想法吗?同样,这是一个非常简单的示例,我正在寻找方法来获取从 16 到 32 或 234 到 459 的范围,而不获取所有可以创建的额外值。
编辑以包括测试示例...
您可能会在该字段中看到“TEXAS 22”或“THX 99-20-110-B6”或“E-19”或“SOUTHERN B”或“122 FLOWERS”。数字的出现是常见的,但不是一个固定的规则,并且连字符、数字、字符、顺序等绝对没有通用模式。
I've found a small annoyance that I was wondering how to get around...
In a simplified example, say I need to return "TEST B-19" and "TEST B-20"
I have a where clause that looks like:
where [Name] LIKE 'TEST B-[12][90]'
and it works... unless there's a "TEST B-10" or "TEST-B29" value that I don't want.
I'd rather not resort to doing both cases, because in more complex situations that would become prohibitive.
I tried:
where [Name] LIKE 'TEST B-[19-20]'
but of course that doesn't work because it is looking for single characters...
Thoughts? Again, this is a very simple example, I'd be looking for ways to grab ranges from 16 to 32 or 234 to 459 without grabbing all the extra values that could be created.
EDITED to include test examples...
You might see "TEXAS 22" or "THX 99-20-110-B6" or "E-19" or "SOUTHERN B" or "122 FLOWERS" in that field. The presense of digits is common, but not a steadfast rule, and there are absolutely no general patterns for hypens, digits, characters, order, etc.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
我会将
Name
列分为文本部分和数字部分,并将数字部分转换为整数,然后检查该部分是否在值之间。类似于:当然,如果 [Name] 的可能结构要复杂得多,您必须计算 7 和 2 的值,而不是对它们进行硬编码......
编辑:如果您想过滤掉首先,如果不符合模式,请执行以下操作:
也许使用 CHARINDEX 代替最上面第二行中的 LIKE 会更快,特别是如果您在计算值上放置索引,但是...这只是优化...:)
编辑:测试了程序。给定以下数据:
查询返回以下行:
I would divide the
Name
column into the text parts and the number parts, and convert the number parts into an integer, and then check if that one was between the values. Something like:And, of course, if the possible structure of [Name] is much more complex, you'd have to calculate the values for 7 and 2, not hardcode them....
EDIT: If you want to filter out the ones not conforming to the pattern first, do the following:
Maybe it's faster using
CHARINDEX
in place of the LIKE in the topmost line two, especially if you put an index on the computed value, but... That is only optimization... :)EDIT: Tested the procedure. Given the following data:
The query returns the following rows:
无论是否使用通配符,每次想要更改范围定义时,您仍然必须编辑查询。如果您始终处理一个范围(并且并不总是相同的范围),则可以使用参数。例如:
注意:由于某种原因(这在许多其他帖子中也发生过),当我尝试发布以“declare”开头的代码时,SO 挂起并超时。我已经在元上报告了它,但没有人可以复制它(包括我)。这里又发生了,所以我把“D”取下来,现在它可以工作了。我明天回来,它会让我把“D”放回去。
您应该避免像其他人建议的那样格式化 [NAME](使用其上的函数)——这样,您的搜索可以从其索引中受益。
无论如何——您可能会重新考虑您的表结构。听起来“TEST B-19”是类别(“TEST”)+子类别(“B”)+实例(“19”)的复合(非标准化)值。将其放入具有 4 列的查找表中(id 为第一列),然后在任何需要输出复合值的查询中通过 id 将其连接起来。这将使搜索和索引变得更加容易和更快。
Wildcards or no, you still have to edit the query every time you want to change the range definition. If you're always dealing with a range (and it's not always the same range), you might use parameters. For example:
note: for some reason (this has happened in many other posts as well), when I try to post code beginning with 'declare', SO hangs and times-out. I reported it on meta already, but nobody could reproduce it (including me). Here it's happening again, so I took the 'D' off, and now it works. I'll come back tomorrow, and it will let me put the 'D' back on.
You should avoid formatting [NAME] as others have suggested (using function on it) -- this way, your search can benefit from an index on it.
In any case -- you might re-consider your table structure. It sounds like 'TEST B-19' is a composite (non-normalized) value of category ('TEST') + sub-category ('B') + instance ('19'). Put it in a lookup table with 4 columns (id being the first), and then join it by id in whatever query needs to output the composite value. This will make searching and indexing much easier and faster.
在没有测试数据的情况下,我生成了自己的数据。我刚刚删除了
Test B-
前缀,转换为 int 并做了一个Between
这给了我
这意味着,但是,如果您有不同的命名测试策略,您会必须删除所有不同类型的前缀。
现在,另一方面,如果您的测试编号采用
TEST-Space-TestType-Hyphen-TestNumber
格式,您可以使用 PatIndex 和 SubString这应该产生以下
所有示例似乎都有测试编号在最后。因此,如果您可以创建一个模式表,然后使用 LIKE 语句进行 JOIN,那么您也许可以使其工作。这是一个例子:
这会给你
In the absence of test data, I generated my own. I just removed the
Test B-
prefix, converted to int and did aBetween
This gave me
This means, however, that if you have different strategies for naming tests, you would have to remove all different kinds of prefixes.
Now, on the other hand, if your Test numbers are in the
TEST-Space-TestType-Hyphen-TestNumber
format, you could use PatIndex and SubStringThat should yield the following
All of your examples seem to have the test numbers at the end. So if you can create a table of patterns and then JOIN using a LIKE statement, you may be able make it work. Here is an example:
This will give you
这是可以接受的:
值列表可以来自子查询,例如:
Is this acceptable:
The list of values can come from a subquery, e.g.: