t-sql“喜欢”和模式匹配

发布于 2024-12-01 05:16:26 字数 600 浏览 3 评论 0原文

我发现了一个小烦恼,我想知道如何解决...

在一个简化的示例中,假设我需要返回“TEST B-19”和“TEST B-20”,

我有一个 where 子句,如下所示:

where [Name] LIKE 'TEST B-[12][90]'

它有效...除非有我不想要的“TEST B-10”或“TEST-B29”值。

我宁愿不采取这两种情况,因为在更复杂的情况下,这会变得令人望而却步。

我尝试过:

where [Name] LIKE 'TEST B-[19-20]'

但是这当然行不通,因为它正在寻找单个字符......

有什么想法吗?同样,这是一个非常简单的示例,我正在寻找方法来获取从 16 到 32 或 234 到 459 的范围,而不获取所有可以创建的额外值。

编辑以包括测试示例...

您可能会在该字段中看到“TEXAS 22”或“THX 99-20-110-B6”或“E-19”或“SOUTHERN B”或“122 FLOWERS”。数字的出现是常见的,但不是一个固定的规则,并且连字符、数字、字符、顺序等绝对没有通用模式。

I've found a small annoyance that I was wondering how to get around...

In a simplified example, say I need to return "TEST B-19" and "TEST B-20"

I have a where clause that looks like:

where [Name] LIKE 'TEST B-[12][90]'

and it works... unless there's a "TEST B-10" or "TEST-B29" value that I don't want.

I'd rather not resort to doing both cases, because in more complex situations that would become prohibitive.

I tried:

where [Name] LIKE 'TEST B-[19-20]'

but of course that doesn't work because it is looking for single characters...

Thoughts? Again, this is a very simple example, I'd be looking for ways to grab ranges from 16 to 32 or 234 to 459 without grabbing all the extra values that could be created.

EDITED to include test examples...

You might see "TEXAS 22" or "THX 99-20-110-B6" or "E-19" or "SOUTHERN B" or "122 FLOWERS" in that field. The presense of digits is common, but not a steadfast rule, and there are absolutely no general patterns for hypens, digits, characters, order, etc.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

木格 2024-12-08 05:16:26

我会将 Name 列分为文本部分和数字部分,并将数字部分转换为整数,然后检查该部分是否在值之间。类似于:

where cast(substring([Name], 7, 2) as integer) between 19 and 20

当然,如果 [Name] 的可能结构要复杂得多,您必须计算 7 和 2 的值,而不是对它们进行硬编码......

编辑:如果您想过滤掉首先,如果不符合模式,请执行以下操作:

where [Name] LIKE '%TEST B-__%'
and cast(substring([Name], CHARINDEX('TEST B-', [Name]) + LEN('TEST B-'), 2) as integer) between 19 and 20

也许使用 CHARINDEX 代替最上面第二行中的 LIKE 会更快,特别是如果您在计算值上放置索引,但是...这只是优化...:)

编辑:测试了程序。给定以下数据:

jajajajajajajTEST B-100
jajajajajajajTEST B-85
jajajajjTEST B-100
jajjajajTEST B-100
jajajajajajajTEST B-00
jajajajaTEST B-100
jajajajajajajEST B-99
jajajajajajajTEST B-100
jajajajajajajTEST B-19
jajajajjTEST B-100
jajjajajTEST B-120
jajajajajajajTEST B-00
jajajajaTEST B-150
jajajajajajajEST B-20
TEST B-20asdfh asdfkh

查询返回以下行:

jajajajajajajTEST B-19
TEST B-20asdfh asdfkh

I would divide the Name column into the text parts and the number parts, and convert the number parts into an integer, and then check if that one was between the values. Something like:

where cast(substring([Name], 7, 2) as integer) between 19 and 20

And, of course, if the possible structure of [Name] is much more complex, you'd have to calculate the values for 7 and 2, not hardcode them....

EDIT: If you want to filter out the ones not conforming to the pattern first, do the following:

where [Name] LIKE '%TEST B-__%'
and cast(substring([Name], CHARINDEX('TEST B-', [Name]) + LEN('TEST B-'), 2) as integer) between 19 and 20

Maybe it's faster using CHARINDEX in place of the LIKE in the topmost line two, especially if you put an index on the computed value, but... That is only optimization... :)

EDIT: Tested the procedure. Given the following data:

jajajajajajajTEST B-100
jajajajajajajTEST B-85
jajajajjTEST B-100
jajjajajTEST B-100
jajajajajajajTEST B-00
jajajajaTEST B-100
jajajajajajajEST B-99
jajajajajajajTEST B-100
jajajajajajajTEST B-19
jajajajjTEST B-100
jajjajajTEST B-120
jajajajajajajTEST B-00
jajajajaTEST B-150
jajajajajajajEST B-20
TEST B-20asdfh asdfkh

The query returns the following rows:

jajajajajajajTEST B-19
TEST B-20asdfh asdfkh
多情出卖 2024-12-08 05:16:26

无论是否使用通配符,每次想要更改范围定义时,您仍然必须编辑查询。如果您始终处理一个范围(并且并不总是相同的范围),则可以使用参数。例如:

注意:由于某种原因(这在许多其他帖子中也发生过),当我尝试发布以“declare”开头的代码时,SO 挂起并超时。我已经在元上报告了它,但没有人可以复制它(包括我)。这里又发生了,所以我把“D”取下来,现在它可以工作了。我明天回来,它会让我把“D”放回去。

DECLARE @min varchar(5)
DECLARE @max varchar(5)

SET @min = 'B-19'
SET @max = 'B-20'

SELECT
   ...
WHERE NAME BETWEEN @min AND @max

您应该避免像其他人建议的那样格式化 [NAME](使用其上的函数)——这样,您的搜索可以从其索引中受益。

无论如何——您可能会重新考虑您的表结构。听起来“TEST B-19”是类别(“TEST”)+子类别(“B”)+实例(“19”)的复合(非标准化)值。将其放入具有 4 列的查找表中(id 为第一列),然后在任何需要输出复合值的查询中通过 id 将其连接起来。这将使搜索和索引变得更加容易和更快。

Wildcards or no, you still have to edit the query every time you want to change the range definition. If you're always dealing with a range (and it's not always the same range), you might use parameters. For example:

note: for some reason (this has happened in many other posts as well), when I try to post code beginning with 'declare', SO hangs and times-out. I reported it on meta already, but nobody could reproduce it (including me). Here it's happening again, so I took the 'D' off, and now it works. I'll come back tomorrow, and it will let me put the 'D' back on.

DECLARE @min varchar(5)
DECLARE @max varchar(5)

SET @min = 'B-19'
SET @max = 'B-20'

SELECT
   ...
WHERE NAME BETWEEN @min AND @max

You should avoid formatting [NAME] as others have suggested (using function on it) -- this way, your search can benefit from an index on it.

In any case -- you might re-consider your table structure. It sounds like 'TEST B-19' is a composite (non-normalized) value of category ('TEST') + sub-category ('B') + instance ('19'). Put it in a lookup table with 4 columns (id being the first), and then join it by id in whatever query needs to output the composite value. This will make searching and indexing much easier and faster.

方觉久 2024-12-08 05:16:26

在没有测试数据的情况下,我生成了自己的数据。我刚刚删除了 Test B- 前缀,转换为 int 并做了一个 Between

With Numerals As
(
    Select top 100 row_number() over (order by name)  TestNumeral
    from sys.columns
),
TestNumbers AS
(
    Select 'TEST B-' + Convert (VarChar, TestNumeral) TestNumber
    From Numerals
)
Select * 
From TestNumbers
Where Cast (Replace (TestNumber, 'TEST B-', '') as Integer) between 1 and 16

这给了我

TestNumber
-------------------------------------
TEST B-1
TEST B-2
TEST B-3
TEST B-4
TEST B-5
TEST B-6
TEST B-7
TEST B-8
TEST B-9
TEST B-10
TEST B-11
TEST B-12
TEST B-13
TEST B-14
TEST B-15
TEST B-16

这意味着,但是,如果您有不同的命名测试策略,您会必须删除所有不同类型的前缀。

现在,另一方面,如果您的测试编号采用 TEST-Space-TestType-Hyphen-TestNumber 格式,您可以使用 PatIndex 和 SubString

With Numerals As
(
    Select top 100 row_number() over (order by name)  TestNumeral
    from sys.columns
),
TestNumbers AS
(
    Select 'TEST B-' + Convert (VarChar, TestNumeral) TestNumber
    From Numerals
    Where TestNumeral Between 10 and 19
    UNION
    Select 'TEST A-' + Convert (VarChar, TestNumeral) TestNumber
    From Numerals
    Where TestNumeral Between 20 and 29
)
Select *
From TestNumbers
Where Cast (SubString (TestNumber, PATINDEX ('%-%', TestNumber)+1, Len (TestNumber) - PATINDEX ('%-%', TestNumber)) as Integer) between 16 and 26

这应该产生以下

TestNumber
-------------------------------------
TEST A-20
TEST A-21
TEST A-22
TEST A-23
TEST A-24
TEST A-25
TEST A-26
TEST B-16
TEST B-17
TEST B-18
TEST B-19

所有示例似乎都有测试编号在最后。因此,如果您可以创建一个模式表,然后使用 LIKE 语句进行 JOIN,那么您也许可以使其工作。这是一个例子:

;
With TestNumbers As
(
      select 'E-1' TestNumber
union select 'E-2'
union select 'E-3'
union select 'E-4'
union select 'E-5'
union select 'E-6'
union select 'E-7'
union select 'SOUTHERN B1'
union select 'SOUTHERN B2'
union select 'SOUTHERN B3'
union select 'SOUTHERN B4'
union select 'SOUTHERN B5'
union select 'SOUTHERN B6'
union select 'SOUTHERN B7'
union select 'Southern CC'
union select 'Southern DD'
union select 'Southern EE'
union select 'TEST B-1'
union select 'TEST B-2'
union select 'TEST B-3'
union select 'TEST B-4'
union select 'TEST B-5'
union select 'TEST B-6'
union select 'TEST B-7'
union select 'TEXAS 1'
union select 'TEXAS 2'
union select 'TEXAS 3'
union select 'TEXAS 4'
union select 'TEXAS 5'
union select 'TEXAS 6'
union select 'TEXAS 7'
union select 'THX 99-20-110-B1'
union select 'THX 99-20-110-B2'
union select 'THX 99-20-110-B3'
union select 'THX 99-20-110-B4'
union select 'THX 99-20-110-B5'
union select 'THX 99-20-110-B6'
union select 'THX 99-20-110-B7'
union select 'Southern AA'
union select 'Southern CC'
union select 'Southern DD'
union select 'Southern EE'
),
Prefixes as
(
    Select 'TEXAS ' TestPrefix
    Union Select 'THX 99-20-110-B'
    Union Select 'E-'
    Union Select 'SOUTHERN B'
    Union Select 'TEST B-'
)
Select TN.TestNumber
From TestNumbers TN, Prefixes P
Where 1=1 
And TN.TestNumber Like '%' + P.TestPrefix + '%'
And Cast (REPLACE (Tn.TestNumber, p.TestPrefix, '') AS INTEGER) between 4 and 6

这会给你

TestNumber
----------------
E-4
E-5
E-6
SOUTHERN B4
SOUTHERN B5
SOUTHERN B6
TEST B-4
TEST B-5
TEST B-6
TEXAS 4
TEXAS 5
TEXAS 6
THX 99-20-110-B4
THX 99-20-110-B5
THX 99-20-110-B6

(15 row(s) affected)

In the absence of test data, I generated my own. I just removed the Test B- prefix, converted to int and did a Between

With Numerals As
(
    Select top 100 row_number() over (order by name)  TestNumeral
    from sys.columns
),
TestNumbers AS
(
    Select 'TEST B-' + Convert (VarChar, TestNumeral) TestNumber
    From Numerals
)
Select * 
From TestNumbers
Where Cast (Replace (TestNumber, 'TEST B-', '') as Integer) between 1 and 16

This gave me

TestNumber
-------------------------------------
TEST B-1
TEST B-2
TEST B-3
TEST B-4
TEST B-5
TEST B-6
TEST B-7
TEST B-8
TEST B-9
TEST B-10
TEST B-11
TEST B-12
TEST B-13
TEST B-14
TEST B-15
TEST B-16

This means, however, that if you have different strategies for naming tests, you would have to remove all different kinds of prefixes.

Now, on the other hand, if your Test numbers are in the TEST-Space-TestType-Hyphen-TestNumber format, you could use PatIndex and SubString

With Numerals As
(
    Select top 100 row_number() over (order by name)  TestNumeral
    from sys.columns
),
TestNumbers AS
(
    Select 'TEST B-' + Convert (VarChar, TestNumeral) TestNumber
    From Numerals
    Where TestNumeral Between 10 and 19
    UNION
    Select 'TEST A-' + Convert (VarChar, TestNumeral) TestNumber
    From Numerals
    Where TestNumeral Between 20 and 29
)
Select *
From TestNumbers
Where Cast (SubString (TestNumber, PATINDEX ('%-%', TestNumber)+1, Len (TestNumber) - PATINDEX ('%-%', TestNumber)) as Integer) between 16 and 26

That should yield the following

TestNumber
-------------------------------------
TEST A-20
TEST A-21
TEST A-22
TEST A-23
TEST A-24
TEST A-25
TEST A-26
TEST B-16
TEST B-17
TEST B-18
TEST B-19

All of your examples seem to have the test numbers at the end. So if you can create a table of patterns and then JOIN using a LIKE statement, you may be able make it work. Here is an example:

;
With TestNumbers As
(
      select 'E-1' TestNumber
union select 'E-2'
union select 'E-3'
union select 'E-4'
union select 'E-5'
union select 'E-6'
union select 'E-7'
union select 'SOUTHERN B1'
union select 'SOUTHERN B2'
union select 'SOUTHERN B3'
union select 'SOUTHERN B4'
union select 'SOUTHERN B5'
union select 'SOUTHERN B6'
union select 'SOUTHERN B7'
union select 'Southern CC'
union select 'Southern DD'
union select 'Southern EE'
union select 'TEST B-1'
union select 'TEST B-2'
union select 'TEST B-3'
union select 'TEST B-4'
union select 'TEST B-5'
union select 'TEST B-6'
union select 'TEST B-7'
union select 'TEXAS 1'
union select 'TEXAS 2'
union select 'TEXAS 3'
union select 'TEXAS 4'
union select 'TEXAS 5'
union select 'TEXAS 6'
union select 'TEXAS 7'
union select 'THX 99-20-110-B1'
union select 'THX 99-20-110-B2'
union select 'THX 99-20-110-B3'
union select 'THX 99-20-110-B4'
union select 'THX 99-20-110-B5'
union select 'THX 99-20-110-B6'
union select 'THX 99-20-110-B7'
union select 'Southern AA'
union select 'Southern CC'
union select 'Southern DD'
union select 'Southern EE'
),
Prefixes as
(
    Select 'TEXAS ' TestPrefix
    Union Select 'THX 99-20-110-B'
    Union Select 'E-'
    Union Select 'SOUTHERN B'
    Union Select 'TEST B-'
)
Select TN.TestNumber
From TestNumbers TN, Prefixes P
Where 1=1 
And TN.TestNumber Like '%' + P.TestPrefix + '%'
And Cast (REPLACE (Tn.TestNumber, p.TestPrefix, '') AS INTEGER) between 4 and 6

This will give you

TestNumber
----------------
E-4
E-5
E-6
SOUTHERN B4
SOUTHERN B5
SOUTHERN B6
TEST B-4
TEST B-5
TEST B-6
TEXAS 4
TEXAS 5
TEXAS 6
THX 99-20-110-B4
THX 99-20-110-B5
THX 99-20-110-B6

(15 row(s) affected)
若沐 2024-12-08 05:16:26

这是可以接受的:

WHERE [Name] IN ( 'TEST B-19', 'TEST B-20' )

值列表可以来自子查询,例如:

WHERE [Name] IN ( SELECT [Name] FROM Elsewhere WHERE ... )

Is this acceptable:

WHERE [Name] IN ( 'TEST B-19', 'TEST B-20' )

The list of values can come from a subquery, e.g.:

WHERE [Name] IN ( SELECT [Name] FROM Elsewhere WHERE ... )
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文