Sql Server 中的按位与

发布于 2024-10-26 04:16:52 字数 923 浏览 2 评论 0原文

我有一个非常典型的情况。我们有一个名为 Users 的表,其中有一列名为 Branches (varchar 1000)。

该组织可以有 1000 个分支机构。因此,如果用户有权访问分支 1、5 和 10,分支字符串将如下所示:

1000100001000000000......

(即 1 表示用户根据分支编号有权访问分支的位置)。请不要建议更好的数据存储选项,这是我从跨洲部署的遗留应用程序中得到的。

现在考虑到这一背景(并考虑到可能有 > 10000 个用户),我想搜索有权访问任意一组给定分支的所有用户,例如查找有权访问分支 10、65 的所有用户, 90 或 125。

一个简单的解决方案是将所需的分支集(即 10、65、90、125)转换为分支字符串(00000010100 等),然后使用标量 UDF 迭代两个分支字符串并在以下位置返回 true第一个匹配出现,其中 2 个分支字符串有 1,如果公共位置没有 1,则为 false。

除此之外,我还可以选择在 C# 中的应用程序中进行搜索。其中一些用户具有特权(大约 1000 或更多),并且他们的数据缓存在应用程序中,因为访问非常频繁。但对于其他没有特权的用户来说,数据仅在数据库中。

我这里有两个问题: 1)对于数据库搜索,除了我提到的UDF方法之外,还有更好的方法吗? 2)对于特权用户,在性能方面会更好,在应用程序中搜索(这进一步可以基于分支字符串上的 for 循环,如 UDF 中,或者作为 2 个分支数组上的 Linq Intersect 运算符,即 Linq Intersect在 [1,5,9,50,80,200] 和 [6,90,256,300] 等) 数据库搜索或基于应用程序的搜索会产生更快的结果吗?

考虑在这两种情况下可能还有其他搜索参数,例如姓氏开头。

我当前的方法是首先根据其他参数(例如姓氏开头)过滤数据库中的这两种情况的行。然后使用标量 UDF 根据分支过滤此结果集,然后返回结果。

I have a very typical situation. We have a table called Users which has a column called Branches (varchar 1000).

The organization can have 1000 branches. So if a user has access to branch 1, 5, and 10, the branches string would look like:

1000100001000000000......

(i.e. 1 for a position a User has branch access to based on the branch's number). Please do not advise better data storage options, this is coming to me from a legacy application that is deployed across continents.

Now given this background (and considering that there can be > 10000 users), I want to search for all Users who have access to any one of given set of branches, e.g. Find all users who have access to either branch 10, 65, 90 or 125.

One easy solution is to convert the desired set of branches (i.e. 10, 65, 90, 125) to a branch string (00000010100 etc), then use a scalar UDF to iterate over both the branch strings and return true at first matching occurence where 2 branch strings have 1, and false if there is not a 1 at common position.

Other than that, I also have an option of searching in application in C#. Some of these users are privileged (approx 1000 or more) and their data is cached in application as it is accessed very frequently. But for other users that are not privileged, data is only in db.

I have 2 questions here:
1) For a db search, is there a better way other than the UDF approach I mentioned.
2) For privileged users, what would be better in terms of performance, search in application (which further can be based on a for loop on branch strings like in UDF, or as a Linq Intersect operator on 2 branch arrays, i.e. a Linq Intersect on [1,5,9,50,80,200] and [6,90,256,300] etc.)
Would a db search produce faster results or an application based search?

Consider there might be other parameters for search in both cases, e.g. Last name starts with.

My current approach is to filter rows in db for both situations first on other parameters (like Last name starts with). Then use a scalar UDF to filter this result-set based on branches and then return the results.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(6

蓝戈者 2024-11-02 04:17:05

更新

对于 1000 位数字来说,这不是一个可行的解决方案。如果选择较少的人看到这篇文章,我将保留它。


您可以对数据库架构进行任何更改吗?如果您可以添加一个计算列,其中包含 varchar 中二进制数的整数表示形式,那么您可以使用按位逻辑来完全快速地在数据库中选择您正在谈论的内容。

以下是我所讨论内容的一个示例:

with temp as
(
   select 1 as BranchNumber -- 1
   union 
   select 2 -- 01
   union
   select 5 -- 101
   union
   select 7 -- 111
   union
   select 15 as number -- 111
)

--Select users that belong to branch 2    
    SELECT * from temp
    where (BranchNumber & 2) = 2

--returns 2,7,15



--Select users that belong to at least branches 1,2 and 3    
    SELECT * from temp
    where (BranchNumber & 7) = 7

--returns 7,15

要将二进制转换为数字,您可能必须创建一个 UDF,您可以调用它来填充新列。我做了一些探索,发现这似乎是一个很好的起点。我还没有测试过,所以要小心:

SET NOCOUNT ON
CREATE TABLE #nums (pos bigint)
DECLARE @cntr int
SET @cntr = 0
WHILE @cntr < 63 BEGIN
   INSERT INTO #nums VALUES (@cntr)
   SET @cntr = @cntr + 1
END

DECLARE @binstring varchar(63)
SET @binstring = '10000010000000001000011000000000'

SELECT
   IntegerVal =
      sum(power(convert(bigint,2),pos)
      * substring(reverse(@binstring),pos+1,1)) -- yeah, implicit conversion
   FROM #nums

DROP TABLE #nums

UPDATE

This is not a feasible solution with a 1000 bit number. I will leave it up incase someone with a smaller number of options comes across this post.


Can you make any changes to the DB schema at all? If you can add a calculated column that holds an integer representation of the binary number you have in your varchar then you can use bitwise logic to select what you are talking about entirely in the DB and pretty quickly.

Here is an example of what I am talking about:

with temp as
(
   select 1 as BranchNumber -- 1
   union 
   select 2 -- 01
   union
   select 5 -- 101
   union
   select 7 -- 111
   union
   select 15 as number -- 111
)

--Select users that belong to branch 2    
    SELECT * from temp
    where (BranchNumber & 2) = 2

--returns 2,7,15



--Select users that belong to at least branches 1,2 and 3    
    SELECT * from temp
    where (BranchNumber & 7) = 7

--returns 7,15

To convert binary to a number you would probably have to create a UDF that you could call to populate the new column. I did a little poking around and found this which appears to be a good starting point. I have not tested it yet, so be careful:

SET NOCOUNT ON
CREATE TABLE #nums (pos bigint)
DECLARE @cntr int
SET @cntr = 0
WHILE @cntr < 63 BEGIN
   INSERT INTO #nums VALUES (@cntr)
   SET @cntr = @cntr + 1
END

DECLARE @binstring varchar(63)
SET @binstring = '10000010000000001000011000000000'

SELECT
   IntegerVal =
      sum(power(convert(bigint,2),pos)
      * substring(reverse(@binstring),pos+1,1)) -- yeah, implicit conversion
   FROM #nums

DROP TABLE #nums
茶花眉 2024-11-02 04:17:05

尽管 1000 位整数不存在,但按位查询的性能可能会优于字符串函数。但是,由于它可以从字符串中派生,因此您可以选择将其分解为特定数量的整数集并跨它们进行查询。您只需要了解每列的有效位,并通过特定常量适当调整输入,或者将其打入一组表示字符串的位并转换为 int。

The bitwise query would probably outperform string functions though the 1000 bit integer doesn't exist. However as it can be derived from the string you can choose to break it into a particular number of integer sets and query across them. You will simply need to understand the significant bits for each column and adjust the input by a particular constant appropriately or just whack it into a set of bit representing strings and convert to int.

懒的傷心 2024-11-02 04:17:04

您可能希望为 LIKE 查询构造 __1_1____1% 形式的字符串,以查找有权访问分支 3、5 和 10 的所有用户。

要构造这些字符串,最简单的方法是从由 _ 字符组成的字符串,其长度与集合中最大的分支编号(或更大)相同,然后将各个 _ 字符替换为 1 字符,然后在末尾附加 %。

至于这是否比在数据库中执行循环或在应用程序中执行循环更快,我认为最好的方法就是测试它。

You might want to construct strings of the form __1_1____1% for a LIKE query to find all users who have access to branches 3, 5, and 10.

To construct these strings the easiest way is to start with a string of _ characters that's as long as the largest branch number in your set (or larger), and then replace individual _ characters with 1 characters, and then append the % at the end.

As for whether this is faster than doing a loop in the database or a loop in your application, I think your best approach is to just test it.

终陌 2024-11-02 04:17:04

按位组成员资格:

从评论中,我假设我们无法使用链接表作为组成员资格。这是一个不使用字符串的按位解决方案。这不是一个可接受的答案,因为位数严重限制了组的数量。但是,通过使用具有显式值比较的整数,数据库可以有效地利用其索引。因此,我添加了它,以应对组/角色/任何内容的数量足够有限以适应的情况。
PS:请原谅二进制十进制混乱,我只是即时插入东西。如果我有任何错误,欢迎评论和指正。

每个组都分配了一个位:

G1: 0001
G2: 0010
G3: 0100
G4: 1000

用户的组成员资格通过按位&计算。以下是一些二进制和十进制等效值的示例:

U1: G1:             0001 (01)
U2: G2:             0010 (02)
U3: G3:             0100 (04)
U4: G4:             1000 (08)
U5: G1 & G2:        0011 (03)
U6: G2 & G3:        0110 (06)
U7: G1 & G3:        0101 (05)
U8: G2 & G4:        1010 (10)
U9: G1 & G2 & G4:   1011 (11)

现在,使用 1-N 的迭代(N 是组数)进行计算,并获取任何特定组可以贡献的所有可能整数值的列表。例如,G1 将出现在任何奇数中:

G1' : 0001 (01), 0011 (03), 0101 (05), 0111 (07), 1001 (09), 1011 (11), 1101 (13), 1111 (15)
G2' : 0010 (02), 0011 (03), 0110 (06), 0111 (07), 1010 (10), 1011 (11), 1110 (14), 1111 (15)
G3' : 0100 (04), 0101 (05), 0110 (06), 0111 (07), 1100 (12), 1101 (13), 1110 (14), 1111 (15)
G4' : 1000 (08), 1001 (09), 1010 (10), 1011 (11), 1100 (12), 1101 (13), 1110 (14), 1111 (15)

您可以通过从 1-1000 的循环以及该组的十进制值 1、2、4、8 等的按位 AND 来实现此目的。< br>
将值保留在内存中,或将它们推入存储组可能的会员身份的表中,例如possible_memberships

Get me users in G1:
   Q: select * from users where group_memberships in (1, 3, 5, 7, 9, 11, 13, 15);
   A: U1, U5, U7, U9

Get me users in G2:
   Q: select * from users where group_memberships in (2, 3, 6, 7, 10, 11, 14, 15);
   A: U2, U5, U6, U8, U9

如果您有一个包含“possible_memberships”列的组表,您可以将值放在那里,从而节省您的时间
不必通过网络发送所有值并允许将子选择缓存在数据库上。
p>

Get me users in G3:
   Q: select * from users where group_memberships in (select possible_memberships from groups where name = 'G3');
   A: U3, U7, U6

Bitwise Group Membership:

From the comments, I'm assuming that we are not able to use a link table for group memberships. Here is a bitwise solution that does not use strings. It cannot be an acceptable answer because the number of bits limits the number of groups quite severely. However, by using integers with explicit value comparisons, the database can make use of its indexes efficiently. So I've added it for the case where the number of groups / roles / whatever is limited enough to fit.
PS: Excuse an binary-decimal mess ups, I just plugged stuff in on the fly. Feel free to comment and correct, if I have any errors.

Each group is assigned a bit:

G1: 0001
G2: 0010
G3: 0100
G4: 1000

Users' group memberships are calculated with bitwise &. Here are some examples with the binary and decimal equivalents:

U1: G1:             0001 (01)
U2: G2:             0010 (02)
U3: G3:             0100 (04)
U4: G4:             1000 (08)
U5: G1 & G2:        0011 (03)
U6: G2 & G3:        0110 (06)
U7: G1 & G3:        0101 (05)
U8: G2 & G4:        1010 (10)
U9: G1 & G2 & G4:   1011 (11)

Now, calculate, using iteration from 1-N (N is number of groups) and get a list of all the possible integer values that any particular group can contribute to. For example, G1 will be present in any odd number:

G1' : 0001 (01), 0011 (03), 0101 (05), 0111 (07), 1001 (09), 1011 (11), 1101 (13), 1111 (15)
G2' : 0010 (02), 0011 (03), 0110 (06), 0111 (07), 1010 (10), 1011 (11), 1110 (14), 1111 (15)
G3' : 0100 (04), 0101 (05), 0110 (06), 0111 (07), 1100 (12), 1101 (13), 1110 (14), 1111 (15)
G4' : 1000 (08), 1001 (09), 1010 (10), 1011 (11), 1100 (12), 1101 (13), 1110 (14), 1111 (15)

You can do this with a loop from 1-1000 with a bitwise AND of the group's decimal value 1,2,4,8, etc.
Keep the values in memory, or push them into the table storing your groups' possible memerships e.g. possible_memberships.

Get me users in G1:
   Q: select * from users where group_memberships in (1, 3, 5, 7, 9, 11, 13, 15);
   A: U1, U5, U7, U9

Get me users in G2:
   Q: select * from users where group_memberships in (2, 3, 6, 7, 10, 11, 14, 15);
   A: U2, U5, U6, U8, U9

If you have a groups table with column 'possible_memberships', you can put the values in there, saving you
from having to send all the values over wire and allowing the subselect to be cached on the database.
p>

Get me users in G3:
   Q: select * from users where group_memberships in (select possible_memberships from groups where name = 'G3');
   A: U3, U7, U6
你与昨日 2024-11-02 04:17:04

使用 LIKE 查询。在 Sql Server 中,LIKE 表达式中使用的 _ 匹配任何单个字符。要获得分支 1,5 和 10 中的用户,您可以这样操作:

SELECT columns FROM Users WHERE BRANCHES LIKE '1___1____1%'

这不是特别有效(不是很可控制),但它应该可以工作,并且可能不会比您的 udf 选项差。

Use a LIKE query. In Sql Server, an _ used in a LIKE expression matches any single character. To get those users that are in branches 1,5, and 10 , you could to it like this:

SELECT columns FROM Users WHERE BRANCHES LIKE '1___1____1%'

This isn't particularly efficient (it's not very sargable), but it should work and it's likely no worse than your udf option.

星星的轨迹 2024-11-02 04:17:03

用 SQL 来做,只会比用 C# 或其他前端做快 100 倍。

使用内置数字表将长字符串分解为多个位置(数字系列最大为 2047)。

示例表

create table users (userid int)
insert users select 1 union all select 2

create table permission (userid int, bigstr varchar(1000))
insert permission
select 1, REPLICATE('0', 56) + '1' -- 57th
        + REPLICATE('0', 32) + '1' -- 90th
        + REPLICATE('0', 64) + '1' -- 155th
        + REPLICATE('0', 845)
insert permission
select 2, REPLICATE('0', 66) + '1' -- 67th
        + REPLICATE('0', 98) + '1' -- 166th
        + REPLICATE('0', 657) + '1' -- 824th
        + REPLICATE('0', 176)

显示列表中所有匹配权限的示例

select *
from users u
inner join permission p on p.userid=u.userid
inner join master..spt_values v on v.type='p'
  and SUBSTRING(p.bigstr,v.number,1) = '1'
  and v.number between 1 and LEN(p.bigstr)  -- or 1000 if it is always 1000
where v.number in (57,90,824)

查找有权访问列表中至少一个分支的用户:

select distinct u.userid
from users u
inner join permission p on p.userid=u.userid
inner join master..spt_values v on v.type='p'
  and SUBSTRING(p.bigstr,v.number,1) = '1'
  and v.number between 1 and LEN(p.bigstr)  -- or 1000 if it is always 1000
where v.number in (57,90,824)

等等。

Do it in SQL, it will be only 100 times faster than doing it in C# or other front end.

Use the built-in numbers table to break the long string into positions (number series goes up to 2047).

Sample tables

create table users (userid int)
insert users select 1 union all select 2

create table permission (userid int, bigstr varchar(1000))
insert permission
select 1, REPLICATE('0', 56) + '1' -- 57th
        + REPLICATE('0', 32) + '1' -- 90th
        + REPLICATE('0', 64) + '1' -- 155th
        + REPLICATE('0', 845)
insert permission
select 2, REPLICATE('0', 66) + '1' -- 67th
        + REPLICATE('0', 98) + '1' -- 166th
        + REPLICATE('0', 657) + '1' -- 824th
        + REPLICATE('0', 176)

Sample showing all the matching permissions against a list

select *
from users u
inner join permission p on p.userid=u.userid
inner join master..spt_values v on v.type='p'
  and SUBSTRING(p.bigstr,v.number,1) = '1'
  and v.number between 1 and LEN(p.bigstr)  -- or 1000 if it is always 1000
where v.number in (57,90,824)

To find users who have at access to at least one branch in the list:

select distinct u.userid
from users u
inner join permission p on p.userid=u.userid
inner join master..spt_values v on v.type='p'
  and SUBSTRING(p.bigstr,v.number,1) = '1'
  and v.number between 1 and LEN(p.bigstr)  -- or 1000 if it is always 1000
where v.number in (57,90,824)

etc..

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文