尝试从 IN 语句生成子集

发布于 2024-11-27 15:35:53 字数 1288 浏览 1 评论 0原文

我正在为用户编写一个解决方案,该解决方案将他们输入的电话号码列表与客户数据库相匹配。

用户需要输入以逗号分隔的电话号码(整数)列表,并且查询需要告诉用户列表中的哪些电话号码不在数据库中。

我想到的唯一方法是首先创建一个子集 NUMBER_LIST,其中包含我可以加入的所有电话号码,然后从我从客户数据库带回的列表中排除该列表。

WITH NUMBER_LIST AS (
    SELECT INTEGERS 
    FROM (
        SELECT level - 1 + 8000000000 INTEGERS
           FROM dual
        CONNECT BY level <= 8009999999-8000000000+1
    )
    WHERE INTEGERS IN (8001231001,8001231003,8001231234,8001231235,...up to 1000 phone numbers)
)

这里的问题是上面的代码可以很好地创建我的子集,对于 800-000-0000 和 800-999-9999 之间的数字。我的列表和客户数据库中的电话号码可以是任何范围(不仅仅是 800 个号码)。我这样做只是为了测试。从该查询生成子集大约需要 6 秒。如果我创建 CONNECT BY LEVEL 以包含从 100-000-0000 到 999-999-9999 的所有数字,那么运行我的查询就会耗尽内存来创建这么大的子集(而且我认为创建一个巨大的列表是荒谬的矫枉过正并使用我的 IN 语句将其分解)。

问题是创建初始子集。我可以处理其余的查询,但我需要能够生成数字子集,以便从 IN 语句针对我的客户数据库进行查询。

有几件事需要记住:

  1. 我无法首先将数字加载到临时表中。用户将自己输入“IN(...,...,...)”语句。
  2. 这需要是单个语句,没有额外的函数或变量声明
  3. 数据库是Oracle 10g,我正在使用SQL Developer来创建查询。
  4. 用户知道他们只能在 IN 语句中输入 1000 个数字。这需要足够强大才能从整个区号范围中选择任何 1000 个号码。
  5. 最终结果是获取数据库中没有的电话号码列表。简单的 NOT IN... 不起作用,因为这将返回哪些数字在数据库中,但不在我的列表中。

我如何才能使此功能适用于 1000000000-9999999999 之间的所有号码(或所有可能的美国 10 位电话号码)。我可能会完全错误地生成我最初的巨大列表,然后排除除 IN 语句之外的所有内容,但我不确定从这里到哪里去。

非常感谢您提前提供的帮助。我从你们所有人身上学到了很多东西。

I am writing a solution for a user that matches a list of phone numbers they enter against a customer database.

The user needs to enter a comma separated list of phone numbers (integers), and the query needs to tell the user which phone numbers from their list are NOT in the database.

The only way I could think to do this is by first creating a subset NUMBER_LIST that includes all of the phone numbers that I can join and then exclude that list from what I bring back from my customer database.

WITH NUMBER_LIST AS (
    SELECT INTEGERS 
    FROM (
        SELECT level - 1 + 8000000000 INTEGERS
           FROM dual
        CONNECT BY level <= 8009999999-8000000000+1
    )
    WHERE INTEGERS IN (8001231001,8001231003,8001231234,8001231235,...up to 1000 phone numbers)
)

The problem here is the above code works fine to create my subset, for numbers between 800-000-0000 and 800-999-9999. The phone numbers in my list and customer database can be ANY range (not just 800 numbers). I did this just as a test. It takes about 6 seconds to generate the subset from that query. If I create the CONNECT BY LEVEL to include all numbers from 100-000-0000 to 999-999-9999 that is running my query out of memory to create a subset that large (and I believe it is ridiculously overkill to create a huge list and break it down using my IN statement).

The problem is creating the initial subset. I can handle the rest of the query, but I need to be able to generate the subset of numbers to query against my customer database from my IN statement.

Few things to remember:

  1. I don't have the ability to load the numbers in a temporary table first. The user will be entering the "IN(...,...,...)" statement themselves.
  2. This needs to be a single statement, no extra functions or variable declarations
  3. The database is Oracle 10g, and I am using SQL Developer to create the query.
  4. The user understands that they can only enter 1000 numbers into the IN statement. This needs to be robust enough to select any 1000 numbers from the entire area code range.
  5. The end result is to get a list of phone numbers that ARE NOT in the database. A simple NOT IN... will not work, because that will bring back which numbers are in the database, but not in my list.

How can I make this work for all numbers between 1000000000-9999999999 (or all U.S. 10-digit phone number possibilities). I may be going about it completely wrong to generate my initial HUGE list and then excluding everything other than my IN statement, but I'm not sure where to go from here.

Thanks so much for your help in advance. I've learned so much from all of you.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

罪#恶を代价 2024-12-04 15:35:53

您可以使用以下命令:

SELECT *
  FROM (SELECT regexp_substr(&x, '[^,]+', 1, LEVEL) phone_number
           FROM dual
         CONNECT BY LEVEL <= length(&x) - length(REPLACE(&x, ',', '')) + 1)
 WHERE phone_number NOT IN (SELECT phone_table.phone_number 
                              FROM phone_table)

第一个查询将构建包含各个电话号码的列表。

You could use the following:

SELECT *
  FROM (SELECT regexp_substr(&x, '[^,]+', 1, LEVEL) phone_number
           FROM dual
         CONNECT BY LEVEL <= length(&x) - length(REPLACE(&x, ',', '')) + 1)
 WHERE phone_number NOT IN (SELECT phone_table.phone_number 
                              FROM phone_table)

The first query will build a list with the individual phone numbers.

太阳公公是暖光 2024-12-04 15:35:53

这个问题与“如何绑定列表中的”问题密切相关,该问题已经在这里出现过几次。我发布了一个答案 Dynamic query with HibernateCritera API & Oracle - 过去的性能

像这样的事情应该可以满足您的要求:

create table phone_nums (phone varchar2(10));

insert into phone_nums values ('12345');

insert into phone_nums values ('23456');

with bound_inlist
  as
  (
  select
    substr(txt,
           instr (txt, ',', 1, level  ) + 1,
           instr (txt, ',', 1, level+1) - instr (txt, ',', 1, level) -1 )
           as token
    from (select ','||:txt||',' txt from dual)
  connect by level <= length(:txt)-length(replace(:txt,',',''))+1
  )
  select *
from bound_inlist a
where  not exists (select null from phone_nums where phone = token); 

这里逗号分隔的电话号码列表已绑定到查询中,因此您可以正确使用绑定变量,并且您将能够输入可能无限数量的电话号码来一次性检查(尽管我会检查 4000 和 32767 个字符边界以确定)。

This problem is very closely related to the 'how do I bind an in list' problem, which has come up on here a few times. I posted an answer Dynamic query with HibernateCritera API & Oracle - performance in the past.

Something like this should do what you want:

create table phone_nums (phone varchar2(10));

insert into phone_nums values ('12345');

insert into phone_nums values ('23456');

with bound_inlist
  as
  (
  select
    substr(txt,
           instr (txt, ',', 1, level  ) + 1,
           instr (txt, ',', 1, level+1) - instr (txt, ',', 1, level) -1 )
           as token
    from (select ','||:txt||',' txt from dual)
  connect by level <= length(:txt)-length(replace(:txt,',',''))+1
  )
  select *
from bound_inlist a
where  not exists (select null from phone_nums where phone = token); 

Here the list of comma separated phone numbers is bound into the query, so you are using bind variables correctly, and you will be able to enter probably an unlimited number of phone numbers to check in one go (although I would check both the 4000 and 32767 character boundaries to be sure).

似狗非友 2024-12-04 15:35:53

你说你不能使用临时表或过程或自定义函数——如果可以的话,这将是一个简单的任务。

用于提交此查询的客户端工具是什么?是否有原因导致您无法从数据库中查询所有电话号码并在客户端上进行比较?

You say you can't use temp tables or procs or custom functions -- it would be a simple task if you could.

What's the client tool being used to submit this query? Is there a reason why you can't query all phone numbers from the database and do the compare on the client?

兮子 2024-12-04 15:35:53

如果您受到限制,必须使用 IN (n1,n2,n3,...,n1000) 来解决该问题,那么您的方法似乎是唯一的解决方案。

正如您所提到的,这是您预先创建的一个大列表。

你能稍微调整一下你的方法吗?

WITH NUMBER_LIST (number) AS (
            SELECT n1    FROM DUAL
  UNION ALL SELECT n2    FROM DUAL
  UNION ALL SELECT n3    FROM DUAL
  ...
  UNION ALL SELECT n1000 FROM DUAL
)

If you are constrained to the point that it MUST be solved with IN (n1,n2,n3,...,n1000), then your approach would appear to be the only solution.

As you mentioned though, that's a big list you're creating up front.

Are you able to adapt your approach slightly?

WITH NUMBER_LIST (number) AS (
            SELECT n1    FROM DUAL
  UNION ALL SELECT n2    FROM DUAL
  UNION ALL SELECT n3    FROM DUAL
  ...
  UNION ALL SELECT n1000 FROM DUAL
)
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文