甲骨文分析问题

发布于 2024-07-25 02:06:24 字数 916 浏览 8 评论 0原文

给定一个函数 zipdistance(zipfrom,zipto) ,它计算两个邮政编码和下表之间的距离(以英里为单位):

create table zips_required(
   zip varchar2(5)
);

create table zips_available(
   zip varchar2(5),
   locations number(100)
);

如何构造一个查询,该查询将从 zips_required 表中返回每个邮政编码以及将返回的最小距离产生 sum(locations) >= n。

到目前为止,我们只是对每个半径运行了详尽的循环查询,直到满足条件为止。

--Do this over and over incrementing the radius until the minimum requirement is met
select count(locations) 
from zips_required zr 
left join zips_available za on (zipdistance(zr.zip,za.zip)< 2) -- Where 2 is the radius

对于一个大列表来说,这可能需要一段时间。 感觉这可以通过 Oracle 分析查询来完成,大致如下:

min() over (
  partition by zips_required.zip 
  order by zipdistance( zips_required.zip, zips_available.zip)
  --range stuff here?
) 

我所做的唯一分析查询是基于“row_number over (partition by order by)”,并且我正在涉足未知领域。 非常感谢对此的任何指导。

Given a function zipdistance(zipfrom,zipto) which calculates the distance (in miles) between two zip codes and the following tables:

create table zips_required(
   zip varchar2(5)
);

create table zips_available(
   zip varchar2(5),
   locations number(100)
);

How can I construct a query that will return to me each zip code from the zips_required table and the minimum distance that would produce a sum(locations) >= n.

Up till now we've just run an exhaustive loop querying for each radius until we've met the criteria.

--Do this over and over incrementing the radius until the minimum requirement is met
select count(locations) 
from zips_required zr 
left join zips_available za on (zipdistance(zr.zip,za.zip)< 2) -- Where 2 is the radius

This can take a while on a large list. It feels like this could be done with an oracle analytic query along the lines of:

min() over (
  partition by zips_required.zip 
  order by zipdistance( zips_required.zip, zips_available.zip)
  --range stuff here?
) 

The only analytic queries I have done have been "row_number over (partition by order by)" based, and I'm treading into unknown areas with this. Any guidance on this is greatly appreciated.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

長街聽風 2024-08-01 02:06:24

这就是我想到的:

SELECT zr, min_distance
  FROM (SELECT zr, min_distance, cnt, 
               row_number() over(PARTITION BY zr ORDER BY min_distance) rnk
           FROM (SELECT zr.zip zr, zipdistance(zr.zip, za.zip) min_distance,
                         COUNT(za.locations) over(
                             PARTITION BY zr.zip 
                             ORDER BY zipdistance(zr.zip, za.zip)
                         ) cnt
                    FROM zips_required zr
                   CROSS JOIN zips_available za)
          WHERE cnt >= :N)
 WHERE rnk = 1
  1. 对于每个 zip_required 计算到 zip_available 的距离,并按距离对它们进行排序
  2. 对于每个 zip_requiredcountrange 可以让您知道该距离的半径内有多少个 zip_availables
  3. 过滤器(首先,其中 COUNT(locations) > N)

我用来创建示例数据:

INSERT INTO zips_required
   SELECT to_char(10000 + 100 * ROWNUM) FROM dual CONNECT BY LEVEL <= 5;

INSERT INTO zips_available
   (SELECT to_number(zip) + 10 * r, 100 - 10 * r FROM zips_required, (SELECT ROWNUM r FROM dual CONNECT BY LEVEL <= 9));

CREATE OR REPLACE FUNCTION zipdistance(zipfrom VARCHAR2,zipto VARCHAR2) RETURN NUMBER IS
BEGIN
   RETURN abs(to_number(zipfrom) - to_number(zipto));
END zipdistance;
/

注意:您在问题中使用了 COUNT(locations) 和 SUM(locations),我假设它是 COUNT(locations)

This is what I came up with :

SELECT zr, min_distance
  FROM (SELECT zr, min_distance, cnt, 
               row_number() over(PARTITION BY zr ORDER BY min_distance) rnk
           FROM (SELECT zr.zip zr, zipdistance(zr.zip, za.zip) min_distance,
                         COUNT(za.locations) over(
                             PARTITION BY zr.zip 
                             ORDER BY zipdistance(zr.zip, za.zip)
                         ) cnt
                    FROM zips_required zr
                   CROSS JOIN zips_available za)
          WHERE cnt >= :N)
 WHERE rnk = 1
  1. For each zip_required calculate the distance to the zip_available and sort them by distance
  2. For each zip_required the count with range allows you to know how many zip_availables are in the radius of that distance.
  3. filter (first where COUNT(locations) > N)

I used to create sample data:

INSERT INTO zips_required
   SELECT to_char(10000 + 100 * ROWNUM) FROM dual CONNECT BY LEVEL <= 5;

INSERT INTO zips_available
   (SELECT to_number(zip) + 10 * r, 100 - 10 * r FROM zips_required, (SELECT ROWNUM r FROM dual CONNECT BY LEVEL <= 9));

CREATE OR REPLACE FUNCTION zipdistance(zipfrom VARCHAR2,zipto VARCHAR2) RETURN NUMBER IS
BEGIN
   RETURN abs(to_number(zipfrom) - to_number(zipto));
END zipdistance;
/

Note: you used COUNT(locations) and SUM(locations) in your question, I assumed it was COUNT(locations)

意中人 2024-08-01 02:06:24
SELECT  *
FROM    (
        SELECT  zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY rn DESC) AS rn2
        FROM    (
                SELECT  zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY zd DESC) AS rn
                FROM    (
                        SELECT  zr.zip, zipdistance(zr.zip, za.zip) AS zd
                        FROM    zips_required zr
                        JOIN    zips_available za
                        )
                )
        WHERE   rn <= n
        )
WHERE   rn2 = 1

对于每个 zip_required,这将选择适合 N zip_available 的最小距离,或者如果 zip_available 的数量为最大距离 小于 N

SELECT  *
FROM    (
        SELECT  zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY rn DESC) AS rn2
        FROM    (
                SELECT  zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY zd DESC) AS rn
                FROM    (
                        SELECT  zr.zip, zipdistance(zr.zip, za.zip) AS zd
                        FROM    zips_required zr
                        JOIN    zips_available za
                        )
                )
        WHERE   rn <= n
        )
WHERE   rn2 = 1

For each zip_required, this will select the minimal distance into which fit N zip_available's, or maximal distance if the number of zip_available's is less than N.

烟花易冷人易散 2024-08-01 02:06:24

我解决了同样的问题,方法是在给定邮政编码的平方半径内创建邮政编码的子集(简单数学:<或> NSWE radius ),然后迭代子集中的每个条目以查看它是否在所需的半径内。 工作起来非常有魅力而且速度非常快。

I solved the same problem by creating a subset of ZIP's within a square radius from the given zip (easy math: < or > NSWE radius ), then iterating through each entry in the subset to see if it was within the needed radius. Worked like a charm and was very fast.

清醇 2024-08-01 02:06:24

我在我的一个旧项目中有部分类似的要求......计算美国两个邮政编码之间的距离。 为了解决这个问题,我充分利用了美国空间数据。 基本上,该方法是获取源邮政编码(纬度,经度)和目的地邮政编码(纬度,经度)。
现在我应用了一个函数来根据上述内容获取距离。 有助于进行此计算的基本公式可在 以下网站
我还通过参考 此站点...

注意:但是,这将提供大概的距离,因此人们可以相应地使用它。 优点是一旦构建就可以超快地获取结果。

I had partly similar requirements in one of my old projects... to calculate distance between 2 zipcodes in the US. To solve the same I had made great use of US Spatial Data. Basically the approach was to get the Source Zipcode(Latitude, Longitude) and Destination Zipcode(Latitude, Longitude).
Now then I had applied a function to get the distance based on the above. The base formula that helps in doing this calculation is available in the following site
I had also validated the outcome by referring to this site...

Note: However this will provide approximate distances, so one can use this accordingly. Benefits are once constructed its superfast to fetch the results.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文