Grouping similar records in SQL Server 2008

Posted on 2024-08-09 06:19:40

I need to write a query that does the following:

TableName: Application

AppId (PK,int) AppType (bit)
1               0
2               0
3               0  
4               0
5               1
6               0
7               0
8               0
9               1
10              1
11              0
12              0
13              1
14              0
15              1

I have to group AppId values sequentially by AppType and create batches of records. The important point is that the batches must preserve the AppId sequence, so a batch may only contain consecutive rows with the same AppType. The maximum number of records in a batch is set by a batch size parameter (say 3 for now). Once the batches are created, their details are inserted into a separate table called ApplicationBatch. So I want output like this:

TableName: ApplicationBatch

BatchId  MinAppId  MaxAppId  AppType
001      1         3         0
002      4         4         0
003      5         5         1
004      6         8         0
005      9         10        1
006      11        12        0
007      13        13        1
008      14        14        0
009      15        15        1

One more thing: the query has to be as efficient as possible, because the Application table can hold more than a million records.

Update:

Currently I have the Application table (defined in my original question above), and I want to populate the ApplicationBatch table from its data.

Comments (3)

杀お生予夺 2024-08-16 06:19:40

Try this. It's fairly complicated, but it works. I haven't tested it on that many rows, but it makes only a single pass over the rows in AppId order.

First you have to do some preprocessing.

-- create temporary table (the primary key on AppId lets each TOP 1 lookup seek instead of scan)
CREATE TABLE #tmpApp(AppId INT PRIMARY KEY, AppType BIT, BatchId INT)

INSERT INTO #tmpApp(AppId,AppType)
SELECT AppId, AppType FROM Application

-- declare variables
DECLARE @curId INT
DECLARE @oldCurId INT

DECLARE @appType INT
DECLARE @oldAppType INT

DECLARE @batchNo INT
DECLARE @itemsInBatch INT

SET @oldCurId = 0
SET @batchNo = 1
SET @itemsInBatch = 0

SELECT TOP 1 
    @curId = AppId, 
    @appType = AppType 
FROM #tmpApp
WHERE AppId > @oldCurId
ORDER BY AppId 

WHILE @curId IS NOT NULL
BEGIN
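    -- start a new batch when AppType changes or the current batch is full (batch size 3);
    -- on the first row @oldAppType is NULL, so the comparison does not fire
    IF @oldAppType <> @appType OR @itemsInBatch >= 3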
    BEGIN
        SET @batchNo = @batchNo + 1
        SET @itemsInBatch = 0
    END

    SET @itemsInBatch = @itemsInBatch + 1

    UPDATE #tmpApp 
    SET BatchId = @batchNo 
    WHERE AppId = @curId

    SET @oldCurId = @curId
    SET @oldAppType = @appType

    SET @curId = NULL
    SELECT TOP 1 
        @curId = AppId,
        @appType = AppType  
    FROM #tmpApp
    WHERE AppId > @oldCurId
    ORDER BY AppId 
END

And execute the query:

-- the final query
SELECT 
    BatchId, 
    MIN(AppId) AS MinAppId, 
    MAX(AppId) AS MaxAppId, 
    AppType
FROM #tmpApp
GROUP BY BatchId, AppType
ORDER BY BatchId
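
For comparison, the loop above can probably be replaced by a single set-based statement, which tends to scale better on large tables. The following is a minimal sketch, not a tested production query: it uses only ROW_NUMBER() and CTEs, which SQL Server 2008 supports. The table variables @Application and @ApplicationBatch, the @BatchSize variable, and the column types of the batch table are assumptions made so the script runs on its own. BatchId is kept as an INT here, so zero-padding such as 001 is left to the presentation layer.

```sql
-- Sketch only: table variables stand in for the real Application and
-- ApplicationBatch tables; the ApplicationBatch column types are assumed.
DECLARE @BatchSize INT = 3;

DECLARE @Application TABLE (AppId INT PRIMARY KEY, AppType BIT NOT NULL);
INSERT INTO @Application (AppId, AppType) VALUES
    (1,0),(2,0),(3,0),(4,0),(5,1),(6,0),(7,0),(8,0),
    (9,1),(10,1),(11,0),(12,0),(13,1),(14,0),(15,1);

DECLARE @ApplicationBatch TABLE
    (BatchId INT PRIMARY KEY, MinAppId INT, MaxAppId INT, AppType BIT);

WITH Runs AS (
    -- Consecutive rows with the same AppType get the same RunKey
    -- (difference of two row numbers, the "gaps and islands" trick).
    SELECT AppId, AppType,
           ROW_NUMBER() OVER (ORDER BY AppId)
         - ROW_NUMBER() OVER (PARTITION BY AppType ORDER BY AppId) AS RunKey
    FROM @Application
),
Chunks AS (
    -- Split every run into pieces of at most @BatchSize rows.
    SELECT AppId, AppType, RunKey,
           (ROW_NUMBER() OVER (PARTITION BY AppType, RunKey ORDER BY AppId) - 1)
               / @BatchSize AS ChunkNo
    FROM Runs
)
INSERT INTO @ApplicationBatch (BatchId, MinAppId, MaxAppId, AppType)
SELECT ROW_NUMBER() OVER (ORDER BY MIN(AppId)),  -- number batches in AppId order
       MIN(AppId), MAX(AppId), AppType
FROM Chunks
GROUP BY AppType, RunKey, ChunkNo;

SELECT BatchId, MinAppId, MaxAppId, AppType
FROM @ApplicationBatch
ORDER BY BatchId;
```

With the sample rows from the question this yields the nine batches listed there (1-3, 4-4, 5-5, 6-8, 9-10, 11-12, 13-13, 14-14, 15-15). Against the real tables, the clustered primary key on AppId should let both row-number orderings read the rows in key order, but the actual plan is worth checking on realistic data volumes.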

柏拉图鍀咏恒 2024-08-16 06:19:40

The question is not clear, but here is what I understand.

Do you want a sequence for the applications that depends on the number of rows inserted into another table?

See you.

聚集的泪 2024-08-16 06:19:40

It's hard to do this without cursors or SQLCLR. Would you consider writing a table-valued function in C# and embedding the assembly in SQL Server (SQLCLR)?
That's what I would do, and then I would use a while loop to process the records sequentially.
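
The cursor route mentioned above can be sketched in plain T-SQL as well. This is only an illustration under assumptions: the table variables @Application and @ApplicationBatch stand in for the real tables, the batch table's column types are guessed, and @BatchSize and the cursor name app_cursor are made up for the example. It reads the rows once in AppId order and writes one row per finished batch.

```sql
-- Sketch only: table variables stand in for the real Application and
-- ApplicationBatch tables; the ApplicationBatch column types are assumed.
DECLARE @BatchSize INT = 3;

DECLARE @Application TABLE (AppId INT PRIMARY KEY, AppType BIT NOT NULL);
INSERT INTO @Application (AppId, AppType) VALUES
    (1,0),(2,0),(3,0),(4,0),(5,1),(6,0),(7,0),(8,0),
    (9,1),(10,1),(11,0),(12,0),(13,1),(14,0),(15,1);

DECLARE @ApplicationBatch TABLE
    (BatchId INT PRIMARY KEY, MinAppId INT, MaxAppId INT, AppType BIT);

DECLARE @AppId INT, @AppType BIT;            -- current row
DECLARE @BatchId INT = 0, @MinAppId INT, @MaxAppId INT,
        @BatchType BIT, @Count INT = 0;      -- batch being built

DECLARE app_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT AppId, AppType FROM @Application ORDER BY AppId;

OPEN app_cursor;
FETCH NEXT FROM app_cursor INTO @AppId, @AppType;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Close the open batch when the type changes or the batch is full.
    IF @Count > 0 AND (@AppType <> @BatchType OR @Count >= @BatchSize)
    BEGIN
        SET @BatchId = @BatchId + 1;
        INSERT INTO @ApplicationBatch (BatchId, MinAppId, MaxAppId, AppType)
        VALUES (@BatchId, @MinAppId, @MaxAppId, @BatchType);
        SET @Count = 0;
    END

    -- The first row of a new batch fixes its lower bound and its type.
    IF @Count = 0
        SELECT @MinAppId = @AppId, @BatchType = @AppType;

    SELECT @MaxAppId = @AppId, @Count = @Count + 1;

    FETCH NEXT FROM app_cursor INTO @AppId, @AppType;
END

-- Flush the last batch, which is still open when the cursor runs out.
IF @Count > 0
BEGIN
    SET @BatchId = @BatchId + 1;
    INSERT INTO @ApplicationBatch (BatchId, MinAppId, MaxAppId, AppType)
    VALUES (@BatchId, @MinAppId, @MaxAppId, @BatchType);
END

CLOSE app_cursor;
DEALLOCATE app_cursor;

SELECT BatchId, MinAppId, MaxAppId, AppType
FROM @ApplicationBatch
ORDER BY BatchId;
```

Compared with updating a temp table once per row, this version keeps the running batch in variables and only touches the batch table when a batch is complete, which may matter with a million input rows.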
