Querying records and grouping them by blocks of time

Posted 2024-11-07 16:24:56

I have an application that may be run several times a day. Each run results in data that is written to a table to report on events that occurred. The main report table looks something like this:

Id    SourceId    SourceType    DateCreated
5048  433         FILE          5/17/2011 9:14:12 AM
5049  346         FILE          5/17/2011 9:14:22 AM
5050  444         FILE          5/17/2011 9:14:51 AM
5051  279         FILE          5/17/2011 9:15:02 AM
5052  433         FILE          5/17/2011 12:34:12 AM
5053  346         FILE          5/17/2011 12:34:22 AM
5054  444         FILE          5/17/2011 12:34:51 AM
5055  279         FILE          5/17/2011 12:35:02 AM

I can tell that there were two runs, but I would like a way to query, for a date range, how many times the process was run. Ideally the query would return the time each run started and the number of files in that run. The query below sort of gets me there, in that I can see the day, the hour, and how many files were processed, but it is not exactly what I want. It would not accommodate a run that lasted from 8:58 to 9:04, for example, because that run would be split across two hours. It would also group together runs that started at 9:02 and 9:15.

Select dateadd(day,0,datediff(day,0,DateCreated)) as [Date], datepart(hour, DateCreated) as [Hour], Count(*) [File Count]
From   MyReportTable
Where DateCreated between '5/4/2011' and '5/18/2011'
    and SourceType = 'File'
Group By dateadd(day,0,datediff(day,0,DateCreated)), datepart(hour, DateCreated)
Order By dateadd(day,0,datediff(day,0,DateCreated)), datepart(hour, DateCreated)

I understand that any runs that are close together will likely get grouped together, and I'm fine with that. I only expect to get a rough grouping.

Thanks!

Comments (2)

演多会厌 2024-11-14 16:24:56

If you're certain these runs are contiguous and don't overlap, you should be able to use the Id field to break up your groups. Look for pairs of rows whose Id values are exactly 1 apart and whose DateCreated values are more than some threshold apart. From your data, it looks like records within a run are written within at most a minute of each other, so a safe threshold could be a minute or more.

This would get you your start times

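-- A row starts a run when the row with the previous Id is at least 60 seconds older.
-- Seconds are compared because DATEDIFF(minute, ...) counts minute boundaries crossed,
-- so 9:14:51 -> 9:15:02 would already count as 1 minute.
SELECT mrtB.Id, mrtB.DateCreated
FROM MyReportTable AS mrtA
INNER JOIN MyReportTable AS mrtB
    ON (mrtA.Id + 1) = mrtB.Id
WHERE DATEDIFF(second, mrtA.DateCreated, mrtB.DateCreated) >= 60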

I'll call that DataRunStarts

Now you can use that to get info about where the groups started and ended

SELECT drsA.Id AS StartID, drsA.DateCreated, MIN(drsB.Id) AS ExcludedEndId
FROM DataRunStarts AS drsA
INNER JOIN DataRunStarts AS drsB
    ON drsB.Id > drsA.Id
GROUP BY drsA.Id, drsA.DateCreated

I'll call that DataRunGroups. I called that last field "Excluded" because the id it holds is just going to be used to define the end boundary for the set of ids that will be pulled.

Now we can use DataRunGroups and MyReportTable to get the counts

SELECT DataRunGroups.StartID, COUNT(MyReportTable.Id) AS CountOfRecords
FROM DataRunGroups
INNER JOIN MyReportTable
    ON MyReportTable.Id >= DataRunGroups.StartID
   AND MyReportTable.Id < DataRunGroups.ExcludedEndId
GROUP BY DataRunGroups.StartID;

I'll call that DataRunCounts
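The pairing and counting steps can be sketched in Python to make the half-open range explicit. The ids below are hypothetical (three detected starts over ids 5048 to 5064), and both function names are made up for illustration.

```python
def pair_with_next_start(start_ids):
    """Pair each start id with the smallest later start id (the excluded end)."""
    ordered = sorted(start_ids)
    return list(zip(ordered, ordered[1:]))


def count_ids_in_range(ids, start_id, excluded_end_id):
    """Count ids in the half-open range [start_id, excluded_end_id)."""
    return sum(1 for i in ids if start_id <= i < excluded_end_id)


# Hypothetical data: ids 5048..5064 with run starts detected at 5052, 5056, 5061.
all_ids = range(5048, 5065)
start_ids = [5052, 5056, 5061]

groups = pair_with_next_start(start_ids)
counts = {start: count_ids_in_range(all_ids, start, end) for start, end in groups}
print(groups)
print(counts)
```

The last start (5061) gets no group because no later start exists to close its range, which matches the behavior of the join on `drsB.Id > drsA.Id`.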

Now we can put DataRunGroups and DataRunCounts together to get start times and counts.

SELECT DataRunGroups.DateCreated, DataRunCounts.CountOfRecords
FROM DataRunGroups
INNER JOIN DataRunCounts
    ON DataRunGroups.StartID = DataRunCounts.StartID;

Depending on your setup, you may need to do all of this in one query, but you get the idea. Also, the very first and very last runs won't be included: the first run's opening row has no preceding row to compare against, so it never appears in DataRunStarts, and the last run has no later start to serve as its end id. To include them, write queries for just those two ranges and union them with the old DataRunGroups query to create a new DataRunGroups. The other queries that use DataRunGroups would then work just as described above.
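For comparison, here is a sketch of the same grouping done in a single ordered pass, which picks up the first and last runs without any special cases. This is a different technique from the join-based queries above, not a translation of them: it orders by timestamp instead of relying on adjacent Ids, and `summarize_runs` is a hypothetical name. The sample times assume the second run happened at 12:34 PM.

```python
from datetime import datetime, timedelta


def summarize_runs(timestamps, threshold=timedelta(minutes=1)):
    """Return (start_time, file_count) per run; a gap >= threshold starts a new run."""
    runs = []
    previous = None
    for created in sorted(timestamps):
        if previous is None or created - previous >= threshold:
            runs.append([created, 0])  # open a new run at this timestamp
        runs[-1][1] += 1
        previous = created
    return [(start, count) for start, count in runs]


# Timestamps modeled on the question's table (second run assumed to be 12:34 PM).
timestamps = [
    datetime(2011, 5, 17, 9, 14, 12),
    datetime(2011, 5, 17, 9, 14, 22),
    datetime(2011, 5, 17, 9, 14, 51),
    datetime(2011, 5, 17, 9, 15, 2),
    datetime(2011, 5, 17, 12, 34, 12),
    datetime(2011, 5, 17, 12, 34, 22),
    datetime(2011, 5, 17, 12, 34, 51),
    datetime(2011, 5, 17, 12, 35, 2),
]

summary = summarize_runs(timestamps)
for start, count in summary:
    print(start, count)
```

With these rows the sketch reports two runs of four files each, starting at 9:14:12 and 12:34:12.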

浮生未歇 2024-11-14 16:24:56

Take the grouping a few steps further, down to the minute:

SELECT
    Count(Id), 
    DATEPART(year, DateCreated) As yr, 
    DATEPART(month, DateCreated) As mth, 
    DATEPART(day, DateCreated) As day, 
    DATEPART(Hour, DateCreated) as hr, 
    DATEPART(minute, DateCreated) as mnt
FROM 
    MyReportTable
WHERE DateCreated between '5/4/2011' and '5/18/2011'
    and SourceType = 'File'
GROUP BY 
    DATEPART(year, DateCreated), 
    DATEPART(month, DateCreated), 
    DATEPART(day, DateCreated), 
    DATEPART(Hour, DateCreated),
    DATEPART(minute, DateCreated)
ORDER BY 
    DATEPART(year, DateCreated),
    DATEPART(month, DateCreated), 
    DATEPART(day, DateCreated), 
    DATEPART(Hour, DateCreated),
    DATEPART(minute, DateCreated)

Edit

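To get a 15-minute resolution, change the last column in the SELECT, GROUP BY, and ORDER BY lists to the following, which integer division maps to 0 through 3: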

(DATEPART(minute, DateCreated)/15)

(add +1 to that in the select to get 1,2,3,4).
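A small Python sketch of the same quarter-hour bucketing shows the trade-off of fixed buckets. The function name `quarter_hour_bucket` is made up for illustration, and the timestamps are the first run from the question's table.

```python
from collections import Counter
from datetime import datetime


def quarter_hour_bucket(created):
    """Return (date, hour, quarter), where quarter is 1..4 within the hour."""
    return (created.date(), created.hour, created.minute // 15 + 1)


# The first run from the question's table.
timestamps = [
    datetime(2011, 5, 17, 9, 14, 12),
    datetime(2011, 5, 17, 9, 14, 22),
    datetime(2011, 5, 17, 9, 14, 51),
    datetime(2011, 5, 17, 9, 15, 2),
]

bucket_counts = Counter(quarter_hour_bucket(t) for t in timestamps)
for bucket, count in sorted(bucket_counts.items()):
    print(bucket, count)
```

This run straddles the 9:15 boundary, so three files land in quarter 1 and one in quarter 2. Any fixed bucket size can split a run this way, which is acceptable only if a rough grouping is enough.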
