复杂数据查看的替代方案?

发布于 2024-10-07 18:24:48 字数 1169 浏览 3 评论 0原文

我有一个表,其中包含对查看我的应用程序的用户没有用的部分数据。查看者希望看到一个类似的表,其中计算出所有值,并希望能够快速查询该数据。通常,这将是使用视图的完美位置。不幸的是,计算的复杂性限制了我对视图的使用,因此我需要一个替代解决方案。我正在考虑做类似以下的事情...

表 A 包含后端数据。每当更新此表时,都会触发更新表 B 的触发器,该表显示了这些计算的结果。此时可以快速查询表B。

我唯一担心的是计算过程有点冗长,而且表A有可能突发更新多次。是否有任何解决方案几乎类似于“选择之前”触发器?那么基本上A表可以连续更新很多次,但是只有在查询B表时才会进行计算?

下面是一个时间线示例:

  1. 表 A 已更新
  2. 表 A 再次更新 [不要运行 SP,因为还没有人需要表 B 数据]
  3. 从表 B 请求数据 [在获取数据之前运行 SP 以更新表 B]
  4. 表 A 已更新
  5. 数据从表 B 请求 [在获取数据之前运行 SP 以更新表 B]
  6. 从表 B 请求数据 [不要运行存储过程,因为表 A 在请求表 B 数据之间未更新]

所以我的问题是:

  1. 是否存在有什么像我上面描述的那样存在吗?
  2. 如果没有,是否有办法让触发器推迟直到完成一批更新/插入?表 A 不会经常修改,因此我可以忍受更新时的缓慢触发。唯一的问题是,当表 A 更新时,通常一次会包含数百行,我不希望每次都运行缓慢的触发器。

感谢您提供任何解决方案/见解!

编辑 - 关于为什么(至少我认为)触发器实现会很慢的更多细节:

  1. 将发送更新/插入/删除语句的应用程序正在使用 LINQ-to-SQL,这对于批处理来说不是特别好运营。因此,如果我想删除一堆记录,它会发送一堆删除语句而不是批量删除语句。有什么方法可以对删除语句进行分组并让触发器在之后运行吗? (也许我在这里离题太远了)。
  2. 我所说的“计算”涉及一些递归函数和一些决策过程。我实际处理的数据是调度数据。因此表 A 包含可能有也可能没有计划启动的任务。如果没有定义计划开始时间,则必须从其前一个计划开始时间+其前一个持续时间中派生出来。在某些情况下,前驱也可能没有该信息,因此递归查询会不断挖掘,直到找到结果。它并不是慢得令人痛苦,但如果它必须在每次插入/更新/删除时运行,它就会到达那里。我上面提到的“表B”基本上是同一个表,但它已经包含了计算出的预定启动数据(需要在表A更新时更改)。

I have a table that contains partial data that is of no use to the user viewing my application. The viewer wants to see a similar table that has all the values calculated out, and wants to be able to quickly query THAT data. Normally, this would be the perfect place to use a View. Unfortunately the complex nature of the calculations limit my use of Views, so I need an alternate solution. I was thinking of doing something like the following...

Table A contains back-end data. Any time this table is updated, a trigger fired that updates Table B, which shows the result of these calculations. Table B can be queried quickly at this point.

My only concern is that the calculation procedure is somewhat lengthy, and Table A has the potential to be updated many times in bursts. Are there any solutions that are almost like a "Before Select" trigger? So basically Table A could be updated many times in a row, but the calculation would only occur when Table B was queried?

Here's an example timeline:

  1. Table A is Updated
  2. Table A is Updated Again [Don't run SP since nobody needs Table B data yet]
  3. Data is requested from Table B [Run SP to update Table B before getting data]
  4. Table A is Updated
  5. Data is requested from Table B [Run SP to update Table B before getting data]
  6. Data is requested from Table B [Don't run stored procedure because Table A wasn't updated between requests for Table B data]

So my questions are:

  1. Is there anything existing like what I described above?
  2. If not, is there a way to make triggers hold off until the completion of a batch of updates/inserts? Table A won't be modified very often, so I can live with a slow trigger when it is updated. The only problem is that when Table A does get updated, it is normally hundreds of rows at a time and I don't want a slow trigger to run every time.

Thanks for any solutions/insight!

Edit - A little more detail on why (I think at least) the trigger implementation would be slow:

  1. The application that will be sending the Update/Insert/Delete statements is using LINQ-to-SQL, which isn't particularly good with batch operations. So if I wanted to delete a bunch of records, it would send a bunch of delete statements instead of a batch delete statement. Is there any way to group delete statements and have the trigger run after that? (Maybe I'm getting too far off topic here).
  2. The "calculations" I'm speaking of involve some recursive functions and some decision-making processes. The data I'm actually dealing with is scheduling data. So Table A contains tasks which may or may not have a scheduled start. If there is no scheduled start defined, it has to derive it from its predecessors scheduled start + its predecessors duration. In some cases the predecessor might not have that information either, hence the recursive query that keeps digging until it finds the result. It's not painfully slow, but if it had to run on every insert/update/delete, it would get there. The "Table B" I refer to above is basically the same table, but it already has the calculated scheduled start data within (which needs to change on update of Table A).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

哭了丶谁疼 2024-10-14 18:24:48

我认为物化视图能够解决这个问题,并且运行得相当快。然而,OP 没有说明“计算”是什么,只是说它们运行缓慢。因此,这是您最好的选择:

向主表 A 添加一个触发器,在该触发器中将受影响行的 ID 插入到新的“需要完成的工作”表中。由于计算尚未完成,因此开销非常低。

创建一个每 x 分钟运行一次的作业来进行工作计算并清除“需要完成的工作”表。这项工作将填充并保持同步您的表 B,其中包含最终答案。

仅通过存储过程调用访问“查询”,在此过程中,运行作业使用的处理(将确保表 B 中的数据是最新的),然后它将使用表 B 运行查询。

I would think a materialized view would be able to solve this, and run fairly fast. However the OP doesn't say what the "calculations" are and only that they run slow. As a result here is you best option:

Add a trigger to your main Table A, in that trigger insert the ID of the affected rows into a new "work needs to be done" table. This would have very low overhead, as the calculations are not done yet.

create a job that runs every x minutes to do the work calculations and clear the "work needs to be done" table. This job would would populate and keep in sync your Table B, which has the final answers.

give access to the "query" only via a stored procedure call, in this procedure, run the processing used by the job (will make sure that the data in Table B is up to date) then it will run the query using Table B.

离笑几人歌 2024-10-14 18:24:48

触发器对插入或更新的整批记录进行操作,而不是一次操作一行。

如果只有几百个新记录,计算花费的时间太长,那么计算本身可能需要进行性能调整。

如果您在触发器中进行计算,那么它们是每个事务的一部分,并且它们将在允许另一个事务之前完成,因此可以想象它会减慢插入速度。如果您正确编写触发器来处理数据集而不是逐行处理,那么它可能不会导致问题。对于大多数计算来说,数百行是微不足道的。如果您要批量插入数百万行,我会更关心触发器性能对其他进程的干扰。

您可以创建一个过程来进行计算,并安排它每十分钟左右运行一次,然后在需要报告时再次运行。这样,它可以提前完成大部分预先计算,并在报告时捕获最后几个新记录。

向我们提供计算示例和潜在触发因素,我们可以更好地为您提供帮助。

如果单个事务进入一组触发器运行,则无法将一组事务分组。如果您需要同时处理批次,也许您应该停止使用 LINQ 来执行此操作并发送 abatch 语句。如果您一次处理多个记录,基于集合的操作通常更可取。

Triggers operate on the entire batch of records that are inserted or updated, not one row at a time.

If the calculations are taking too long with a only a few hundred new records, then the calculation itself probably needs to be performance tuned.

If you do the calculations in a trigger, then they are part of each transaction and they will finish before another transaction is allowed, so it could conceivably slow down inserts. If you write the trigger properly to deal with sets of data and not row-by-agonizing-row, it probably will not cause an issue. Hundreds of rows are a trivial amount to do most calulations on. If you were inserting millions of rows in a batch, I would be more concerned about the trigger performance interfering with other processes.

You could create a proc to do the calculations and schedule it to run every ten minutes or so and then run again when the report is called for. This way it might do most of the precalcualting in advance and just catch the last couple of new records at the report time.

Give us a sample of the calculation and the potential trigger and we can better help you.

There is no way to group a bunch if individual transactions into one group trigger run. Perhaps you should stop using LINQ for this and and senda abatch stament if you need batches to process together. Set-based operations are usually preferable if you are working with more than one record at a time.

妳是的陽光 2024-10-14 18:24:48

如果可能(当然可能不适合您的具体情况),您可以通过存储过程调用从表 B 返回数据,而不是从视图/表中进行 SELECT。这样,您的过程仅在被专门调用时(即在表 A 更新后)才执行繁重的工作。

If possible (which of course it might not be in your specific situation), you could return your data from table B via a stored procedure call, rather than SELECTing from a view/table. That way your procedure does the heavy lifting only when it is specifically called (i.e. after Table A has been updated).

碍人泪离人颜 2024-10-14 18:24:48

这是已经讨论过的内容的一个更简单的变体...

如果您的两个表上都有 DeltaTs/EditTs 列,您可以:

通过存储过程返回查询数据,该存储过程将检查 max(Deltats) 与 max(deltats) ) 在两个表上,并根据需要更新表 b。更新表 A 后第一次选择时,您会受到一些影响,但之后会很好而且很快。

This is a simpler variation on what's already been discussed...

If you had DeltaTs/EditTs columns on both of your tables, you could:

Return the query data via a stored procedure, which would check the max(Deltats) vs max(deltats) on both tables and update table b as necessary. The first time you select after updating table A, you'd take a bit of a hit, but it would be nice and fast afterwards.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文