在数据库级别应用业务规则
我正在开展一个项目,在该项目中,我们需要确定大量人员的某些类型的状态,并将其存储在数据库中。用于确定这些状态的业务规则相当复杂并且可能会发生变化。
例如,
if a person is part of group X
and (if they have attribute O) has either attribute P or attribute Q,
or (if they don't have attribute O) has attribute P but not Q,
and don't have attribute R,
and aren't part of group Y (unless they also are part of group Z),
then status A is true.
乘以几十个状态以及可能数百个组和属性。人员、组和属性都在数据库中。
尽管这将由 Java 应用程序使用,但我们还希望能够直接针对数据库运行报告,因此最好在数据级别提供计算状态集。
那么,我们当前的设计计划是为每个人提供一个由一组布尔标志(hasStatusA?hasStatusB?hasStatusC?)组成的表或视图。这样,如果我想查询所有具有状态 C 的人,我不必知道计算状态 C 的所有规则;我只是检查标志。
(请注意,在现实生活中,标志将具有更有意义的名称:isEligibleForReview?、isPastDueForReview?等)。
那么 a) 这是一种合理的方法吗? b) 如果是的话,计算这些标志的最佳方法是什么?
我们正在考虑计算标志的一些选项:
将标志集作为一个视图,并使用 SQL 或 PL-SQL(这是 Oracle DB)从基础数据实时计算标志值。这样,值总是准确的,但性能可能会受到影响,并且规则必须由开发人员维护。
使标志集由静态数据组成,并使用某种类型的规则引擎在底层数据发生变化时使这些标志保持最新。这样可以更轻松地维护规则,但标记在给定时间点可能不准确。 (如果我们采用这种方法,是否有一个规则引擎可以通过这种方式轻松操作数据库中的数据?)
I'm working on a project in which we will need to determine certain types of statuses for a large body of people, stored in a database. The business rules for determining these statuses are fairly complex and may change.
For example,
if a person is part of group X
and (if they have attribute O) has either attribute P or attribute Q,
or (if they don't have attribute O) has attribute P but not Q,
and don't have attribute R,
and aren't part of group Y (unless they also are part of group Z),
then status A is true.
Multiply by several dozen statuses and possibly hundreds of groups and attributes. The people, groups, and attributes are all in the database.
Though this will be consumed by a Java app, we also want to be able to run reports directly against the database, so it would be best if the set of computed statuses were available at at the data level.
Our current design plan, then, is to have a table or view that consists of a set of boolean flags (hasStatusA? hasStatusB? hasStatusC?) for each person. This way, if I want to query for everyone who has status C, I don't have to know all of the rules for computing status C; I just check the flag.
(Note that, in real life, the flags will have more meaningful names: isEligibleForReview?, isPastDueForReview?, etc.).
So a) is this a reasonable approach, and b) if so, what's the best way to compute those flags?
Some options we're considering for computing flags:
Make the set of flags a view, and calculate the flag values from the underlying data in real time using SQL or PL-SQL (this is an Oracle DB). This way the values are always accurate, but performance may suffer, and the rules would have to be maintained by a developer.
Make the set of flags consist of static data, and use some type of rules engine to keep those flags up-to-date as the underlying data changes. This way the rules can be maintained more easily, but the flags could potentially be inaccurate at a given point in time. (If we go with this approach, is there a rules engine that can easily manipulate data within a database in this way?)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
在这种情况下,我建议应用沃德·坎宁安(Ward Cunningham)的问题——问自己“最简单可行的方法是什么?”。
在这种情况下,最简单的事情可能是提出一个视图,该视图查看存在的数据,并进行计算以生成您关心的所有字段。现在,加载您的数据库并尝试一下。够快吗?如果是这样,那就好——你做了最简单的事情,结果很好。如果它不够快,那么很好 - 第一次尝试没有成功,但是您已经在视图代码中制定了规则。现在您可以继续尝试“最简单的事情”的下一个迭代 - 也许您编写一个后台任务来监视插入和更新,然后跳入重新计算标志。如果有效的话,那就太好了。如果没有,则进入下一次迭代……依此类推。
分享并享受。
In a case like this I suggest applying Ward Cunningham's question- ask yourself "What's the simplest thing that could possibly work?".
In this case, the simplest thing might be to come up with a view that looks at the data as it exists and does the calculations and computations to produce all the fields you care about. Now, load up your database and try it out. Is it fast enough? If so, good - you did the simplest possible thing and it worked out fine. If it's NOT fast enough, good - the first attempt didn't work, but you've got the rules mapped out in the view code. Now you can go on to try the next iteration of "the simplest thing" - perhaps your write a background task that watches for inserts and updates and then jumps in to recompute the flags. If that works, fine and dandy. If not, go to the next iteration...and so on.
Share and enjoy.
我建议不要将状态设置为列名称,而是使用状态 ID 和值。例如包含 ID 和 Value 列的客户状态表。
我有两种更新状态的方法。一个存储过程要么具有所有逻辑,要么调用单独的存储过程来找出每个状态。您可以通过为每个状态评估提供一个函数来使所有这些动态化,然后一个存储过程可以调用每个函数。第二种方法是拥有更新用户信息的任何存储过程,调用存储过程来根据当前数据更新所有用户状态。这两种方法将允许您实时更新已更改的数据,并且如果添加新状态,您可以调用该方法以使用新逻辑更新所有状态。
希望您对用户数据有一点更新,例如用户更新存储过程,并且您可以将状态更新存储过程调用放在该过程中。这还可以节省每 n 秒安排一个任务来更新状态的时间。
I would advise against making the statuses as column names but rather use a status id and value. such as a customer status table with columns of ID and Value.
I would have two methods for updating statuses. One a stored procedure that either has all the logic or calls separate stored procs to figure out each status. you could make all this dynamic by having a function for each status evaluation, and the one stored proc could then call each function. The 2nd method would be to have whatever stored proc(s), that updates user info, call a stored proc to go update all the users statuses based upon the current data. These two methods would allow you to have both realtime updates for the data that changed and if you add a new status, you can call the method to update all statuses with new logic.
Hopefully you have one point of updates to the user data, such as a user update stored proc, and you can put the status update stored proc call in that procedure. This would also save having to schedule a task every n seconds to update statuses.
我考虑的一个选项是每个标志都由一个确定性函数支持,该函数返回给定相关数据的最新值。
但是,如果您一次为多行调用该函数(例如用于报告),该函数可能执行得不够好。因此,如果您使用的是 Oracle 11g,则可以通过添加 虚拟列(搜索“虚拟列”)根据函数关联到相关表。 结果缓存功能应该可以改进功能的表现也是如此。
An option I'd consider would be for each flag to be backed by a deterministic function that returns the up-to-date value given the relevant data.
The function might not perform well enough, however, if you're calling it for many rows at a time (e.g. for reporting). So, if you're on Oracle 11g, you can solve this by adding virtual columns (search for "virtual column") to the relevant tables based on the function. The Result Cache feature should improve the performance of the function as well.