复杂查询的视图或存储过程?
我有一个带有多个(嵌套)子查询的有点复杂的查询,我希望将其提供给应用程序开发人员。该查询是通用的,会生成一个视图,其中包含数据集集合上的计算值,并且开发人员预计只需要查询返回的一些记录(即,它们将限制某些实体的 ID 或日期范围或某些这样的)。
我可以看到 3 种实现此目的的方法:
- 让开发人员将查询嵌入到每个应用程序中,并根据需要添加自己的
WHERE
子句。 - 创建一个存储过程,该过程接受我期望开发人员需要的所有条件作为参数(为了论证,可以说我可以预测在可预见的未来将需要什么),并且该过程将运行复杂的查询并对其进行过滤根据传递的参数。
- 将查询实现为具有多个子视图的视图(因为 MySQL 不允许在视图中使用子查询),并让开发人员将其用作表并使用
WHERE
让每个应用程序应用他们的过滤器需要。目前我正在查看 3 个额外的子视图,主要是因为某些子查询被多次使用,并将它们作为子视图来防止重复 - 否则情况可能会更糟;-)。
什么会带来更好的性能? (假设所有索引在所有情况下都是等效的)如果可以的话,考虑最坏的情况。
您认为在代码维护方面什么会更好?
I have a somewhat complex query with multiple (nested) sub-queries, which I want to make available for the applications developers. The query is generic and generates a view with computed values over a collection of data sets, and the developer is expected to need only some records from what the query returns (i.e. they will limit the result for some entity's ID or a date range or some such).
I can see 3 ways to implement this:
- Let the developers embed the query into each application and add their own
WHERE
clauses as needed. - Create a stored procedure that accepts as parameters all the conditions I expect developers to need (for the sake of the argument lets say that I can predict what will be needed for the foreseeable future), and the procedure will run the complex query and filter it according to the parameters passed.
- Implement the query as a view with several sub views (because MySQL doesn't allow sub-queries in views) and have the developers use this as a table and use
WHERE
to have each application applies the filters they need. Currently I'm looking at 3 additional sub-views, mostly because some sub-queries are used multiple times and doing them as sub-views prevents duplication - otherwise it could have been worse ;-).
What will be better performance wise? (assuming all indexing is equivalent in all cases) Go for worst case scenarios, if you may.
what will be better in code maintenance terms, do you think?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我喜欢定义“好”的问题 - 您特别询问了性能和可维护性,这使得答案可以讨论这种权衡。
从性能的角度来看,我认为这 3 个选项之间可能没有任何区别,只要查询和数据符合您的预期场景。我会使用 100 倍以上的数据进行测试,并可能扩大“where”子句以查看会发生什么,但是索引结构等更有可能影响性能,而不是通过存储过程从存储过程执行相同的 SQL视图,或来自客户端应用程序。
回答这个问题的最好方法就是测试它——当然,有许多具体细节可能会使我们溢出者给出的一般“我期望 x、y 或 z”答案无效。如果性能是一个关键问题,请使用数据库填充工具(Redgate make on,我过去使用过 DBMonster)并尝试所有 3 个选项。
从维护的角度来看,我会提供选项 4,在我看来,它是迄今为止最好的。
选项 4:构建一个数据访问库,封装对数据的访问。让库公开方法和参数来优化记录的选择。考虑使用规范模式 (http://en.wikipedia.org/wiki/Specification_pattern)。使用库中最好的任何查询,并且不要用实现细节来打扰开发人员。
如果这不起作用 - 异构应用程序代码,对于简单的需求来说改变太多 - 我会评估选项如下:
嵌入式 SQL:取决于此 SQL 重复使用的次数,这可能没问题。如果只有一部分代码运行 SQL,那么它在逻辑上类似于数据访问库。然而,如果相同的代码片段需要在很多地方重复使用,那么它可能是错误的来源——SQL 中的一个小变化就需要在多个地方重复。
存储过程:出于维护原因,我通常不喜欢存储过程 - 它们往往会因过载而变得脆弱,并创建一种程序化的思维方式。例如,如果您有其他要求在单独的存储过程中使用此 SQL 计算,那么您很快就会得到一个过程编程模型,其中存储过程相互调用。
视图:这可能是最好的选择。它将特定的数据逻辑放在一个地方,但提倡使用基于集合的逻辑,因为访问路径是通过 SELECT 语句,而不是通过执行过程语句。视图很容易合并到其他查询中。
I like questions that define "good" - you've specifically asked about performance and maintainability, which allows answers to talk about that trade-off.
From a performance point of view, I don't think there's likely to be any difference between the 3 options, as long as the queries and data fit within your expected scenarios. I'd test with 100 times more data, and potentially widening the "where" clause to see what happens, but the indexing structure etc. is more likely to affect the performance than whether you execute the same SQL from a stored proc, through a view, or from a client application.
The best way to answer that question is to test it - there are, of course, many specific details that could invalidate the general "I'd expect x, y, or z" answers we overflowers can give. If performance is a critical concern, use a database filling tool (Redgate make on, I've used DBMonster in the past) and try all 3 options.
From a maintenance point of, view, I'd provide an option 4, which - in my view - is by far the best.
Option 4: build a data access library which encapsulates access to your data. Have the library expose methods and parameters to refine the selection of records. Consider using the specification pattern (http://en.wikipedia.org/wiki/Specification_pattern). Use whatever queries are best inside the library, and don't bother the developers with the implementation details.
If that doesn't work - heterogeneous application code, too much of a change for a simple requirement - I'd evaluate the options as follows:
Embedded SQL: depending on the number of times this SQL is re-used, this may be okay. If there's only one part of the code that runs the SQL, it's logically similar to the data access library. If, however, the same snippet needs to get re-used in lots of places, it's a likely source for bugs - a small change in the SQL would need to be repeated in several places.
Stored procedure: I generally dislike stored procedures for maintenance reasons - they tend to become brittle by over-loading, and create a procedural way of thinking. For instance, if you have other requirements for using this SQL calculation in a separate stored procedure, very quickly you end up with a procedural programming model, with stored procs calling each other.
Views: this is probably the best choice. It puts the specific data logic in a single place, but promotes the use of set-based logic because the access route is through a SELECT statement, rather than by executing a procedural statements. Views are easy to incorporate into other queries.
如果实施得当,这三个解决方案中的任何一个都适合维护,但请记住在迁移过程(代码或数据库迁移)中如何对待它们中的每一个。
如果查询很大,存储过程会为您提供一些额外的性能,因为它发送的查询较小,因此带宽开销较小。通过此解决方案,您还可以获得一点额外的安全性。
对于维护解决方案,我更喜欢第一个和第二个解决方案,因为您可以对查询进行任何更改,而无需进行任何数据库更改。如果您选择第一个解决方案,我会将查询调用包装在一个函数中,这样您就只有一个地方可以进行更改。
从开发人员的角度来看,我会选择视图解决方案,因为它是最透明的解决方案,我的意思是它就像查询常规表一样,您可以使用描述命令检查表结构,或者只需选择您需要查询的字段和条件,或与另一个表联接等...
关于where 子句灵活性,您可以使用任何建议的解决方案来实现它。您可以在包装函数中添加一个 where 参数 (1),您可以向存储过程添加一个 where 参数,但要小心注入 (2),或者开发人员可以像往常一样使用视图添加一个 where 子句 (3
)请记住,在 MySQL 中,视图不是临时表,如果查询非常复杂,如果查询被大量使用并且以不同的方式使用(禁用缓存性能提升),那么该解决方案将不是最好的。 我会考虑一个临时表解决方案(计数器表),它使用编程任务/cron(例如一天、一周、只要需要)更新每个时间段,或者通过设置适当的触发器进行更新。这个解决方案可以大大提高性能。
希望这有帮助,我最喜欢视图解决方案,但从数据库的角度来看,它的开发可能更复杂。
If well implemented, any of the three solutions would be fine for manteinance, but bear in mind how would you treat each of them in a migration process (code or database migration).
If the query is big, the stored procedure will give you a bit of extra performance due to less bandwith overhead because it's sending a smaller sized query. You may also gain a little extra security with this solution.
For a manteinance solution, I would prefer the 1st and 2nd solution, coz you can make any changes on the query without doing any database changes. If you choose the 1st solution, I would wrap the query call within a function so you'll have only one place to make changes.
From a developer point of view, I would choose the view solution beacuse is the most transparent one, I mean it's like querying just a regular table, you can check table structure with a describe command, or just select the fields and conditions you need to query, or join with another table, etc...
About the where clause flexibility, you can achieve it with any of the proposed solutions. You can add a where parameter in your wrapping function (1), you can add a where parameter to the stored procedure but be cautious with injections (2), or the developer can add a where clause as usual with the view (3)
Having in mind that in MySQL views are not temporary tables, if the query is very complex this solutions wouldn't be the best if the query is used a lot and in different ways (disabling cache performance boost). I would consider a temporary table solution (counter table) that updates each time period with a programmed task / cron (for example a day, a week, whenever needed) or gets updated by setting the propper triggers. This solution could improve performance quite a bit.
Hope this helps, I like the view solution the most but maybe it's more complex to develope from a database point of view.