I'm using SQL UDFs to encapsulate simple reporting/business logic. Should I avoid this?

Posted on 2024-08-19 16:47:46

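I'm building a new database in SQL Server 2008 for some reporting, and there are many common business rules pertaining to this data that go into different types of reports. Currently these rules are mostly combined in larger procedural programs, written in a legacy language, which I'm trying to move over to SQL. I'm aiming for flexibility in how reports are implemented on top of this data: some reporting in SAS, some in C#, and so on.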

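My current approach is to break down these common rules (usually VERY simple logic) and encapsulate them in individual SQL UDFs. Performance is not a concern; I just want to use these rules to populate static fields in a kind of reporting "snapshot", which can then be reported from in any way you wish.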

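I like this modular approach as far as understanding what each rule does (and maintaining the rules themselves), but I'm also starting to get a little afraid that the maintenance may become a nightmare. Some rules depend on others, but I can't really get away from that: these things build on each other... which is what I want... I think? ;)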

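Are there better ways to achieve this kind of modularity in the database? Am I on the right track, or am I thinking about this with too much of an application-development mindset?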

独行侠 2024-08-26 16:47:46

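At some point, extensive use of UDFs will start causing performance problems, because scalar UDFs are executed for each row in your result set and hide their logic from the optimizer, which makes it hard to use indexes. (That is, I don't quite see how performance can't be an issue, but you know your requirements best.) For certain functionality they're great, but use them sparingly.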
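To illustrate the indexing point, here is a minimal T-SQL sketch. The `orders` table, its index, and the `dbo.fn_fiscalYear` rule (a fiscal year starting on 1 July) are hypothetical names invented for this example and do not come from the question. `GO` is the batch separator used by SSMS and sqlcmd, not a T-SQL statement.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE orders (
    order_id   INT      NOT NULL PRIMARY KEY,
    order_date DATETIME NOT NULL,
    amount     MONEY    NOT NULL
);
CREATE INDEX ix_orders_order_date ON orders (order_date);
GO

-- A simple rule wrapped in a scalar UDF (hypothetical):
-- the fiscal year is assumed to start on 1 July.
CREATE FUNCTION dbo.fn_fiscalYear (@d DATETIME) RETURNS INT
AS
BEGIN
    RETURN YEAR(DATEADD(MONTH, 6, @d));
END
GO

-- The UDF is evaluated for every row, and the index on order_date
-- cannot be used for a seek because the column is wrapped in the call.
SELECT order_id, amount
FROM   orders
WHERE  dbo.fn_fiscalYear(order_date) = 2009;

-- The same rule written as a range on the bare column:
-- the optimizer can now consider an index seek on order_date.
SELECT order_id, amount
FROM   orders
WHERE  order_date >= '20080701'
  AND  order_date <  '20090701';
```

Comparing the actual execution plans of the two queries on a reasonably large table is the easiest way to see whether this matters for a given workload.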

橙幽之幻 2024-08-26 16:47:46

Keeping the logic on the database side is almost always the right thing to do.

As you mentioned in your question, most business rules involve quite simple logic, but they usually deal with huge volumes of data.

The database engine is the right place to implement that logic because, first, it keeps data I/O to a minimum and, second, the database performs most data transformations much more efficiently.

Some time ago I wrote a very subjective blog post on this topic:

Schema Junk

One side note: a UDF is not the same as a stored procedure.

A UDF is a function designed to be callable inside a query, so it can perform only a very limited subset of operations (for instance, it cannot change data in permanent tables).

You can do much more in a stored procedure.
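As a rough sketch of that difference, the procedure below refreshes a reporting snapshot table, which is something a UDF is not allowed to do because it modifies data. The `report_user_snapshot` table and the `dbo.usp_refreshUserSnapshot` procedure are hypothetical names, and the sketch assumes a `users` table with `user_id` and `user_score` columns.

```sql
-- Hypothetical snapshot table; names are illustrative only.
CREATE TABLE report_user_snapshot (
    user_id      INT      NOT NULL PRIMARY KEY,
    user_score   INT      NULL,
    refreshed_at DATETIME NOT NULL
);
GO

-- A stored procedure may modify tables and control transactions.
CREATE PROCEDURE dbo.usp_refreshUserSnapshot
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRANSACTION;

    -- Throw away the previous snapshot.
    DELETE FROM report_user_snapshot;

    -- Rebuild it with a single set-based statement.
    INSERT INTO report_user_snapshot (user_id, user_score, refreshed_at)
    SELECT user_id, user_score, GETDATE()
    FROM   users;

    COMMIT TRANSACTION;
END
GO

-- Run the refresh, for example from a scheduled job.
EXEC dbo.usp_refreshUserSnapshot;
```

Error handling is left out to keep the sketch short; a real procedure would probably wrap the body in `TRY...CATCH` and roll back on failure.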

Update:

In the example you gave, such as changing the logic that calculates a "derived field", a UDF that calculates the field is OK.

But (just in case) when performance becomes an issue (and believe me, this will happen much sooner than one may think), transforming data with set-based operations may be much more efficient than using UDFs.

In this case, you may want to create a view, a stored procedure or a table-valued function that returns a result set and contains a more efficient query, rather than limiting yourself to updating the UDFs (which are record-based).

One example: your query has something like a "user score" which you feel is subject to change, so you wrap it into a UDF:

SELECT  user_id, dbo.fn_getUserScore(user_id) AS user_score
FROM    users

Initially, this is just a plain field in the table:

CREATE FUNCTION dbo.fn_getUserScore(@user_id INT) RETURNS INT
AS
BEGIN
        DECLARE @ret INT
        -- T-SQL assigns a variable with SELECT @var = ..., not SELECT ... INTO @var
        SELECT  @ret = user_score
        FROM    users
        WHERE   user_id = @user_id
        RETURN @ret
END

Later, you decide to calculate it using data from another table:

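-- ALTER rather than CREATE, because the function already exists
ALTER FUNCTION dbo.fn_getUserScore(@user_id INT) RETURNS INT
AS
BEGIN
        DECLARE @ret INT
        -- The score is now the sum of the user's votes (NULL if there are none)
        SELECT  @ret = SUM(vote)
        FROM    user_votes
        WHERE   user_id = @user_id
        RETURN @ret
END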

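In either case, the engine is condemned to evaluating the function once per row of users, which amounts to a NESTED LOOPS algorithm. The optimizer cannot see inside the UDF, so it cannot pick a different strategy even when nested loops are the least efficient choice for the data volume.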

But suppose you had created a view instead and rewritten its underlying query like this:

-- Initial version: the score is a plain column
SELECT  user_id, user_score
FROM    users

-- Later version: the score is aggregated from user_votes
SELECT  u.user_id, SUM(uv.vote) AS user_score
FROM    users u
LEFT JOIN
        user_votes uv
ON      uv.user_id = u.user_id
GROUP BY
        u.user_id

This would give the engine much more room for optimization while still keeping the result set structure and separating logic from presentation.
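A minimal sketch of how the view could be declared and then redefined without touching its callers follows. The names `dbo.v_userScore` and `dbo.tvf_userScore` are made up for illustration; the answer itself only shows the underlying queries. The inline table-valued function at the end is one possible form of the table-valued option mentioned earlier, not something the answer spells out.

```sql
-- Initial definition: the score is a plain column.
CREATE VIEW dbo.v_userScore
AS
SELECT user_id, user_score
FROM   users;
GO

-- Later definition: the score is aggregated from user_votes.
-- The view still exposes the same two columns.
ALTER VIEW dbo.v_userScore
AS
SELECT u.user_id, SUM(uv.vote) AS user_score
FROM   users u
LEFT JOIN user_votes uv
       ON uv.user_id = u.user_id
GROUP BY u.user_id;
GO

-- Reporting queries stay the same across both definitions.
SELECT user_id, user_score
FROM   dbo.v_userScore;
GO

-- Alternative: an inline table-valued function. SQL Server expands it
-- into the calling query much like a view, so the optimizer can still
-- choose the join strategy.
CREATE FUNCTION dbo.tvf_userScore (@user_id INT)
RETURNS TABLE
AS
RETURN
(
    SELECT SUM(vote) AS user_score
    FROM   user_votes
    WHERE  user_id = @user_id
);
GO

-- The aggregate always yields exactly one row, so CROSS APPLY keeps
-- users without votes (their user_score is NULL).
SELECT u.user_id, s.user_score
FROM   users u
CROSS APPLY dbo.tvf_userScore(u.user_id) AS s;
```

The view suits rules that apply to a whole table at once, while the inline function keeps the per-user call shape of the original UDF without making the rule opaque to the optimizer.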
