Should I design a SQL Server database to rely on UNION or avoid it?
Take the following SQL query:
SELECT Account, Amount AS Deposit, 0.00 AS Withdrawal, Date
FROM Deposits
WHERE Account = @Account
UNION
SELECT Account, 0 AS Expr1, Amount, Date
FROM Withdrawals
WHERE Account = @Account
ORDER BY Date Desc
As opposed to:
SELECT Account, TransactionType, Amount, Date
FROM Transactions
WHERE Account = @Account
ORDER BY Date Desc
The first query has deposits and withdrawals stored in separate tables; whenever they need to be seen together, such as when showing a bank statement, they're UNIONed together. The second query has all transactions, both deposits and withdrawals, stored in one table, with the transaction type differentiated by the TransactionType column. The query results produced are not exactly the same, but let's say the client is happy with either as the final output.
Is there any performance reason why setting up the database using the first schema is better or worse than the second schema?
EDIT: Just for the sake of clarity, the reason I'm asking is to find out whether there's a performance benefit in keeping tables smaller and using a UNION when necessary, compared with having one large table where I would use a WHERE clause to find different types of data. I used the above example to better express my question. Advice on which schema is better for other reasons is welcome, but please try to answer the performance question as well.
Comments (8)
Transactions seems more appropriate, as that's what they are. While it's less obvious, differentiating deposits from withdrawals via tables is no more sensible than differentiating cash transactions from check transactions via tables.
To be honest, I don't think performance should be your primary concern - and in the example you give, performance is unlikely to be measurably different.
However, imagine you want to calculate the balance of the account at any point in time. If you have a "transactions" table, you can achieve this in a single query.
If you split into two tables, you need to execute two queries; broadly speaking, I would expect this to take twice as long as querying a single table, even if that table has the same number of records as the two other tables put together.
I think you should concentrate on what represents the business domain best. If you look at an old-fashioned ledger, I think you'll find both credits and debits are entered in the same column. Your business stakeholders are more likely to think in terms of a "transaction" (positive or negative) than in terms of separate concepts for deposits and withdrawals.
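To make the balance comparison concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere. The sample rows, the signed-amount convention (deposits positive, withdrawals negative) and the function names are assumptions for illustration, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Single-table schema: signed amounts (assumed convention).
    CREATE TABLE Transactions (Account INTEGER, Amount INTEGER, Date TEXT);
    INSERT INTO Transactions VALUES
        (1, 100, '2024-01-01'), (1, -40, '2024-01-05'), (1, 25, '2024-01-10');

    -- Two-table schema: the same events with unsigned amounts.
    CREATE TABLE Deposits    (Account INTEGER, Amount INTEGER, Date TEXT);
    CREATE TABLE Withdrawals (Account INTEGER, Amount INTEGER, Date TEXT);
    INSERT INTO Deposits    VALUES (1, 100, '2024-01-01'), (1, 25, '2024-01-10');
    INSERT INTO Withdrawals VALUES (1, 40, '2024-01-05');
""")


def balance_single_table(account, as_of):
    """One query: sum the signed amounts up to the given date."""
    row = conn.execute(
        "SELECT COALESCE(SUM(Amount), 0) FROM Transactions "
        "WHERE Account = ? AND Date <= ?",
        (account, as_of),
    ).fetchone()
    return row[0]


def balance_two_tables(account, as_of):
    """Two queries: total each table separately, then subtract."""
    totals = {}
    for table in ("Deposits", "Withdrawals"):
        totals[table] = conn.execute(
            f"SELECT COALESCE(SUM(Amount), 0) FROM {table} "
            "WHERE Account = ? AND Date <= ?",
            (account, as_of),
        ).fetchone()[0]
    return totals["Deposits"] - totals["Withdrawals"]


print(balance_single_table(1, "2024-01-07"))  # 60
print(balance_two_tables(1, "2024-01-07"))    # 60
```

Both versions give the same balance. The sketch only shows the difference in query shape; it says nothing about relative speed, which would have to be measured on the real data and indexes.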
I would go for the Transaction table. This makes it much easier to implement a Transaction base class in your application.
If you later realize that you need custom WITHDRAWAL and DEPOSIT columns then you can create subclass tables containing those columns.
From the viewpoint of the accounting department, deposits and withdrawals are both transactions with unsigned amounts. The difference is the direction of the cash flow. That leans towards a single transaction table rather than separate deposit and withdrawal tables.
From an E-R/data modelling perspective, if we start with the above, then the question is: do deposits and withdrawals share the same set of attributes? If they do, then you've got a strong indication that they are, in fact, the same entity. If they have differing attributes, then you might be looking at different subtypes: perhaps a parent transaction table holding the common attributes and a subtable for each different type of transaction.
In this model, the base transaction table has a one-to-zero-or-one relationship with each of its subtype tables. Each transaction exists in the base table, and in only one of the subtype tables.
Now you get the best of both worlds. You can deal with transactions as a whole (for instance, figuring out account balance) simply with a single select.
This is more important if you can see the number of transaction types increasing.
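The supertype/subtype layout described above might look like the following sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server. The table names DepositDetails and WithdrawalDetails, the Source and AtmId columns, and the Direction column (+1 for money in, -1 for money out) are all hypothetical, chosen only to illustrate unsigned amounts with a cash-flow direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base table: attributes common to every transaction.
    CREATE TABLE Transactions (
        TransactionId INTEGER PRIMARY KEY,
        Account       INTEGER NOT NULL,
        Amount        INTEGER NOT NULL CHECK (Amount >= 0),
        Direction     INTEGER NOT NULL CHECK (Direction IN (1, -1)),
        Date          TEXT    NOT NULL
    );
    -- Subtype tables: share the base table's key (one-to-zero-or-one).
    CREATE TABLE DepositDetails (
        TransactionId INTEGER PRIMARY KEY REFERENCES Transactions(TransactionId),
        Source        TEXT
    );
    CREATE TABLE WithdrawalDetails (
        TransactionId INTEGER PRIMARY KEY REFERENCES Transactions(TransactionId),
        AtmId         TEXT
    );
    INSERT INTO Transactions VALUES
        (1, 1, 100,  1, '2024-01-01'),
        (2, 1,  40, -1, '2024-01-05');
    INSERT INTO DepositDetails    VALUES (1, 'payroll');
    INSERT INTO WithdrawalDetails VALUES (2, 'ATM-17');
""")

# Whole-account work touches only the base table.
balance = conn.execute(
    "SELECT SUM(Amount * Direction) FROM Transactions WHERE Account = ?", (1,)
).fetchone()[0]

# Type-specific work joins the base table to one subtype table.
withdrawals = conn.execute(
    """
    SELECT t.Date, t.Amount, w.AtmId
    FROM Transactions AS t
    JOIN WithdrawalDetails AS w ON w.TransactionId = t.TransactionId
    WHERE t.Account = ?
    """,
    (1,),
).fetchall()

print(balance)      # 60
print(withdrawals)  # [('2024-01-05', 40, 'ATM-17')]
```

Adding a new transaction type in this layout means adding one more subtype table; the balance query stays unchanged.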
I would create a view that combines the data you need. That way you can have meaningful tables as well as a single source to query for information. You do not want to store two different things in the same table.
Normally you can create an indexed view if you are concerned about performance, but unfortunately since you are using unions you cannot. If you are worried about performance you might consider creating a table that you load with the unioned data.
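A rough sketch of such a view, using Python's built-in sqlite3 module as a stand-in for SQL Server. The view name AccountStatement and the sample rows are made up for illustration. The sketch uses UNION ALL so that identical-looking rows are not collapsed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Deposits    (Account INTEGER, Amount INTEGER, Date TEXT);
    CREATE TABLE Withdrawals (Account INTEGER, Amount INTEGER, Date TEXT);
    INSERT INTO Deposits VALUES
        (1, 100, '2024-01-01'), (1, 25, '2024-01-10'), (2, 500, '2024-01-03');
    INSERT INTO Withdrawals VALUES (1, 40, '2024-01-05');

    -- The view hides the two-table layout behind one queryable name.
    CREATE VIEW AccountStatement AS
        SELECT Account, Amount AS Deposit, 0 AS Withdrawal, Date FROM Deposits
        UNION ALL
        SELECT Account, 0, Amount, Date FROM Withdrawals;
""")

# Callers query the view as if it were a single table.
statement = conn.execute(
    "SELECT Account, Deposit, Withdrawal, Date FROM AccountStatement "
    "WHERE Account = ? ORDER BY Date DESC",
    (1,),
).fetchall()

for row in statement:
    print(row)
# (1, 25, 0, '2024-01-10')
# (1, 0, 40, '2024-01-05')
# (1, 100, 0, '2024-01-01')
```

The view is only a saved query, so each read still runs against both underlying tables.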
I would probably put the two together in a single table, simply because they are basically very similar "transaction" entities and you'll probably want them together and indexed together for a lot of operations. You can make views that take a horizontal slice of the data and show only one type or the other. If Deposits or Withdrawals have different supporting information, that might also go in the same table (with one or the other having NULLs) or in auxiliary tables.
Also, note that you probably want to use UNION ALL to avoid (hopefully unnecessary) de-duplication which UNION performs.
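The de-duplication is more than a performance detail: it can drop real rows. A small sketch, using Python's built-in sqlite3 module as a stand-in for SQL Server and made-up sample data, in which one account has two identical deposits on the same day:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Deposits    (Account INTEGER, Amount INTEGER, Date TEXT);
    CREATE TABLE Withdrawals (Account INTEGER, Amount INTEGER, Date TEXT);
    -- Two genuine deposits that happen to look identical.
    INSERT INTO Deposits    VALUES (1, 50, '2024-01-02'), (1, 50, '2024-01-02');
    INSERT INTO Withdrawals VALUES (1, 20, '2024-01-03');
""")

# Same shape as the query in the question; only the set operator varies.
QUERY = """
    SELECT Account, Amount AS Deposit, 0 AS Withdrawal, Date
    FROM Deposits WHERE Account = :account
    {set_op}
    SELECT Account, 0, Amount, Date
    FROM Withdrawals WHERE Account = :account
    ORDER BY Date DESC
"""


def statement(set_op):
    """Run the statement query with either UNION or UNION ALL."""
    return conn.execute(QUERY.format(set_op=set_op), {"account": 1}).fetchall()


union_rows = statement("UNION")
union_all_rows = statement("UNION ALL")

print(len(union_rows))      # 2: the duplicate deposit was removed
print(len(union_all_rows))  # 3: every row is kept
```

With UNION the statement shows deposits totalling 50 instead of 100. Selecting a unique key column from each table would also keep the rows distinct, but UNION ALL avoids the de-duplication work altogether.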
I would put them in the same table. Every commercial accounting system I've ever seen has one table for all transactions, and there's a good reason for that. It makes sense in terms of what users expect, it makes sense in terms of what new developers expect, and it is best for reporting, where you would otherwise have to do calculations against both tables to get the correct answer. If there are fields that differ between the two, put those in separate tables, but keep the main data (especially the amount and other items you need for reporting) in the main table.
I worked with a system (designed by someone else) where different types of expenses were in different tables, and that choice caused a lot of bugs over the years because new developers didn't know they were supposed to look for the data in multiple tables. I think separate tables would be a maintenance problem here too, with people making weird choices about how to get the data (including using UNION where UNION ALL is indicated for performance reasons) when it should be simple.
Personally, I would go with the two tables and use UNION ALL when you need to combine the results.
All too often performance isn't considered in advance. Since this is likely just one of many queries, I would design the database around the queries that will run most often, such as lookups for only withdrawals or only deposits. Having all the data together is simpler, but you will have to weed through twice as much data, which results in longer lookup times.
In other words, if this union query is executed hundreds of times a day but lookups and updates against the individual tables happen thousands of times a day, then keep them separate.