数据库设计 - 出于报告目的而复制列
我有以下表格:
Product(ProductID, ProductName ...)
ProductBidHistory(ProductID, UserID, Amount, Status, ...)
BidHistory 表可以增长到每个产品都有许多记录,我想要一个包含每个产品的已批准出价的报告,即状态 = 已批准的金额。
- 是否可以在产品表本身中有一个 ApprovedAmount 列,该列会在出价结束时填充,以便更容易报告。
- 是否应该编写报告来连接历史表以查找“已批准”并获取金额。
可以复制该列吗?
I have the following tables:
Product(ProductID, ProductName ...)
ProductBidHistory(ProductID, UserID, Amount, Status, ...)
The BidHistory table can grow to have many records for every product, I want a report containing approved bid for every products i.e.Amount where status = approved.
- Is it ok to have an ApprovedAmount column in the Product table itself that gets populated when the bid is closed, so that it is easier for reporting.
- Should the report be written to do a join to history table to look for 'approved' and fetch the amount.
Is it ok to duplicate the column ?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
从纯粹的设计角度来看,您不应该保留相同数据的两个副本。这可能会导致数据不一致。如果批准的金额与投标历史记录不一致怎么办?
从性能角度来看,您可以复制数据以加快报告生成的速度。
从应用程序的角度来看,开发人员必须确保每次更新“出价历史记录”表时,都会对“产品”表中的批准金额进行正确的更新。
From the pure design perspective, you shouldn't keep two copies of the same data. This could lead to data inconsistency. What if the approved amount and the bid history do not agree?
From the performance perspective, you may duplicate the data to gain the speedup in the generation of the report.
From the application perspective, the developer has to ensure the proper update is done to the approved amount in Product table with every update in the Bid History table.
这实际上取决于这些表的大小。
如果它们不太大,我建议您不要重复这些列,因为这会导致应用程序层产生更多开销。一个简单的视图或用户定义的表函数应该就足够了。还要确保您在正确的列上有索引。
但是,如果表非常大(数百万行),那么您可能会通过存储用于报告目的的值来获得性能提升。
This realy depends on the size of these tables.
If they are not to large, I would recomend that you not duplicate the columns, as this will cause more overhead in the application layer. A simple view or user defined table function should be good enough. Also ensure that you have indexes on the correct columns.
But if the tables are going to be very large (millions of rows), then you might see performance gains by storing the values for reporting purposes.
一般来说,您永远不应该复制关系数据库中的列,这样根据(有效)查询的表达方式,可能会得到彼此不一致的答案。
已批准不是产品的属性,而是出价的属性,您应该始终尝试让数据库结构反映现实世界的本体。
当然,如果该表中的行不是指某人拥有的特定产品(例如 iPad 16G S/N 123456789)而是指通用“目录”,则您不能将“已批准是/否”(布尔值)列放入“产品”表中。描述”产品(例如 iPad 16G),并且可能出现在多次拍卖中。从您的简化设计中不清楚,目的是询问您在这里的想法。
在您的查询中,您永远不会简单地询问“所有已批准的出价”而没有其他限定词。术语“approved=true”将始终与其他一些术语结合在一起,例如 ? 之后的出价日期,或产品 id = ?,或 (a, b, c) 中的产品类别,这些可能是候选列对于指数。因此,明智地使用索引,不要违反规范化规则,保持本体真实,您的数据库会照顾您的。
In general, you should never duplicate the columns in a relational database, so that it would be possible to get answers that do not jibe with each other depending upon how the (valid) queries were expressed.
Approved is not an attribute of the product but an attribute of the bid, and you should always try to have the database structure mirror the real-world ontology.
Of course, you could not put Approved yes/no (boolean) column in the Product table if the rows in that table do not refer to a specific product in someone's possession (e.g. iPad 16G S/N 123456789) but to a generic "catalog description" product (e.g. iPad 16G) and could appear in multiple auctions. Not clear from your simplified design for the purpose of asking the question what you had in mind here.
In your queries, you're never going to ask simply for "all approved bids" with no other qualifiers. The term "approved=true" will always be in league with some other term(s), e.g. bid-date after ?, or product id = ?, or product-category in (a, b, c) which are likely candidate columns for indices. So use indices judiciously, don't violate the normalization rules, keep your ontology real, and your database will take care of you.