How to handle archiving of seasonal database values on SQL Server
I am on SQL Server 2008 R2 and I am currently developing a database structure which contains seasonal values for some products.
By seasonal I mean that those values won't be useful to customers after a particular date. However, they will still be used by internal staff for statistical reports.
On the sales web site, we will add a product search feature, and one of my aims is to make this search as fast as possible. However, the more rows the table holds, the slower the search will become, so I am considering archiving the unused values.
I can automate the archiving with SQL Server Agent jobs. No problem there. But I am not sure how I should archive those values.
The best approach I can come up with is to create another table with the same columns inside the same database and move the rows there.
Example:
My main table is named ProductPrices and has a primary key defined. I have created another table named ProductPrices_archive, with its own primary key field and the same columns as ProductPrices, except for the ProductPrices primary key value. I don't think it is useful to archive that value (am I right?). For internal use, I am considering combining the rows of the two tables with UNION (is that the correct way?).
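For reference, the two-table approach described above could be sketched roughly as follows. The column names (ProductPriceId, ProductId, Price, SeasonEndDate, ArchivedAt) and the view name are assumptions, since the question does not show the real schema. This sketch keeps the original key value in the archive, on the assumption that other tables or logs may still refer to it, and uses UNION ALL rather than UNION because a row lives in exactly one of the two tables, so no duplicate elimination is needed.

```sql
-- Hypothetical schema: the real columns of ProductPrices are not shown in the question.
-- GO is the batch separator used by SSMS/sqlcmd.
CREATE TABLE dbo.ProductPrices (
    ProductPriceId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ProductId      INT           NOT NULL,
    Price          DECIMAL(10,2) NOT NULL,
    SeasonEndDate  DATE          NOT NULL
);

-- The archive stores the original key as a plain column (no IDENTITY).
CREATE TABLE dbo.ProductPrices_archive (
    ProductPriceId INT           NOT NULL PRIMARY KEY,
    ProductId      INT           NOT NULL,
    Price          DECIMAL(10,2) NOT NULL,
    SeasonEndDate  DATE          NOT NULL,
    ArchivedAt     DATETIME      NOT NULL DEFAULT (GETDATE())
);
GO

-- Body of the scheduled job: move expired rows in a single atomic statement.
DELETE FROM dbo.ProductPrices
OUTPUT deleted.ProductPriceId, deleted.ProductId, deleted.Price, deleted.SeasonEndDate
  INTO dbo.ProductPrices_archive (ProductPriceId, ProductId, Price, SeasonEndDate)
WHERE SeasonEndDate < CAST(GETDATE() AS DATE);
GO

-- Combined view for internal reporting.
CREATE VIEW dbo.ProductPrices_all
AS
SELECT ProductPriceId, ProductId, Price, SeasonEndDate, CAST(0 AS BIT) AS IsArchived
FROM dbo.ProductPrices
UNION ALL
SELECT ProductPriceId, ProductId, Price, SeasonEndDate, CAST(1 AS BIT) AS IsArchived
FROM dbo.ProductPrices_archive;
GO
```

With this layout the web site's search would query dbo.ProductPrices only, while internal reports would read from dbo.ProductPrices_all.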
This database is meant to be used for a long time, so it should be designed with the best possible structure. I am not sure whether I am missing something here for the long run.
Any advice would be appreciated.
Comments (2)
I'd consider one of two options initially:
1. Use partitioning to separate the single table into a current working set and archive data. No need to use an archive table.
2. Add ValidFrom and ValidTo columns to implement a type 2 SCD (slowly changing dimension), then add an indexed view filtered on ValidTo IS NULL to get the current set of data.
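As a rough illustration of the partitioning option, the sketch below splits one table into an "archive" and a "current" partition by date. The object names (pfSeasonEnd, psSeasonEnd), the SeasonEndDate column and the boundary dates are all assumptions made for the example. Note that on SQL Server 2008 R2 table partitioning requires Enterprise Edition.

```sql
-- Hypothetical names and boundary date, for illustration only.
-- RANGE RIGHT: partition 1 holds dates before 2012-01-01, partition 2 holds that date and later.
CREATE PARTITION FUNCTION pfSeasonEnd (DATE)
AS RANGE RIGHT FOR VALUES ('20120101');

-- Both partitions go to the PRIMARY filegroup here; an archive filegroup could be used instead.
CREATE PARTITION SCHEME psSeasonEnd
AS PARTITION pfSeasonEnd ALL TO ([PRIMARY]);

-- The partitioning column must be part of the clustered primary key.
CREATE TABLE dbo.ProductPrices (
    ProductPriceId INT IDENTITY(1,1) NOT NULL,
    ProductId      INT           NOT NULL,
    Price          DECIMAL(10,2) NOT NULL,
    SeasonEndDate  DATE          NOT NULL,
    CONSTRAINT PK_ProductPrices
        PRIMARY KEY CLUSTERED (SeasonEndDate, ProductPriceId)
) ON psSeasonEnd (SeasonEndDate);
GO

-- Show how many rows sit in each partition.
SELECT $PARTITION.pfSeasonEnd(SeasonEndDate) AS PartitionNumber,
       COUNT(*)                              AS RowsInPartition
FROM dbo.ProductPrices
GROUP BY $PARTITION.pfSeasonEnd(SeasonEndDate);

-- When a new season starts, add a boundary instead of moving rows.
ALTER PARTITION SCHEME psSeasonEnd NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pfSeasonEnd() SPLIT RANGE ('20130101');
```

Queries that filter on SeasonEndDate can then be limited to the relevant partition by the optimizer, without a separate archive table.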
I wouldn't have 2 separate tables if all data has to be "on-line" in one database.
This leads to a 3rd option: an entirely separate database with all data. Only "current" data stays in the live database (as @Mike_Walsh's answer explains).
The indexed view option is the easiest and works with Standard Edition (with the NOEXPAND hint).
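A minimal sketch of the type 2 SCD plus indexed view option might look like this. The table layout, the view name CurrentProductPrices and the index name are assumptions; only the ValidFrom/ValidTo columns, the ValidTo IS NULL filter and the NOEXPAND hint come from the answer.

```sql
-- Hypothetical schema: a row is "current" while ValidTo is NULL.
CREATE TABLE dbo.ProductPrices (
    ProductPriceId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    ProductId      INT           NOT NULL,
    Price          DECIMAL(10,2) NOT NULL,
    ValidFrom      DATETIME      NOT NULL,
    ValidTo        DATETIME      NULL
);
GO

-- An indexed view must be schema-bound and use two-part object names.
-- The required SET options (ANSI_NULLS, QUOTED_IDENTIFIER, etc.) must be ON when it is created.
CREATE VIEW dbo.CurrentProductPrices
WITH SCHEMABINDING
AS
SELECT ProductPriceId, ProductId, Price, ValidFrom
FROM dbo.ProductPrices
WHERE ValidTo IS NULL;
GO

-- The unique clustered index materializes the view, so it stores only current rows.
CREATE UNIQUE CLUSTERED INDEX IX_CurrentProductPrices
ON dbo.CurrentProductPrices (ProductId, ProductPriceId);
GO

-- On Standard Edition the NOEXPAND hint is needed for the view's index to be used.
SELECT ProductId, Price
FROM dbo.CurrentProductPrices WITH (NOEXPAND)
WHERE ProductId = 42;
```

Expiring a price then becomes an UPDATE that sets ValidTo, after which the row drops out of the view automatically while remaining available in the base table for statistics.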
gbn brings up some good approaches. I think the "right" longer-term answer for you is the 3rd option, though.
It sounds like you have two business use cases of your data -
1.) Real time Online Transaction Processing (OLTP). This is the POS transactions, inventory management, quick "how did receipts look today, how is inventory, are we having any operational problems?" kind of questions and keeps the business running day to day. Here you want the data necessary to conduct operations and you want a database optimized for updates/inserts/etc.
2.) Analytical type questions/Reporting. This is looking at month over month numbers, year over year numbers, running averages. These are the strategic questions that look at a complete picture of your history. You'll want to see how last year's Christmas seasonal items did against this year's, maybe even compare those numbers with the seasonal items from that same period 5 years ago. Here you want a database that contains a lot more data than your OLTP. You want to throw away as little history as possible and you want a database largely optimized for reporting and answering questions. Probably more denormalized. You want the ability to see things as they were at a certain time, so the Type 2 SCDs mentioned by gbn would be useful here.
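To make the "see things as they were at a certain time" point concrete, here is a small sketch of an as-of query against a type 2 SCD table in a reporting database. The table name ProductPriceHistory, its columns and the sample date are assumptions for illustration.

```sql
-- Hypothetical history table in the reporting database.
CREATE TABLE dbo.ProductPriceHistory (
    ProductPriceId INT           NOT NULL PRIMARY KEY,
    ProductId      INT           NOT NULL,
    Price          DECIMAL(10,2) NOT NULL,
    ValidFrom      DATETIME      NOT NULL,
    ValidTo        DATETIME      NULL      -- NULL means the row is still current
);
GO

-- Prices as they were on an arbitrary past date.
DECLARE @AsOf DATETIME = '20111224';

SELECT ProductId, Price
FROM dbo.ProductPriceHistory
WHERE ValidFrom <= @AsOf
  AND (ValidTo IS NULL OR ValidTo > @AsOf);
```

Running the same query with two different @AsOf values is one simple way to compare a season against the same period in an earlier year.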
It sounds to me like you need to create a reporting database. You can call it a data warehouse, but that term scares people these days. It doesn't need to be scary: if you plan it properly, it doesn't have to take you 6 years and 6 million dollars to build ;-)
This is definitely a longer-term answer, but in a couple of years you'll be happy you spent the time creating one. A good book for understanding the concepts of dimensional modeling, data warehouses and their terminology is The Data Warehouse Toolkit.