How to store historical server data?
I was looking for advice on how to store data in a database for historical data mining purposes. If I can get the state of an entity at a given time, what is the best way to store it so that I can historically mine that data, predicting what the state is likely to be based on how it has been in the past?
For a more concrete example, I can get the up/down state of a server, and that server's current load. I can get this on a periodic schedule. I'd like to store this data such that I could easily query for the up/down state or the load for a specific timeframe, or get the entire history of that server. I don't have much experience, if any, in database design.
Comments (2)
This is why Ralph Kimball (and others) invented the Data Warehouse.
You have a star schema with dimensions like Server and Time. You have one fact table that records state changes (Up and Down) and another fact table that records Load at a given point in time.
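To make the shape of that schema concrete, here is a minimal sketch using Python's built-in `sqlite3` module. Every table, column, and function name in it (`dim_server`, `dim_time`, `fact_state_change`, `fact_load`, `record_load`, and so on) is made up for illustration and is not taken from Kimball or from any particular tool. It also assumes timestamps are stored as ISO-8601 strings in one consistent format, so that plain string comparison orders them correctly.

```python
import sqlite3
from datetime import datetime

# Hypothetical star schema: two dimensions (server, time) and two fact tables.
SCHEMA = """
CREATE TABLE dim_server (
    server_id INTEGER PRIMARY KEY,
    hostname  TEXT NOT NULL UNIQUE
);
CREATE TABLE dim_time (
    time_id INTEGER PRIMARY KEY,
    ts      TEXT NOT NULL UNIQUE,  -- ISO-8601, e.g. 2024-01-01T09:00:00
    hour    INTEGER NOT NULL,      -- 0..23, for "by hour of day" rollups
    weekday INTEGER NOT NULL       -- 0 = Monday
);
CREATE TABLE fact_state_change (
    server_id INTEGER NOT NULL REFERENCES dim_server(server_id),
    time_id   INTEGER NOT NULL REFERENCES dim_time(time_id),
    state     TEXT NOT NULL CHECK (state IN ('UP', 'DOWN'))
);
CREATE TABLE fact_load (
    server_id  INTEGER NOT NULL REFERENCES dim_server(server_id),
    time_id    INTEGER NOT NULL REFERENCES dim_time(time_id),
    load_value REAL NOT NULL
);
"""

def open_db(path=":memory:"):
    """Open a database and create the (assumed) schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def _server_id(conn, hostname):
    # Insert the dimension row on first sight, then look up its key.
    conn.execute("INSERT OR IGNORE INTO dim_server (hostname) VALUES (?)", (hostname,))
    return conn.execute(
        "SELECT server_id FROM dim_server WHERE hostname = ?", (hostname,)
    ).fetchone()[0]

def _time_id(conn, ts):
    t = datetime.fromisoformat(ts)
    conn.execute(
        "INSERT OR IGNORE INTO dim_time (ts, hour, weekday) VALUES (?, ?, ?)",
        (ts, t.hour, t.weekday()),
    )
    return conn.execute("SELECT time_id FROM dim_time WHERE ts = ?", (ts,)).fetchone()[0]

def record_load(conn, hostname, ts, load_value):
    conn.execute(
        "INSERT INTO fact_load VALUES (?, ?, ?)",
        (_server_id(conn, hostname), _time_id(conn, ts), load_value),
    )

def record_state(conn, hostname, ts, state):
    conn.execute(
        "INSERT INTO fact_state_change VALUES (?, ?, ?)",
        (_server_id(conn, hostname), _time_id(conn, ts), state),
    )

def load_between(conn, hostname, start, end):
    """Load samples for one server with start <= ts < end, oldest first."""
    return conn.execute(
        """SELECT t.ts, f.load_value
           FROM fact_load f
           JOIN dim_server s ON s.server_id = f.server_id
           JOIN dim_time t   ON t.time_id = f.time_id
           WHERE s.hostname = ? AND t.ts >= ? AND t.ts < ?
           ORDER BY t.ts""",
        (hostname, start, end),
    ).fetchall()

def state_history(conn, hostname):
    """Every recorded state change for one server, oldest first."""
    return conn.execute(
        """SELECT t.ts, f.state
           FROM fact_state_change f
           JOIN dim_server s ON s.server_id = f.server_id
           JOIN dim_time t   ON t.time_id = f.time_id
           WHERE s.hostname = ?
           ORDER BY t.ts""",
        (hostname,),
    ).fetchall()

def avg_load_by_hour(conn, hostname):
    """Average load per hour of day: a crude basis for 'what is typical'."""
    rows = conn.execute(
        """SELECT t.hour, AVG(f.load_value)
           FROM fact_load f
           JOIN dim_server s ON s.server_id = f.server_id
           JOIN dim_time t   ON t.time_id = f.time_id
           WHERE s.hostname = ?
           GROUP BY t.hour
           ORDER BY t.hour""",
        (hostname,),
    ).fetchall()
    return dict(rows)
```

The time dimension carries precomputed attributes such as `hour` and `weekday`, which is what lets a rollup like `avg_load_by_hour` stay a plain `GROUP BY`. A real design would probably pick a coarser time grain and add more attributes to both dimensions.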
Step 1. Find a good tutorial on star schema design. If necessary buy a book or two. It's worth every moment you spend learning how to do star schema data modeling.
Step 2. Build a prototype schema as cheaply and quickly as you can. Get data loaded so you can write some queries and see how things work. Make mistakes. Fix them.
Step 3. After you get something to work, then write down a good design.
Step 4. Build your "for real" data mart.
Step 5. Build your "production" loads.
Step 6. Query.
This is why there is a new version of the standard, SQL:2011.
Just add the appropriate start- and end-date attributes to your entities, and/or the appropriate columns to your tables, and basically you're done.
The new SQL will do quite a bit (but not all, alas) of the otherwise very pesky work for you, provided you have an engine that supports the new features.
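For an engine without those features, the start/end-date idea can be maintained by hand. The following is only a rough sketch of that manual version using Python's built-in `sqlite3` module; it does not use any SQL:2011 syntax. The table `server_state`, its columns `valid_from` and `valid_to`, and the helper functions are all hypothetical names. It assumes observations arrive in time order and that timestamps are ISO-8601 strings in one consistent format.

```python
import sqlite3

def open_history_db(path=":memory:"):
    """Create a table where each row is one interval during which a state held."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE server_state (
               hostname   TEXT NOT NULL,
               state      TEXT NOT NULL CHECK (state IN ('UP', 'DOWN')),
               valid_from TEXT NOT NULL,  -- inclusive start of the interval
               valid_to   TEXT            -- exclusive end; NULL = still current
           )"""
    )
    return conn

def observe(conn, hostname, ts, state):
    """Record a periodic observation; only an actual change adds a row."""
    current = conn.execute(
        "SELECT state FROM server_state WHERE hostname = ? AND valid_to IS NULL",
        (hostname,),
    ).fetchone()
    if current is not None and current[0] == state:
        return  # same state as before: the open interval simply continues
    # Close the open interval (if any), then open a new one at ts.
    conn.execute(
        "UPDATE server_state SET valid_to = ? WHERE hostname = ? AND valid_to IS NULL",
        (ts, hostname),
    )
    conn.execute(
        "INSERT INTO server_state (hostname, state, valid_from) VALUES (?, ?, ?)",
        (hostname, state, ts),
    )

def state_at(conn, hostname, ts):
    """State of the server at time ts, or None if nothing was known yet."""
    row = conn.execute(
        """SELECT state FROM server_state
           WHERE hostname = ?
             AND valid_from <= ?
             AND (valid_to IS NULL OR valid_to > ?)""",
        (hostname, ts, ts),
    ).fetchone()
    return row[0] if row else None
```

Closing the old interval and opening the new one on every change is the kind of bookkeeping this sketch does by hand. The same table can be queried for overlap with a time range to get the up/down history for a specific timeframe.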