Best database solution for managing huge amounts of data
I have to design a traffic database that holds data from eight towns. Each town sends about 2 MB every 10 minutes, 24 hours a day, and the incoming data has the same structure for every town. My first question is which performs better: one database for all towns with one table per town, or one database per town? My second question is which database management system suits this scenario best: MySQL, Postgres, Oracle, or something else?
Comments (2)
The amount of data you are receiving each day is quite large (~5 GB), but the number of rows being inserted is actually rather low. Consequently, you need to design your physical model so that database storage administration is easy and querying is efficient.
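As a quick sanity check on those numbers, here is a minimal sketch that uses only the figures from the question. It assumes "2mb" means megabytes and that each 10-minute delivery is loaded as one batch per town. Read that way, the raw volume comes out nearer 2.3 GB a day than 5 GB, which does not change the argument: the volume is sizeable but the insert rate is tiny.

```python
# Figures from the question; reading "2mb" as megabytes is an assumption.
TOWNS = 8
MB_PER_BATCH = 2
BATCH_INTERVAL_MIN = 10

# One batch per town every ten minutes, around the clock.
batches_per_town_per_day = 24 * 60 // BATCH_INTERVAL_MIN
inserts_per_day = TOWNS * batches_per_town_per_day
mb_per_day = inserts_per_day * MB_PER_BATCH

print(f"{batches_per_town_per_day} batches per town per day")
print(f"{inserts_per_day} batch inserts per day across all towns")
print(f"about {mb_per_day / 1024:.2f} GB of raw data per day")
```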
Having a separate database per town only makes sense if you are going to have a server per database. But you do not need load balancing, as you only have to handle eight inserts every ten minutes. On the other hand, that architecture will turn every query that compares one town against another into a distributed query.
Having one table per town in the same database might give you some performance advantages if the majority of your queries are constrained to data from a town rather than comparing towns. But I wouldn't like to put much money on it. Even if it did work, it might make other sorts of queries harder.
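To make the "harder queries" point concrete, here is a small sketch using Python's built-in sqlite3 module. The table names, columns and sample values are hypothetical, chosen only for illustration; the question does not describe the real schema. Querying one town is trivial, but comparing towns means stitching every table together with UNION ALL, and the query has to change whenever a town is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
towns = ["town_a", "town_b", "town_c"]  # hypothetical town names

# One table per town, all with the same (assumed) structure.
# Table names come from the fixed list above, never from user input.
for town in towns:
    conn.execute(
        f"CREATE TABLE traffic_{town} (recorded_at TEXT, vehicle_count INTEGER)"
    )

conn.execute("INSERT INTO traffic_town_a VALUES ('2024-01-01 08:00', 120)")
conn.execute("INSERT INTO traffic_town_a VALUES ('2024-01-01 08:10', 95)")
conn.execute("INSERT INTO traffic_town_b VALUES ('2024-01-01 08:00', 80)")
conn.execute("INSERT INTO traffic_town_c VALUES ('2024-01-01 08:00', 45)")

# A single-town query stays simple.
town_a_rows = conn.execute(
    "SELECT recorded_at, vehicle_count FROM traffic_town_a ORDER BY recorded_at"
).fetchall()

# A cross-town comparison needs one SELECT per table, glued with UNION ALL.
union_sql = " UNION ALL ".join(
    f"SELECT '{town}' AS town, vehicle_count FROM traffic_{town}" for town in towns
)
totals = conn.execute(
    f"SELECT town, SUM(vehicle_count) FROM ({union_sql}) GROUP BY town ORDER BY town"
).fetchall()

print(town_a_rows)
print(totals)
```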
Given that the data is the same for all towns, my preferred option would be one table with a differentiating column (TOWN_ID), especially if I had the money to spring for an Oracle license with the Partitioning option.
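A minimal sketch of that single-table layout, again using Python's built-in sqlite3 module purely for illustration. Only the TOWN_ID column comes from the answer; the other columns, the index and the sample rows are assumptions. With this layout both the per-town query and the cross-town comparison are plain SQL against one table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table for every town, distinguished by town_id.
# recorded_at and vehicle_count are hypothetical columns.
conn.execute(
    """
    CREATE TABLE traffic (
        town_id       INTEGER NOT NULL,
        recorded_at   TEXT    NOT NULL,
        vehicle_count INTEGER NOT NULL
    )
    """
)
# An index leading with town_id keeps per-town lookups cheap (an assumed choice).
conn.execute("CREATE INDEX idx_traffic_town_time ON traffic (town_id, recorded_at)")

readings = [
    (1, "2024-01-01 08:00", 120),
    (1, "2024-01-01 08:10", 95),
    (2, "2024-01-01 08:00", 80),
    (3, "2024-01-01 08:00", 45),
]
conn.executemany("INSERT INTO traffic VALUES (?, ?, ?)", readings)

# Query restricted to one town.
one_town = conn.execute(
    "SELECT recorded_at, vehicle_count FROM traffic "
    "WHERE town_id = ? ORDER BY recorded_at",
    (1,),
).fetchall()

# Comparison across towns: a single GROUP BY, no UNION needed.
per_town = conn.execute(
    "SELECT town_id, SUM(vehicle_count) FROM traffic "
    "GROUP BY town_id ORDER BY town_id"
).fetchall()

print(one_town)
print(per_town)
```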
Different databases per town can be difficult to maintain, and the same goes for different tables. It might be workable if you never have to compare towns, but sooner or later I'd bet on having to compare data from different towns.
Partitioning data is the way to go. Any database that supports partitioning of data, such as Oracle or SQL Server, would work fine. I'm not sure whether Postgres or MySQL support this; you'd have to ask someone more familiar with those databases.
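For readers unfamiliar with the idea, here is a toy in-memory model of a table partitioned by town. It is not how Oracle, SQL Server or any other database implements partitioning, and the class and column names are made up for illustration. It only shows the principle: the application sees one logical table, while a query that filters on the partition key touches just one partition.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy model of one logical table split into per-town partitions."""

    def __init__(self):
        # Maps town_id -> list of rows stored in that partition.
        self.partitions = defaultdict(list)

    def insert(self, town_id, recorded_at, vehicle_count):
        # The partition key (town_id) decides where the row is stored.
        self.partitions[town_id].append((town_id, recorded_at, vehicle_count))

    def scan(self, town_id=None):
        """Return (rows, number_of_partitions_read)."""
        if town_id is not None:
            # Filter on the partition key: read at most one partition.
            if town_id in self.partitions:
                return list(self.partitions[town_id]), 1
            return [], 0
        # No filter: every partition has to be read.
        rows = [row for part in self.partitions.values() for row in part]
        return rows, len(self.partitions)


table = PartitionedTable()
table.insert(1, "2024-01-01 08:00", 120)
table.insert(1, "2024-01-01 08:10", 95)
table.insert(2, "2024-01-01 08:00", 80)
table.insert(3, "2024-01-01 08:00", 45)

town_rows, town_parts = table.scan(town_id=1)
all_rows, all_parts = table.scan()

print(len(town_rows), "rows read from", town_parts, "partition")
print(len(all_rows), "rows read from", all_parts, "partitions")
```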