Caching report data in the same transactional database vs. using a data warehouse
We have a SaaS solution in which each tenant has its own MySQL database. I'm now designing the dashboards for this SaaS system, and they require some analytical charts. To get the data needed for the charts, we could query each tenant's transactional data from its database in real time and get up-to-date charts with no performance problems, since so far the data volume is not that big. However, because the data volume will keep growing, we decided to separate each company's analytical and transactional data: we will compute the analytical data for the charts in the background, save/cache it, and refresh it periodically. My question is:
- What questions or factors should we consider before deciding whether we need a data warehouse and data modeling from the beginning, or can simply cache the analytical chart data produced by our API in JSON columns of a new charts table in each tenant's MySQL database (roughly the approach sketched below)?
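For concreteness, here is a minimal sketch of the JSON-caching option. The table name chart_cache, the chart key, and the payload shape are all hypothetical, not taken from the question:

```sql
-- Hypothetical per-tenant cache table: one row per chart,
-- holding the pre-computed JSON payload the API would return.
CREATE TABLE chart_cache (
  chart_key    VARCHAR(64) NOT NULL,
  payload      JSON        NOT NULL,
  refreshed_at TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP
                           ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (chart_key)
);

-- A background job would periodically recompute each chart and upsert it.
INSERT INTO chart_cache (chart_key, payload)
VALUES ('monthly_revenue', '{"labels": ["Jan", "Feb"], "values": [100, 120]}')
ON DUPLICATE KEY UPDATE payload = VALUES(payload);
```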

Comments (1)
Instead of reaching into the "Fact" table for millions of rows, build and maintain a Summary table, then fetch from that. It may run 10 times as fast.
This does require code changes because of the extra table, but it may be well worth it.
Summary Tables
In other words, if the dataset will become bigger than X, summary tables are the best solution. Caching will not help. Hardware won't be sufficient. JSON only gets in the way.
Building a year-long graph from a year's worth of data points (one per second) is slow and wasteful. Building a year-long graph from daily subtotals is much more reasonable.
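As an illustration of the daily-subtotal idea, here is a sketch under assumed names: a hypothetical readings fact table with one row per second, folded into a readings_daily summary table. Neither table comes from the answer:

```sql
-- Assumed fact table (one row per second):
--   CREATE TABLE readings (ts DATETIME NOT NULL, value DECIMAL(10,2) NOT NULL);

-- Summary table: one row per day instead of ~86,400 fact rows.
CREATE TABLE readings_daily (
  day       DATE          NOT NULL,
  total     DECIMAL(16,2) NOT NULL,
  row_count INT UNSIGNED  NOT NULL,
  PRIMARY KEY (day)
);

-- Periodic maintenance job: fold yesterday's facts into the summary.
INSERT INTO readings_daily (day, total, row_count)
SELECT DATE(ts), SUM(value), COUNT(*)
FROM readings
WHERE ts >= CURDATE() - INTERVAL 1 DAY
  AND ts <  CURDATE()
GROUP BY DATE(ts)
ON DUPLICATE KEY UPDATE
  total     = VALUES(total),
  row_count = VALUES(row_count);

-- The year-long graph now scans ~365 summary rows
-- instead of ~31 million fact rows.
SELECT day, total
FROM readings_daily
WHERE day >= CURDATE() - INTERVAL 1 YEAR;
```

Because the summary is keyed by day, re-running the maintenance job is idempotent, and backfilling older days only requires widening the WHERE range.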