数据库内记忆化 - 好主意吗?有什么经验吗?

发布于 2024-08-22 00:17:13 字数 671 浏览 7 评论 0原文

我有一个尚未实现的想法,因为我担心我可能会找错树……主要是因为谷歌搜索该主题返回的结果很少。

基本上我有一些 SQL 查询速度很慢,很大程度上是因为它们有子查询,很耗时。例如,他们可能会做“给我统计 10-15 岁男孩骑过的所有红色自行车的数量”之类的事情。这很昂贵,因为它会遍历所有自行车,但最终结果只是一个数字。而且,就我而言,我并不需要该数字 100% 是最新的。

此类问题的最终解决方案似乎是应用基于 OLAP 的引擎来预先缓存这些排列。然而,就我而言,我并不是真的试图围绕大量指标对数据进行切片和切块,而且我希望不必因为另一个进程/数据存储的运行而使我的架构变得复杂。

所以......我的想法基本上是在数据库中记住这些子查询。我可能有一个名为“BicycleStatistics”的表,它可能会将上面子查询的输出存储为其输入和输出的名称值对。

例如名称:“c_red_g_male_a_10-15”值:235

并具有一种机制,可以在运行查询时将这些值记忆到该表中。

有没有人遇到过这种情况并尝试过类似的事情?我认为这样的解决方案比“在数据库中投入大量 RAM 并让数据库处理它”更有价值的原因是 (A) 我的数据库大于我可以方便地投入的 RAM 量,并且 ( B) 数据库将确保我获得这些统计数据的准确数字,而我的最大胜利是,我可以接受这些数字过时一两天。

感谢您的任何想法/反馈。

汤姆

I have an idea I have yet to implement, because I have some fear I may be barking up the wrong tree... mainly because Googling on the topic returns so few results.

Basically I have some SQL queries that are slow, in large part because they have subqueries that are time-consuming. For example, they might do things like "give me a count of all bicycles that are red and ridden by boys between the ages of 10-15". This is expensive as it sloshes through all of the bicycles, but the end result is a single number. And, in my case, I don't really need that number to be 100% up to date.

The ultimate solution for problems of this sort seems to be to apply an OLAP-based engine to pre-cache these permutations. However, in my case I'm not really trying to slice and dice the data around a ton of metrics, and I'd love not to have to complicate my architecture with yet another process/datastore running.

So... my idea was basically memoizing these subqueries in the database. I might have a table called "BicycleStatistics" and it might store the output of that subquery above as a name value pair of it's inputs and outputs.

Ex name: "c_red_g_male_a_10-15" value: 235

And have a mechanism that memoizes those values to that table as the queries are run.

Has anyone been in this situation and tried anything similar? The reason I think a solution like this is valuable over the "throw a lot of RAM in your DB and let the database handle it" is (A) my database is bigger than the amount of RAM I can conveniently throw at it, and (B) the database is going to ensure I get the exact right number for these statistics, and my big win, above, is that I'm ok with the numbers being a day or two out of date.

Thanks for any thoughts/feedback.

Tom

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

意中人 2024-08-29 00:17:13

物化视图是实现此要求的一种方法(如果您的 DBMS 支持的话)。

Materialized views are a way of achieving this requirement, if your DBMS supports them.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文