Storing hierarchical data in MySQL under high write load
I am building a web application that is expected to have a high write load and thousands, even millions, of hierarchical records representing user-defined trees. I am not trying to build a forum with threads, but a huge database of thousands of small hierarchies (trees with up to 10-20 descendants).
I am aware of many models for storing hierarchies. Currently I am using Nested Sets, but performance under large data volumes and heavy load is an issue. I also doubt that Adjacency Lists or something similar would resolve this.
I have been experimenting with MongoDB, which is a very fast key/value store, but I can only use MySQL.
I would like to hear about other people's experiences with similar issues.
Comments (2)
If you can install MySQL plugins, then the OQGraph storage engine is what you need.
What is the problem with nested sets?
Is it recomputing the lft/rgt values when you add or remove nodes?
I'm pretty sure that with a bit of careful planning you can tweak it so that you only have to do rare recomputations. I've not actually tried it, but I did do some planning for a system once (the client didn't want the system in the end!).
One option is to multiply the values by, say, 1000 when first calculating them. Then if you add a node, you can just insert numbers between the existing values. It's only after a large number of insertions that you start running out of numbers. A low-priority batch process could then recompute the tree to free up numbers for fresh insertions.
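To make the gap idea concrete, here is a minimal in-memory sketch in Python. It is untested against a real MySQL schema. The names `GAP`, `number_tree`, `insert_last_child` and `is_descendant` are made up for illustration, and splitting the free range into thirds is an arbitrary choice. In a real table the same arithmetic would presumably run inside a transaction on the lft/rgt columns.

```python
from typing import Dict, List

GAP = 1000  # scaling factor from the idea above; the exact value is an assumption

Rows = Dict[str, List[int]]  # node name -> [lft, rgt]


def number_tree(node, rows: Rows, counter: int = 1) -> int:
    """Assign ordinary nested-set numbers scaled by GAP.

    `node` is a (name, children) pair. Returns the next unused counter.
    """
    name, children = node
    lft = counter
    counter += 1
    for child in children:
        counter = number_tree(child, rows, counter)
    rows[name] = [lft * GAP, counter * GAP]
    return counter + 1


def insert_last_child(rows: Rows, parent: str, name: str) -> bool:
    """Add a leaf as the last child of `parent` without touching other rows.

    Returns False when the gap is used up, which is the signal to schedule
    the low-priority renumbering job.
    """
    p_lft, p_rgt = rows[parent]
    # Largest boundary already used inside the parent's interval.
    lower = max(v for pair in rows.values() for v in pair if p_lft <= v < p_rgt)
    gap = p_rgt - lower
    if gap < 3:  # fewer than two free integers between lower and p_rgt
        return False
    step = gap // 3
    rows[name] = [lower + step, lower + 2 * step]
    return True


def is_descendant(rows: Rows, child: str, ancestor: str) -> bool:
    """Containment test; it does not depend on the numbers being dense."""
    return rows[ancestor][0] < rows[child][0] and rows[child][1] < rows[ancestor][1]
```

Note that with gaps in place, shortcuts that assume dense numbering (such as counting descendants as `(rgt - lft - 1) / 2`) no longer hold, so queries should rely on containment only.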
Deleting can also be achieved by manipulating the numbers. In fact, a node without children is easy: no recomputation is needed. It gets more complicated if the node has children, but I think it should be doable.
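As a rough illustration of deletion without renumbering, the Python sketch below keeps rows in a dict and shows two possible ways to handle a node that has children: drop the whole subtree, or remove only the node and let its children fall through to the grandparent. Both are assumptions about what "doable" could look like, not something I have run against MySQL, and the function names are hypothetical.

```python
from typing import Dict, List

Rows = Dict[str, List[int]]  # node name -> [lft, rgt]


def descendants(rows: Rows, name: str) -> List[str]:
    """All nodes whose interval lies strictly inside `name`'s interval."""
    lft, rgt = rows[name]
    return sorted(n for n, (l, r) in rows.items() if lft < l and r < rgt)


def delete_subtree(rows: Rows, name: str) -> List[str]:
    """Remove a node together with everything nested inside it."""
    lft, rgt = rows[name]
    doomed = [n for n, (l, r) in rows.items() if lft <= l and r <= rgt]
    for n in doomed:
        del rows[n]
    return doomed


def delete_node_only(rows: Rows, name: str) -> None:
    """Remove a single row.

    For a leaf this is the whole job. For an inner node, its children remain
    inside the grandparent's interval, so they are effectively promoted.
    The unused numbers simply become extra gap for later insertions.
    """
    del rows[name]
```

Either way no other row changes, so the containment queries keep working; the only cost is that the numbering becomes sparser until the batch job compacts it.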