复制 - 有时同步大部分数据
我有一些数据没有正确“分区”(因为缺乏更好的词)。
所有插入、处理和报告都发生在同一个表上。大部分处理发生在插入之后不久,并且之后不久它就变得不可变(我们谈论的是几天)。
我可以在复制到旧表的新表上执行所有插入和处理。当我检测到数据已变得不可变时,我会从新表中删除数据,但我会编辑删除复制存储过程,以便删除不会复制。
这个主意有多糟糕?
目前看来它很有吸引力(我还没有睡过觉),因为它只需对应用程序进行很小的更改就可以缓解性能问题。这似乎也是搬起石头砸自己脚的好方法。
Edit1:
我喜欢插入两个表的想法,因为我可以避免 Jono 的答案中描述的视图和维护窗口。无意冒犯,乔诺,我实际上在其他地方使用过这种技术。
我可能想使用复制,因为一个表可能位于另一个数据库中(我知道,我没有提到这一点),这样我就不必担心提交两个表,我只需让复制来处理这个问题。
我真正担心的是(我没有明确说明)编辑复制存储过程最终可能会成为部署/维护的难题。
I have some data that isn't properly "partitioned" (for lack of a better word).
All inserts, processing and reporting happen on the same table. The bulk of the processing happens not long after the insert and not long after that it becomes immutable (we're talking days).
I could do all inserts and processing on a new table that I replicate to the old table. When I detect that the data has become immutable I would delete the data from the new table, but I would edit the delete replication stored procedure so that the delete did not replicate.
How bad an idea is this? <edit1>That is, editing the replication stored procedure.</edit1>
It seems attractive at the moment (I haven't slept on it yet) because it might mitigate a performance problem with only very small changes to the application. It also seems like it might be a good way to shoot myself in the foot.
Edit1:
I like the idea of inserting into two tables because I can avoid the view and the maintenance window described in Jono's answer. No offense, Jono, I actually use this technique elsewhere.
I might want to use replication because one table might be in another database (I know, I didn't mention this) and that way I don't have to worry about committing to two tables, I just let replication handle that.
My actual concern (that I didn't make clear) is that editing the replication stored procedure could end up being a deployment/maintenance headache.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我不提倡通过复制来解决性能问题(除非它是物理数据分布的问题);如果有的话,当更改传播到目的地时,它会减慢您的系统速度。如果您使用单个服务器,我建议添加第二个表,其架构与第一个表相同,但索引针对您在处理阶段所做的工作类型进行了优化。然后创建一个从两个表中进行选择的视图,并在您想要合并两个表的任何查询中使用该视图。然后,您可以在第二个表中投入更多硬件(我正在考虑在更多主轴上建立一个单独的文件组),然后在可用的维护窗口期间将每周延迟的数据迁移到第一个表中。
I wouldn't advocate replication to solve a performance issue (unless it's a problem of physical data distribution); if anything it's going to slow your system down as the changes are propagated to their destination. If you're using a single server, I'd suggest adding a second table with the same schema as the first, but with your indexes optimised for the kind of work you do in your processing phase. Then create a view that selects from both tables, and use that view in any query where you want the union of both tables. You could then throw more hardware at the second table (I'm thinking of a separate file group over more spindles) and then migrate the data on a weekly delay into the first table, during an available maintenance window.