辅助副本中的特定数据库的日志备份作业始终在 SQL 上失败
辅助副本中的特定数据库的日志备份作业始终在 SQL 上失败,并出现以下错误。
错误 :
辅助副本上数据库的日志备份失败,因为 主数据库的最后一个备份 LSN 大于当前 本地重做LSN
log backup job getting failed for particular DBs in secondary replica always on SQL with below error.
Error :
Log backup for databases on a secondary replica failed because the
last backup LSN from the primary database is greater than the current
local redo LSN
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
根据上面的评论,这听起来像是与在拓扑中的多个副本上进行日志备份有关的竞争条件。例如,如果您在包括 LSN 1-15(含)的主副本上进行日志备份,并且仅通过 LSN 12 重做辅助副本,则在辅助副本上备份日志的任何尝试都没有足够的新数据来完成。
也就是说,有一个简单的修复方法 - 设置备份首选项。这本身并不能解决问题,因为您使用的任何备份过程都需要遵守这些设置(默认的备份日志命令不会;备份数据库也不会) ,但这不在本次讨论的范围内)。这也是一个简单的修复 - 使用 Ola Hallengren 的备份程序。它确实遵守这些设置。使其明确,因为它确实遵循这些设置,所以您可以始终在所有副本上启用作业。
另一条建议 - 我只会在主节点上运行日志备份。为什么?假设由于某种原因您的副本明显落后。它只能备份其拥有的 LSN 的日志。即使在最好的情况下,这也会阻止主数据库中的日志被清除。但这也会危及您的目标 RPO。如果灾难在副本落后时发生,您所能得到的最接近的结果就是执行备份的副本距离当前的距离。想象一下这样的对话 - “是的,老板......我知道我们每五分钟进行一次日志备份,但我能做的最佳恢复是从两个小时前开始”。您的主副本始终是最新的,因此不会发生滞后。
Based on comments from above, this sounds like a race condition having to do with taking log backups on multiple replicas in your topology. For example, if you take a log backup on your primary replica that includes LSNs 1-15 (inclusive) and the secondary replica is redone only through LSN 12, any attempt to backup the log on the secondary replica doesn't have sufficiently new data to complete.
That said, there is an simple fix - set backup preferences. That by itself doesn't fix the issue as whatever backup procedure you're using needs to honor those settings (the default
backup log
command does not;backup database
also does not, but that's not in scope for this discussion). This too is a simple fix - use Ola Hallengren's backup procedure. It does obey those settings. Making it explicit, because it does obey those settings you can leave the jobs enabled on all replicas at all times.One other piece of advice - I'd run your log backups on the primary node only. Why? Let's say that there is some reason that your replica is significantly far behind. It can only back up log for LSNs that it has. Even in the best case scenario, this prevents log from clearing at the primary. But it also jeopardizes your target RPO. If disaster were to strike while the replica was behind, the closest you could get would be however far the replica doing the backups was to current. Imagine this conversation - "yeah boss... I know that we take log backups every five minutes, but the best recovery I can do is from two hours ago". Your primary replica is always current and so that lag doesn't happen.