Recovering database rows from an uncommitted transaction

Published 10-07 19:16

We have a database that was written to by a program written in Python that uses the sqlite3 module. The database had a large number of insert statements executed on it, but the transaction was never ended by a commit.

The result is that we have two files:

     Size             Time       Name
855117824 2010-12-14 15:27 db
  1665240 2010-12-14 15:27 db-journal

The database file is large, but most of the data is uncommitted, so when we select from the database we only get a few rows. When we execute the SQL command 'VACUUM', the database shrinks to about 3MB.

Is there any way to get the data back?


Comments (1)

拥醉 2024-10-14 19:16:05

I performed a bit of testing using the sqlite3 shell program.

Assuming that the sqlite3 Python module behaves in the same manner, there does not seem to be a way to reliably recover uncommitted transactions.

For a relatively small number of statements, uncommitted transactions seem to remain entirely in application memory, and no data is written to the filesystem. Those insertions are completely lost once the DB connection is closed or the application terminates.
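A minimal sketch of the behaviour described above, using Python's sqlite3 module (the file path is hypothetical): rows inserted inside a transaction that is never committed are rolled back when the connection is closed.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()                                  # the schema is committed and survives

# The module opens an implicit transaction before the INSERTs.
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
# no conn.commit() here -- mimics the misbehaving program
conn.close()                                   # the pending transaction is discarded

conn = sqlite3.connect(path)
rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
conn.close()
print(rows)  # 0 -- the uncommitted rows are gone
```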

For larger transaction blocks, data is written to the filesystem, but it is cleaned up once the DB connection is closed or (if the application crashed) on the next open of the DB. In simple terms, new DB pages get allocated for the uncommitted transaction, but if the transaction is never committed they are treated as free space, which is why VACUUM reduces the DB size. These pages will be overwritten (and their data lost) at the next write to the DB file; if they sit at the end of the DB file, the file simply gets truncated during clean-up.
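You can see free pages accumulate by reading the freelist counters in the 100-byte SQLite file header, a sketch under the documented file format: bytes 32-35 hold the page number of the first freelist trunk page and bytes 36-39 the total number of freelist pages, both big-endian 32-bit integers. Here the pages are freed by a DELETE, but pages abandoned by a rolled-back transaction end up on the same list.

```python
import os
import sqlite3
import struct
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

conn = sqlite3.connect(path)
conn.execute("CREATE TABLE t (x BLOB)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [(b"x" * 1024,) for _ in range(500)])
conn.commit()
conn.execute("DELETE FROM t")                  # freed pages join the freelist
conn.commit()
conn.close()

with open(path, "rb") as f:
    header = f.read(100)
# big-endian: first freelist trunk page, total freelist page count
first_trunk, n_free = struct.unpack(">II", header[32:40])
print(n_free)  # non-zero: free pages waiting to be reused or vacuumed away
```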

You might be able to recover some data from the last uncommitted transaction that was performed, as long as no other write transaction was performed afterwards. From the way your question is phrased, it sounds as if the whole DB was created and filled in a single program run and a single transaction (although VACUUM would not produce such a large file then). In that case things could be a bit easier.

It heavily depends on how the misbehaving program was terminated. If you allowed it to terminate gracefully, it may have had the time to clean up, which in this case is not desirable. Since you have the DB journal, I will assume that it had a more violent end.
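One quick check along these lines, sketched from the documented rollback-journal format: a journal whose header still starts with the magic sequence has not been invalidated, which is consistent with the program dying mid-transaction instead of cleaning up after itself. The file names below are hypothetical.

```python
import os
import tempfile

# Magic sequence from the SQLite file-format documentation for
# rollback journal headers.
JOURNAL_MAGIC = bytes([0xD9, 0xD5, 0x05, 0xF9, 0x20, 0xA1, 0x63, 0xD7])

def looks_hot(journal_path):
    """Return True if the journal header has not been zeroed out."""
    with open(journal_path, "rb") as f:
        return f.read(8) == JOURNAL_MAGIC

# demo on hand-made files
d = tempfile.mkdtemp()
hot = os.path.join(d, "db-journal")
with open(hot, "wb") as f:
    f.write(JOURNAL_MAGIC + b"\x00" * 20)      # header magic intact

cold = os.path.join(d, "cold-journal")
with open(cold, "wb") as f:
    f.write(b"\x00" * 28)                      # zeroed header: already committed

print(looks_hot(hot), looks_hot(cold))  # True False
```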

In any case, you would have to at least delve into the sqlite3 DB file format and modify the library code to parse the uncommitted data. You will still lose those parts of the transaction that remained in the application memory.

If there were free pages (e.g. from DELETE statements) in the DB file, there might also be fragments of older transactions, although interpreting those fragments is another story.
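In that forensic spirit, a crude sketch (not a recovery tool) is to scan the raw bytes of the DB file for runs of printable ASCII, which may include remnants of cell payloads sitting in free or orphaned pages; interpreting what comes back is entirely up to you. The file and its contents below are hand-made for the demo.

```python
import os
import re
import tempfile

def scan_fragments(path, min_len=8):
    """Return printable-ASCII runs of at least min_len bytes found in the file."""
    with open(path, "rb") as f:
        data = f.read()
    pattern = rb"[ -~]{%d,}" % min_len         # space..tilde, min_len or more
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]

# demo on a hand-made file (hypothetical path)
demo = os.path.join(tempfile.mkdtemp(), "raw.bin")
with open(demo, "wb") as f:
    f.write(b"\x00\x01uncommitted row payload\x00\xffjunk")
print(scan_fragments(demo))  # ['uncommitted row payload']
```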

In my opinion, the whole operation strays far too close to (if it does not outright enter) the fields of computer forensics and data recovery, along with all the relevant issues. Unless you have really important data that you cannot get any other way, I doubt it will be easy enough to be worth the trouble.
