更好的 SQLite 损坏检测

发布于 2024-12-28 09:45:21 字数 2060 浏览 2 评论 0原文

首先,一些背景知识:

我的 Android 应用程序有一个包含大量四列行的数据库表。它向服务器发送请求,并且仅当这四个值全部“有效”时服务器才会响应。数千名用户中的一些人报告说有些东西对他们不起作用(有一段时间他们没有从服务器获得结果) - 我试图找出导致问题的原因,结果发现唯一可能的原因是未检测到的数据库损坏。

在 ACRA 日志中,我收到一些带有 SQL 错误的消息,但这些消息是关于应用程序由于文件已损坏而无法打开文件的。这给了我一些线索,但我仍然不相信这就是问题所在。因此,我创建了一个非常简单的 Python 脚本,它更改 DB 文件中的随机字节并检查 SQLite 将如何处理该问题:

import random
import array
import sqlite3

db = array.array('B')
db.fromstring(open('db').read())

ta =  [x for x in sqlite3.connect('db').execute('SELECT * FROM table ORDER BY _id')]

results = [0,0,0,0]
tries = 1000

for i in xrange(0,tries):
    work = db[:]
    while work == db: 
        for j in xrange(0,random.randint(1,5)):
            work[random.randint(1,len(db))-1] = random.randint(0,255)

    work.tofile(open('outdb','w'))

    try:
        c = sqlite3.connect('outdb')
        results[0] += 1

        for r in c.execute('PRAGMA integrity_check;'):
        results[1] += 1 if (r[0] == 'ok') else 0 
    except:
        continue    

    try:
        results[3] += 1 if [x for x in c.execute('SELECT * FROM table ORDER BY _id')] != ta else 0
        results[2] += 1
    except:
        c.close()
        continue

print 'Results for '+str(tries)+' tests:'
print 'Creating connection failed '+str(tries-results[0])+ ' times'
print 'Integrity check failed '+str(results[0]-results[1])+ ' times'
print 'Running a SELECT * query failed '+str(results[1]-results[2])+ ' times'
print 'Data was succesfully altered '+str(results[3])+ ' times'

结果表明,以这种方式“编辑”表数据是完全可能的:

Results for 1000 tests:
Creating connection failed 0 times
Integrity check failed 503 times
Running a SELECT * query failed 289 times
Data was succesfully altered 193 times

看到运行对于完整性检查未检测到的一半修改,查询失败,但对我来说最有趣的是,某些内容可能会交换数据库中的随机字节,从而使我的应用程序对部分用户毫无用处。

我在 SQLite 网站和 StackOverflow 上读到了有关损坏的可能原因,我知道例如强制关闭应用程序可能会对数据库造成损害。我只是想知道是否可以实现快速且更强大的数据库完整性检查。

我在启动时从整个表的一列中读取数据(用于自动完成),所以我想到从所有值中计算一些哈希 - 我认为这会很好用,因为某些哈希函数是专门为执行而设计的完整性检查,但也许有一个更简单、更快和更好的解决方案 - 因此我问你,如果你知道的话。

At first, some background:

My Android app has DB table with a lot of four-column rows. It sends requests to the server and server responds only when all of these four values are "valid". A few of the thousands users reported that something doesn't work for them (since awhile they are not getting the results from the server) - I was trying to figure out what's causing the problem and it turned out that the only possible cause is the DB corruption that's not being detected.

In ACRA logs I've got some messages with SQL errors, but these were about application not being able to open the file because of it being corrupt. That gave me some clue, but I was still not convinced that this is the issue. So, I created a very simple Python script which changes random bytes in the DB file and checks how SQLite will deal with that:

import random
import array
import sqlite3

db = array.array('B')
db.fromstring(open('db').read())

ta =  [x for x in sqlite3.connect('db').execute('SELECT * FROM table ORDER BY _id')]

results = [0,0,0,0]
tries = 1000

for i in xrange(0,tries):
    work = db[:]
    while work == db: 
        for j in xrange(0,random.randint(1,5)):
            work[random.randint(1,len(db))-1] = random.randint(0,255)

    work.tofile(open('outdb','w'))

    try:
        c = sqlite3.connect('outdb')
        results[0] += 1

        for r in c.execute('PRAGMA integrity_check;'):
        results[1] += 1 if (r[0] == 'ok') else 0 
    except:
        continue    

    try:
        results[3] += 1 if [x for x in c.execute('SELECT * FROM table ORDER BY _id')] != ta else 0
        results[2] += 1
    except:
        c.close()
        continue

print 'Results for '+str(tries)+' tests:'
print 'Creating connection failed '+str(tries-results[0])+ ' times'
print 'Integrity check failed '+str(results[0]-results[1])+ ' times'
print 'Running a SELECT * query failed '+str(results[1]-results[2])+ ' times'
print 'Data was succesfully altered '+str(results[3])+ ' times'

The results showed that "editing" table data in this way is entirely possible:

Results for 1000 tests:
Creating connection failed 0 times
Integrity check failed 503 times
Running a SELECT * query failed 289 times
Data was succesfully altered 193 times

It's generally interesting to see that running a query failed for half of the modifications that went undetected by integrity check, but the most interesting thing for me is that something may swap random bytes in my DB rendering my application useless for a part of my users.

I've read about possible causes of corruption on SQLite website and also on StackOverflow, I know that e.g. forcing application to close may do a harm to the DB. I'd just like to know if it's possible to implement a fast and more robust DB integrity check.

I'm reading the data from a one column of the whole table at startup (for autocompletion), so I thought of calculating some hash from the all values - I think this would work quite good, since some hash function are designed just for doing integrity checks, but maybe there's a simpler, faster and better solution - I'm thus asking you, if you know any.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

人│生佛魔见 2025-01-04 09:45:21

我不知道有这样的 SQLite 功能,所以我想说计算哈希是最简单的解决方案,请查看 MessageDigest 类作为开始。

I don't know of any SQLite feature like this, so I'd say that calculating a hash is the simplest solution, take a look at the MessageDigest class for a start.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文