MySQL: using a hash function to detect data changes in part of a table
I need to generate a single hash over some of the data in a table:
CREATE TABLE Table1
(
F1 INT UNSIGNED NOT NULL AUTO_INCREMENT,
F2 INT default NULL,
F3 Varchar(50) default NULL,
..
FN INT default NULL,
PRIMARY KEY (F1)
);
That is, over columns F1, F3 and FN for the rows where F2=10.
SELECT md5(CONCAT_WS('#',F1,F3,FN)) FROM Table1 WHERE F2=10
gives a hash for each row in the table.
QUESTIONS
1) How do I get a single hash over the whole table?
2) Which hashing algorithm is the fastest to use: MD5, SHA1, SHA or something else?
EDIT:
MySQL 4.1 is being used, and it does NOT have trigger support.
Comments (6)
1)
2) Speed doesn't really matter, since the function only has to run once and all of these hash functions are fast enough.
As for speed, you should measure it. It depends on the way the functions are implemented.
Chances are, however, that you will see very little difference in speed. The hash functions you cite are all faster than what an average disk can spew out, so the question is not really "which hash function will make the code run fastest?" but "which hash function will leave the CPU most idle while it waits for data from the disk?".
On my Intel Core2 Q6600, clocked at 2.4 GHz (64-bit mode), with my own C implementation of hash functions, I get the following hashing speeds:
That's using a single core only. My hard disks top out at about 100 MB/s, so one can say that even with SHA-256, the hashing process will use no more than 17% of the machine's CPU power. Of course, nothing guarantees that the implementation used by MySQL is that fast, which is why you should measure it. Also, in 32-bit mode, SHA-512 performance decreases quite a bit.
Cryptographically speaking, (grave) weaknesses have been found in MD5 and SHA-1, so if you work in a security-relevant setting (i.e. you want to detect changes even if there is someone who can choose some of the changes and would prefer that you do not detect said changes), you should stick to SHA-256 or SHA-512, which, as far as we know, are robust enough. MD5 and SHA-1 are still fine in non-security situations, though.
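One way to measure the functions inside MySQL itself is the built-in BENCHMARK() function, which evaluates an expression a given number of times. The snippet below is only a rough sketch: the sample string and the iteration count are arbitrary choices, and the timings say nothing about disk I/O.

```sql
-- BENCHMARK(count, expr) evaluates expr count times and always returns 0;
-- the figure to compare is the elapsed time the client prints for each statement.
SELECT BENCHMARK(1000000, MD5('1#some sample text#42'));
SELECT BENCHMARK(1000000, SHA1('1#some sample text#42'));
SELECT BENCHMARK(1000000, CRC32('1#some sample text#42'));
```

SHA2() is left out because it only appeared in MySQL 5.5, so it is not an option on 4.1.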
I would use a MySQL Trigger to detect changes on insert, delete, update, etc.
Although this thread is old, maybe this is what you need:
http://dev.mysql.com/doc/refman/5.0/en/checksum-table.html
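For reference, a minimal sketch of that statement against the table from the question. Keep in mind that CHECKSUM TABLE covers the whole table, every row and every column, so it cannot be limited to F1, F3, FN or to the rows where F2=10.

```sql
-- Returns one row with the table name and a single checksum value.
CHECKSUM TABLE Table1;

-- EXTENDED forces the whole table to be read row by row to compute the checksum.
CHECKSUM TABLE Table1 EXTENDED;
```

Storing the value and comparing it on the next run tells you that something in the table changed, but not what.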
See BIT_XOR:
http://dev.mysql.com/doc/refman/5.6/en/group-by-functions.html
"Returns the bitwise XOR of all bits in expr. The calculation is performed with 64-bit (BIGINT) precision. This function returns 0 if there were no matching rows."
For an example of usage, check pt-table-sync.
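A rough sketch of how BIT_XOR could fold the per-row hashes from the question into a single value. The aliases row_count and rows_hash are made up for illustration, and keeping only the first 64 bits of each MD5 is an assumption made so that the value fits the 64-bit precision BIT_XOR works with.

```sql
-- LEFT(MD5(...), 16) keeps the first 16 hex digits (64 bits) of each row hash,
-- CONV(..., 16, 10) rewrites them as a decimal string, and CAST turns that
-- string into an unsigned integer that BIT_XOR can aggregate.
SELECT COUNT(*) AS row_count,
       BIT_XOR(CAST(CONV(LEFT(MD5(CONCAT_WS('#', F1, F3, FN)), 16), 16, 10) AS UNSIGNED)) AS rows_hash
FROM Table1
WHERE F2 = 10;
```

XOR does not depend on row order, so no ORDER BY is needed. Returning COUNT(*) alongside helps, because BIT_XOR yields 0 when no rows match. One caveat: CONCAT_WS skips NULL arguments, so rows that differ only in which of F3 or FN is NULL can produce the same string; wrapping nullable columns in IFNULL() with a marker value is one way around that.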
If for any reason you can't use triggers, a different approach is to concatenate the rows with GROUP_CONCAT, like:
But be aware that if the table has a lot of data, the query will be slow! If possible, try to exclude unnecessary columns from the concatenation.
Also note that by default the maximum length of a GROUP_CONCAT result (group_concat_max_len) is 1024, so you may need to change it by first running the following query:
Note that 18446744073709547520 is the maximum value; you could use a different one!
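As an illustration only, the concatenation approach could look something like the sketch below. The limit of 1000000, the '|' separator and the alias rows_hash are arbitrary choices, not values taken from this answer.

```sql
-- Raise the limit for the current session before building the long string.
SET SESSION group_concat_max_len = 1000000;

-- ORDER BY F1 fixes the concatenation order, so the same data always
-- produces the same string and therefore the same MD5.
SELECT MD5(GROUP_CONCAT(CONCAT_WS('#', F1, F3, FN) ORDER BY F1 SEPARATOR '|')) AS rows_hash
FROM Table1
WHERE F2 = 10;
```

If the concatenated string is longer than group_concat_max_len, MySQL truncates it and raises a warning, so the hash would then cover only part of the rows. The effective length is also bounded by max_allowed_packet.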