Map a value to another value and back
Imagine a value, say '1234'. I want to map that value to another value, say 'abcd'. The constraints:
- The length of the target value is equal to the start value
- The mapping should be unique. E.g. 1234 should only map to abcd and vice versa
- The mapping process should be (very) difficult to guess. E.g. multiplying by 2 does not count
- The mapping should be reversible
- The start value is an integer
- The target value can be of any type
This should be a basic algorithm, eventually I'll write it in Ruby but that is of no concern here.
I was thinking along the following lines:
SECRET = 1234

def to(int)
  SECRET + int * 2
end

def fro(int)
  (int - SECRET) / 2
end
Obviously this violates constraints 1 and 3.
The eventual goal is to anonymize records in my database. I might be overthinking this.
Comments (2)
First off, I rather think your objectives are too ambitious: why constraint 6?
Second, what you need is technically a bijection from the domain of integers.
Third, your constraint 3 goes against Kerckhoffs's principle. You'd be better off with a well-known algorithm governed by a secret key, where the secret key is hard to derive even if you know the results for a large set of integers.
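To make the "bijection governed by a secret key" idea concrete, here is a minimal sketch, not a vetted cipher. It assumes a small Feistel network with an HMAC-SHA256 round function, plus cycle-walking so the result stays inside the range of `digits`-digit numbers. The names `FEISTEL_KEY`, `FEISTEL_ROUNDS`, `scramble` and `unscramble` are hypothetical and chosen only for illustration.

```ruby
require 'openssl'

# Hypothetical secret; in a real system this would come from a key store.
FEISTEL_KEY = 'replace-with-a-real-secret'
FEISTEL_ROUNDS = 4

# Keyed round function: HMAC of the half-block, truncated to `bits` bits.
def round_value(half, round, bits)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new('SHA256'),
                                FEISTEL_KEY, "#{round}:#{half}")
  digest.unpack1('Q>') & ((1 << bits) - 1)
end

# One pass of the Feistel permutation over 2 * bits bits.
def feistel(value, bits)
  left = value >> bits
  right = value & ((1 << bits) - 1)
  FEISTEL_ROUNDS.times do |r|
    left, right = right, left ^ round_value(right, r, bits)
  end
  (left << bits) | right
end

# Exact inverse of feistel: undo the rounds in reverse order.
def feistel_inverse(value, bits)
  left = value >> bits
  right = value & ((1 << bits) - 1)
  (FEISTEL_ROUNDS - 1).downto(0) do |r|
    left, right = right ^ round_value(left, r, bits), left
  end
  (left << bits) | right
end

# Repeat the permutation until the value lands back inside 0...10**digits
# (cycle-walking), which keeps the mapping a bijection on that range.
def walk(int, digits)
  limit = 10**digits
  raise ArgumentError, 'value out of range' unless (0...limit).cover?(int)

  bits = ((limit - 1).bit_length + 1) / 2
  value = int
  loop do
    value = yield(value, bits)
    break if value < limit
  end
  value
end

def scramble(int, digits)
  walk(int, digits) { |v, bits| feistel(v, bits) }
end

def unscramble(int, digits)
  walk(int, digits) { |v, bits| feistel_inverse(v, bits) }
end
```

Calling `scramble(1234, 4).to_s.rjust(4, '0')` gives a four-character string, and `unscramble` with the same key and digit count reverses it. For anything beyond experimentation, a standardized format-preserving encryption mode (for example NIST FF1) would be a safer choice than this hand-rolled sketch.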
Fourth, what are you anonymizing against? If you are dealing with personal information, how will you protect against statistical analysis revealing that Xyzzy is actually John Doe, based on the relations to other data? There's some research on countering such attack vectors (google for e.g. 'k-anonymization').
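As a rough illustration of the re-identification risk, the sketch below measures how small the smallest group of records sharing the same quasi-identifiers is (the "k" in k-anonymity). The records, the field names `zip` and `birth_year`, and the helper `k_anonymity` are all made up for this example.

```ruby
# Hypothetical records: the id is already scrambled, but zip and
# birth_year remain and can be cross-referenced with other data.
SAMPLE_RECORDS = [
  { id: 'Xyzzy', zip: '12345', birth_year: 1980 },
  { id: 'Plugh', zip: '12345', birth_year: 1975 },
  { id: 'Plover', zip: '12345', birth_year: 1975 }
].freeze

# Size of the smallest group of records that share the same values for
# the given quasi-identifier fields. A result of 1 means at least one
# record is unique on those fields.
def k_anonymity(records, quasi_identifiers)
  records.group_by { |r| r.values_at(*quasi_identifiers) }
         .values.map(&:size).min
end

k = k_anonymity(SAMPLE_RECORDS, %i[zip birth_year])
puts "k = #{k}"
```

Here the record with `birth_year` 1980 is alone in its group, so anyone who knows John Doe's zip code and birth year can single it out regardless of how the id was mapped.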
Fifth, use existing cryptographic primitives rather than trying to invent your own. Encryption algorithms exist (e.g. AES in cipher-block-chaining mode) that are well-tested -- AES is well supported by all modern platforms, presumably Ruby as well. However, encryption still doesn't give records anonymity in any strong sense.
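For reference, a minimal sketch of AES in CBC mode using Ruby's standard `openssl` library might look like the following. The helper names `aes_encrypt` and `aes_decrypt` are hypothetical, and the key is generated on the spot only to keep the example self-contained.

```ruby
require 'openssl'

# Encrypts plaintext with AES-256-CBC and prepends the random IV so the
# result carries everything except the key needed for decryption.
def aes_encrypt(plaintext, key)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv
  iv + cipher.update(plaintext) + cipher.final
end

# Splits off the 16-byte IV and decrypts the remainder.
def aes_decrypt(blob, key)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.decrypt
  cipher.key = key
  cipher.iv = blob.byteslice(0, 16)
  cipher.update(blob.byteslice(16, blob.bytesize - 16)) + cipher.final
end

key = OpenSSL::Random.random_bytes(32) # assumption: a real key is stored safely
blob = aes_encrypt('1234', key)
puts aes_decrypt(blob, key)
```

Note that the ciphertext is longer than the input (IV plus one padded block), so this approach does not satisfy the equal-length constraint from the question.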
It might be worth giving a little more detail on what you're trying to achieve. Presumably you're worried about some evil person getting hold of your data, but isn't it equally possible that this evil person will also have access to the code that accesses your database? What's to stop them from learning the algorithm by inspecting your code?
If you truly want to anonymize the data, that is generally a one-way operation (names are removed, credit card values are removed, etc.). If you're trying to encrypt the contents of the database, many database engines provide well-tested mechanisms to do this. For example:
- Best practices for dealing with encrypted data in MSSQL
- Database encryption
It's always better to use a product's encryption mechanism than to roll your own.
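The one-way kind of anonymization described above can be as simple as dropping the sensitive fields before the data leaves the system. A minimal sketch, assuming records are plain hashes and using made-up field names:

```ruby
# Hypothetical list of fields that must never leave the system.
SENSITIVE_FIELDS = %i[name credit_card].freeze

# Returns a copy of the record without the sensitive fields.
# There is no inverse: the removed values are simply gone.
def anonymize(record)
  record.reject { |field, _| SENSITIVE_FIELDS.include?(field) }
end

record = { id: 1234, name: 'John Doe', credit_card: '4111111111111111', city: 'Springfield' }
p anonymize(record)
```

Because nothing is kept that could be reversed, there is no key or algorithm for an attacker to recover from the code.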