Reverse engineering data from Lucene/Solr indexes
I am investigating whether it is feasible to deploy search servers to the cloud, and one of my questions concerns data security. Currently all of our fields (except a few used for faceting) are indexed and not stored. The one exception is the ID, which we store so that we can retrieve the document after the search has completed.
If for some reason the servers in the cloud were compromised, would it be possible for the attacker to reverse engineer our data from the indexes, even though the fields are not stored?
Comments (2)
It depends on the security level you need and the sensitivity of the document content.
With the configuration you describe, it wouldn't be possible to rebuild the original document as an exact "clone". BUT it would be possible to reverse enough information out of the index to learn a lot about the content, and depending on the context this could be damaging.
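To make the point about recovering information from indexed-but-not-stored fields concrete, here is a minimal Python sketch of a toy inverted index. It is not Lucene's actual on-disk format, and the names `build_index`, `reconstruct` and `terms_only` as well as the sample document are hypothetical, chosen only for illustration. The sketch assumes the index keeps term positions; whether a real index does depends on the field configuration.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> {doc_id: [positions]}.

    The original text is never stored, only the postings.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(text.lower().split()):
            index[token].setdefault(doc_id, []).append(pos)
    return index


def reconstruct(index, doc_id):
    """Rebuild the token stream of one document from the postings alone."""
    slots = {}
    for term, postings in index.items():
        for pos in postings.get(doc_id, []):
            slots[pos] = term
    return " ".join(slots[p] for p in sorted(slots))


def terms_only(index, doc_id):
    """Without positions, the vocabulary of a document is still visible."""
    return sorted(term for term, postings in index.items() if doc_id in postings)


# Hypothetical sample data, for illustration only.
docs = {"doc1": "Patient John Smith diagnosed with diabetes"}
index = build_index(docs)

print(reconstruct(index, "doc1"))
print(terms_only(index, "doc1"))
```

In this toy model the lowercased token sequence comes back in full, so only casing and punctuation are lost. Even when positions are not available, the list of terms per document can already reveal names and topics. Real analyzers may also stem or drop stop words, which would reduce what can be recovered but not eliminate it.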
An important point:
If you use the cloud-based servers to build the index and they get compromised, then depending on your configuration there would be no need for "reversing" at all. At least for any document you index after the compromise, the attacker can simply read it, because the document is sent to the server as-is in order to build the index (for example when using http://wiki.apache.org/solr/ExtractingRequestHandler).
As Yahia says, it's possible to get some information. If you're really concerned about this, use an encrypted file system, as Amazon suggests.