Running Solr in memory?

Posted 2024-12-11 22:03:29

One night I was drinking a pint at a local bar and struck up a conversation with the chap next to me. As it turned out, not only was he a fellow developer, but he also used Solr a lot. As we got to talking about how awesome Solr was, he mentioned something that I've never been able to figure out. He said, "The way to make Solr really perform is to have it run in memory."

Alas, I did not get his name and despite googling for an answer, I've never really found anything concrete. What do you think he meant by this?

Comments (4)

南…巷孤猫 2024-12-18 22:03:29

For anyone looking to do this for the purpose of speeding up tests:

If you have a separate core for your test index, you can change the directoryFactory attribute in solrconfig.xml to:

<directoryFactory name="DirectoryFactory" class="solr.RAMDirectoryFactory"/>

Needless to say, it's not a good idea to keep any production data in memory only.

仲春光 2024-12-18 22:03:29

Running Solr in memory is really pointless. Solr is meant to be a web server that clients query through a REST-like API. You can set up replication to compensate for high traffic. Solr wraps Lucene, so if you want to run Solr in memory you are basically running Lucene in memory. I would suggest just starting an instance of Lucene and keeping its index in memory. I am curious to know what others think, but running Solr in memory is really not the intended use. Lucene has an in-memory Directory implementation called RAMDirectory.

There was also a similar question a while back about running Solr in embedded mode, but after a while Apache came to discourage the embedded server because that was really what Lucene was for.

惟欲睡 2024-12-18 22:03:29

It is possible that he meant having enough disk cache to hold the entire index. That is a widely recommended way of ensuring fast small random IO-reads (and bulk writes when indexing), which is essential for good Solr performance: https://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache

For smaller indexes, where the extra RAM cost is relatively low, it is fine advice. As the indexes grow, it is probably better to invest more time in scale-testing and experimenting with other hardware setups, SSDs being an obvious possibility.
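
As a rough way to judge whether an index could fit in the OS disk cache at all, the sketch below sums the file sizes under an index directory and compares the total with a RAM budget. The function names, the directory path and the budget are hypothetical, and the check says nothing about what the OS currently has cached; it only compares sizes.

```python
import os

def index_size_bytes(index_dir):
    """Sum the sizes of all regular files under index_dir (recursively)."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total

def fits_in_cache(index_dir, cache_budget_bytes):
    """True if the whole index is no larger than the RAM set aside for disk cache."""
    return index_size_bytes(index_dir) <= cache_budget_bytes
```

For example, fits_in_cache("/path/to/core/data/index", 8 * 1024**3) would ask whether a (hypothetical) index directory fits in 8 GiB. The budget should be the RAM left over after the JVM heap and other processes, not the machine's total memory.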

莫多说 2024-12-18 22:03:29

While speed is a good reason to run a RAMDirectory, you'll eventually have to persist the directory to disk. You could probably write a simple wrapper that wraps a RAMDirectory and an FSDirectory and mirrors the calls to both. All queries would come out of the RAMDirectory, but changes would be applied to both.

But another very good reason to do this would be for encryption at rest. Encrypting data sucks if you actually want to use it because you have to pay the overhead of decrypting to query it. It's not practical to use encrypted data at rest, but if you decrypted the contents into memory and cached it then it would be very fast.
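
The mirroring idea above (reads from memory, writes to both memory and disk) can be sketched in a few lines. This is only an illustration of the write-through pattern with a hypothetical MirroredStore class; it is not Lucene's Directory API, which has many more operations (locking, sync, rename) that a real wrapper would have to mirror too.

```python
import os

class MirroredStore:
    """Hypothetical write-through store: reads come from RAM, writes go to RAM and disk."""

    def __init__(self, disk_dir):
        self.disk_dir = disk_dir
        os.makedirs(disk_dir, exist_ok=True)
        self._ram = {}
        # Warm the in-memory copy from whatever is already on disk.
        for name in os.listdir(disk_dir):
            path = os.path.join(disk_dir, name)
            if os.path.isfile(path):
                with open(path, "rb") as f:
                    self._ram[name] = f.read()

    def write(self, name, data):
        # Write the file first; the RAM copy is updated only if that succeeds.
        with open(os.path.join(self.disk_dir, name), "wb") as f:
            f.write(data)
        self._ram[name] = data

    def read(self, name):
        # Served from memory only; the disk is not touched on the read path.
        return self._ram[name]

    def delete(self, name):
        # Remove from disk and from memory so the two stay in step.
        os.remove(os.path.join(self.disk_dir, name))
        del self._ram[name]
```

A store opened on an existing directory starts with the persisted contents already in memory, which is the same shape as the decrypt-once-and-cache idea: pay the loading cost at startup, then serve every read from RAM.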
