Why does my Solr slave index keep growing?

Posted 2024-09-08 06:25:13

I have a 5-core solr 1.4 master that is replicated to another 5-core solr using solr replication as described here. All writes are done against the master and replicated to the slave intermittently. This is done using the following sequence:

Commit on each master core
Replicate on each slave core
Optimize on each slave core
Commit on each slave core

The problem I am having is that the slave seems to be keeping around old index files and taking up ever more disk space. For example, after 3 replications, the master core data directory looks like this:

$ du -sh *
145M    index

But the data directory on the slave of the same core looks like this:

$ du -sh *
300M    index
144M    index.20100621042048
145M    index.20100629035801
4.0K    index.properties
4.0K    replication.properties

Here's the contents of index.properties:

#index properties
#Tue Jun 29 15:58:13 CDT 2010
index=index.20100629035801
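
As a rough aid for reading this listing, the sketch below reports which index directories are not the one named in index.properties. It assumes, as the file's contents suggest, that the `index` key names the directory the core currently serves from and that a missing file means the plain `index` directory is in use. The function names are hypothetical, and the script only reports; it deletes nothing.

```python
from pathlib import Path


def active_index_dir(data_dir):
    """Return the directory named by the `index` key in index.properties.

    Assumption: when the file is absent, the plain `index` directory is active.
    """
    props = Path(data_dir) / "index.properties"
    if props.is_file():
        for line in props.read_text().splitlines():
            line = line.strip()
            # Skip comment lines such as "#index properties" and malformed lines.
            if line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            if key.strip() == "index":
                return value.strip()
    return "index"


def unreferenced_index_dirs(data_dir):
    """List index directories under data_dir other than the active one."""
    active = active_index_dir(data_dir)
    return sorted(
        p.name
        for p in Path(data_dir).iterdir()
        if p.is_dir()
        and (p.name == "index" or p.name.startswith("index."))
        and p.name != active
    )
```

Run against the slave data directory above, this would report `index` and `index.20100621042048` as the directories that index.properties does not point to.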

And replication.properties:

#Replication details
#Tue Jun 29 15:58:13 CDT 2010
replicationFailedAtList=1277155032914
previousCycleTimeInSeconds=12
timesFailed=1
indexReplicatedAtList=1277845093709,1277155253911,1277155032914
indexReplicatedAt=1277845093709
replicationFailedAt=1277155032914
lastCycleBytesDownloaded=150616512
timesIndexReplicated=3
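
The timestamps in this file look like milliseconds since the Unix epoch. A minimal sketch for turning them into readable dates, assuming that interpretation holds (the helper names are hypothetical):

```python
from datetime import datetime, timezone


def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def millis_to_utc(value):
    """Convert an epoch-milliseconds string to a UTC datetime (whole seconds)."""
    return datetime.fromtimestamp(int(value) // 1000, tz=timezone.utc)


def replication_times(text):
    """Return the replication timestamps from the file as UTC datetimes."""
    props = parse_properties(text)
    stamps = props.get("indexReplicatedAtList", "")
    return [millis_to_utc(s) for s in stamps.split(",") if s]
```

Under that assumption, `indexReplicatedAt=1277845093709` corresponds to 2010-06-29 20:58:13 UTC, which agrees with the 15:58:13 CDT timestamp in the file header.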

The solrconfig.xml for this slave contains the default deletion policy:

[...]
<mainIndex>
    <unlockOnStartup>false</unlockOnStartup>
    <reopenReaders>true</reopenReaders>
    <deletionPolicy class="solr.SolrDeletionPolicy">
        <str name="maxCommitsToKeep">1</str>
        <str name="maxOptimizedCommitsToKeep">0</str>
    </deletionPolicy>
</mainIndex>
[...]

What am I missing?


小耗子 2024-09-15 06:25:14

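It is not useful to commit and optimize on the slaves. Since all write operations are done on the master, that is the only place where those operations should occur.

This may be the cause of the problem: because you do an additional commit and optimize on the slaves, more commit points are kept there. But this is only a guess; it would be easier to understand what is happening with the full solrconfig.xml from both the master and the slaves.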


孤单情人 2024-09-15 06:25:14

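The optimize done on the slave is what doubles the size of the index. During an optimize, separate index segments are created so that the original index is rewritten into the number of segments specified for the optimize (the default is 1).
Best practice is to optimize only once in a while (from a cron job or similar) rather than on every event, and to optimize only on the master, not on the slave. The slaves will get the new segments through replication.
You should not commit on the slave; the index reload that follows replication takes care of making new documents available there.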
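
The sequence this suggests (commit on the master, optimize there only occasionally, then have the slave pull the index) could be sketched as follows. The base URLs and core names are hypothetical. The paths assume the usual multi-core layout, with the update handler at `/update` and the replication handler registered at `/replication` in solrconfig.xml; adjust them if your configuration differs.

```python
import urllib.request


def master_update_request(master_base, core, optimize=False):
    """Build (url, body) for a commit, or an occasional optimize, on a master core."""
    body = b"<optimize/>" if optimize else b"<commit/>"
    return f"{master_base}/{core}/update", body


def slave_fetch_url(slave_base, core):
    """Build the URL that asks a slave core to pull the index from its master."""
    return f"{slave_base}/{core}/replication?command=fetchindex"


def replicate(master_base, slave_base, cores, optimize=False):
    """Commit (or optimize) each master core, then trigger a fetch on the slave.

    No commit or optimize is ever sent to the slave.
    """
    for core in cores:
        url, body = master_update_request(master_base, core, optimize)
        req = urllib.request.Request(
            url, data=body, headers={"Content-Type": "text/xml"}
        )
        urllib.request.urlopen(req).close()
        # fetchindex only starts the pull; it may finish after this call returns.
        urllib.request.urlopen(slave_fetch_url(slave_base, core)).close()
```

For example, `replicate("http://master:8983/solr", "http://slave:8983/solr", ["core0", "core1"])` would run the routine cycle, with `optimize=True` reserved for the occasional scheduled run.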
