EdgeNGramFilterFactory not working (not indexing?)
I am having trouble getting ngrams to work. Here's my schema.xml:
<fieldType name="text" class="solr.TextField" omitNorms="false">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" />
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.StandardFilterFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
</analyzer>
</fieldType>
My database has a bunch of entries with
"Elizabeth"
and
"Elizabeths"
When I try to query on "Elizabeth" I get only "Elizabeth" and not "Elizabeths".
The odd thing is, when I check the Solr admin, the Analysis page shows that the EdgeNGramFilterFactory is indeed available, and results in "Elizabeths" being expanded into
e el eli eliz eliza elizab elizabe elizabet elizabeth
It seems like the indexer isn't picking up on this. I have the same problem when I move the synonyms filter from the query block to the index block. That is to say, when I have the synonyms filter in the query block, it works, but when I put it in the index block, it has no effect.
I have restarted Sunspot and reindexed multiple times. No dice. Any ideas? How can I directly check the indexed words list?
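On the last question, one way to look at the indexed terms directly is Solr's TermsComponent. The snippet below is only a sketch that builds such a request URL. It assumes a `/terms` request handler is registered in solrconfig.xml, that Solr is reachable at `http://localhost:8982/solr`, and that the field is called `name_text`; the helper name `terms_url`, the base URL, and the field name are all placeholders rather than anything taken from the setup above.

```python
from urllib.parse import urlencode


def terms_url(base_url, field, prefix="", limit=50):
    """Build a TermsComponent URL that lists indexed terms for one field."""
    params = {
        "terms": "true",         # switch the TermsComponent on
        "terms.fl": field,       # field whose indexed terms should be listed
        "terms.prefix": prefix,  # only return terms starting with this prefix
        "terms.limit": limit,    # maximum number of terms to return
        "wt": "json",            # ask for a JSON response
    }
    return base_url.rstrip("/") + "/terms?" + urlencode(params)


# Hypothetical base URL and field name; adjust both to your own setup.
url = terms_url("http://localhost:8982/solr", "name_text", prefix="eliz")
print(url)
```

Opening the printed URL in a browser (or with curl) should list the terms that actually made it into the index. If the edge n-grams were indexed, short prefixes such as `eliz` and `eliza` should show up as terms in their own right.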
I think I found the problem and it looks like a noob error.
In my model, I was using the following construct as per one of the tutorials:
This did not seem to throw any errors when I started, stopped, or reindexed. It struck me as strange that I was able to reindex immediately after stopping Solr.
I switched to
When I did this, I found that I could not reindex after stopping Solr. However, when I started Solr and reindexed, the index appeared to be truly refreshed and my queries finally behaved as expected.
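For anyone wondering why the query works once the index is really rebuilt, here is a rough model of what the index-time edge n-gram filter produces. It is a simplified sketch, not the actual Lucene code: it assumes front-edge grams only and a single lowercased token, and the function name `edge_ngrams` is made up for illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Return the front-edge n-grams of a token, shortest first."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# Index side: "Elizabeths" is lowercased, then expanded into its prefixes.
indexed_terms = set(edge_ngrams("elizabeths"))

# Query side: no n-gram filter, so "Elizabeth" is looked up as one whole term.
query_term = "elizabeth"

print(query_term in indexed_terms)
```

Because the query analyzer has no n-gram filter, the whole query token only has to equal one of the stored prefixes. That is why a stale index (one built before the filter was added) returns "Elizabeth" but not "Elizabeths".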