Solr de-duplication (dedupe) not working: error when updating a document
I followed the example in the documentation below:
https://solr.apache.org/guide/8_4/de-duplication.html
My requirement is to ignore duplicate records, but after enabling dedupe I cannot add any document (even a unique one) and always get the same error:
Exception in thread "main" org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/my_core: Document contains multiple values for uniqueKey field: id=[0011, affa84b255f98fd800dd0056b7040855]
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:177)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:138)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:156)
solrconfig.xml:
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">id</str>
    <str name="fields">first_name,last_name,phone_no</str>
    <bool name="overwriteDupes">false</bool>
    <str name="signatureClass">solr.processor.TextProfileSignature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">dedupe</str>
  </lst>
</requestHandler>
schema.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="dummydata" version="1.5">
  <field name="first_name" type="string" indexed="true" stored="true" multiValued="false" />
  <field name="last_name" type="string" indexed="true" stored="true" multiValued="false" />
  <field name="location" type="string" indexed="true" stored="true" multiValued="false" />
  <field name="phone_no" type="string" indexed="true" stored="true" multiValued="false" />
  <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
  <uniqueKey>id</uniqueKey>
</schema>
Java code used:
{
    String urlString = "http://localhost:8983/solr/my_core";
    SolrClient solr = new HttpSolrClient.Builder(urlString).build();
    UpdateResponse response;
    SolrInputDocument myDocumentInstantlycommited = new SolrInputDocument();
    myDocumentInstantlycommited.addField("id", "0011");
    myDocumentInstantlycommited.addField("first_name", "T11");
    myDocumentInstantlycommited.addField("last_name", "L11");
    myDocumentInstantlycommited.addField("phone_no", "9912121312");
    myDocumentInstantlycommited.addField("location", "TESt211");
    response = solr.add(myDocumentInstantlycommited);
    solr.commit();
    solr.close();
    System.out.println("Documents Updated");
}
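The error message lists two values for id: the client-supplied 0011 and a hash-like string. One plausible reading is that the dedupe processor appends its computed signature to the configured signatureField (here id, which is also the uniqueKey) instead of replacing the existing value. The sketch below models that reading in plain Java with no Solr dependency. The class, its methods, and the map-based "document" are hypothetical stand-ins rather than Solr APIs, and the signature string is copied from the error message, not computed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DedupeIdCollision {

    // Hypothetical stand-in for an input document: field name -> list of values.
    static void addField(Map<String, List<String>> doc, String name, String value) {
        // Appends to any existing values instead of replacing them.
        doc.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Assumed behaviour of the signature step: the signature is appended
    // to signatureField, even if the client already supplied a value there.
    static void applySignature(Map<String, List<String>> doc,
                               String signatureField, String signature) {
        addField(doc, signatureField, signature);
    }

    // Builds the document from the question and returns the values that end up in "id".
    static List<String> idValues(boolean clientSendsId) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        if (clientSendsId) {
            addField(doc, "id", "0011");
        }
        addField(doc, "first_name", "T11");
        addField(doc, "last_name", "L11");
        addField(doc, "phone_no", "9912121312");
        addField(doc, "location", "TESt211");
        // Signature value taken verbatim from the error message.
        applySignature(doc, "id", "affa84b255f98fd800dd0056b7040855");
        return doc.get("id");
    }

    public static void main(String[] args) {
        System.out.println(idValues(true));  // two values: collides with a single-valued uniqueKey
        System.out.println(idValues(false)); // one value: only the signature
    }
}
```

If that reading is right, either leaving id out of the client document or pointing signatureField at a separate, dedicated field would avoid the collision. Both options are untested assumptions here, not confirmed fixes.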