Solr multivalued fields and adding values

Posted on 2024-12-10 14:07:04

I am building a search engine and have a not-so-unique ID shared by many different names. For example, the ID B0051QVF7A could have multiple names such as "Kindle", "Amazon Kindle", "Amazon Kindle 3G", "Kindle Ebook Reader", "New Kindle", etc.

My problem is that I am loading this data from a database of roughly 11 million rows, reading one row at a time, so I don't have all the names for an ID at once. Each row I read gets added to the index as a new document.

What I am trying to find out is how to add names to an existing document. If I am reading the documentation correctly, an update seems to overwrite the whole document rather than add extra info to the field. I just want to add an extra name to the document's multivalued field.

I know this could cause some weird and wonderful "issues" if a name is removed (in the example above, "New Kindle" could be removed when a newer Kindle is released), but I am thinking of recreating the index every now and again (once a month or so) to clear out issues like that. It currently takes about 45 minutes to create the index.

So, how do you add a value to a multivalued field in Solr for an existing document?

Comments (1)

本王不退位尔等都是臣 2024-12-17 14:07:04

According to the question linked in @Mauricio Scheffer's comment, Solr does not currently support updating a single field value in an existing document. I see a couple of options here:

  1. In the process that pulls data from the database, when it finds a new name, pull all fields for the existing document from Solr, add the new value, and resend the complete document to Solr (you may already be doing this).
  2. Add some logic to the code that reads from the database to gather all of the unique names for each document before inserting documents into the index. However, given that you have ~11 million records, resource constraints might make this infeasible.
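
Both options can be sketched in a few lines of Python. This is only an illustration under assumptions: the multivalued field is called `names`, rows arrive as `(id, name)` pairs, and the helpers `add_name` and `group_names` are hypothetical names invented here. Fetching the existing document from Solr and posting the result back are left out, since how you do that depends on your client.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Optional, Tuple

Doc = Dict[str, Any]


def add_name(existing: Optional[Doc], doc_id: str, name: str) -> Doc:
    """Option 1: build the complete document to resend, with `name` added.

    `existing` is the full document previously fetched from Solr,
    or None if the ID has not been indexed yet.
    """
    doc = dict(existing) if existing else {"id": doc_id}
    names = list(doc.get("names", []))  # "names" is an assumed field name
    if name not in names:
        names.append(name)
    doc["names"] = names
    return doc


def group_names(rows: Iterable[Tuple[str, str]]) -> List[Doc]:
    """Option 2: collect every unique name per ID before indexing."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for doc_id, name in rows:
        if name not in grouped[doc_id]:
            grouped[doc_id].append(name)
    return [{"id": doc_id, "names": names} for doc_id, names in grouped.items()]


rows = [
    ("B0051QVF7A", "Kindle"),
    ("B0051QVF7A", "Amazon Kindle"),
    ("B0051QVF7A", "Kindle"),  # duplicate name, ignored
]
docs = group_names(rows)
updated = add_name(docs[0], "B0051QVF7A", "New Kindle")
```

`group_names` holds every ID in memory at once, which is the resource concern mentioned for option 2. If the database can return rows sorted by ID, the same grouping could probably be done one ID at a time (for example with `itertools.groupby`), so that only the names for the current ID are held in memory.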