Lucene field from TokenStream with stored values

Posted on 2024-10-08 17:05:47

I have a field which needs to come from a token stream; it cannot be instantiated with a string and then analyzed into tokens. For example, I might want to combine the data from multiple columns (in my RDBMS) into a single Lucene field, but I want to analyze each column in its own way. So I cannot simply concatenate them all into a single string and then analyze the resulting string.

The problem I am running into now is that fields created from token streams cannot be stored, which makes sense in the general case since the stream may not have an obvious string representation. However, I know the string representation, and I would like to store that.

I tried adding the same field twice, once stored with string data and once coming from a token stream, but it seems that this can't be done. Apart from some hack like adding a second field named "myfield__stored", is there a way to do this?

I am using 2.9.2.

Comments (1)

凉栀 2024-10-15 17:05:48

I found a way. You can sneak it in by instantiating it as a normal field but calling SetTokenStream later:

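// Build the field from the string value so it can be stored, then swap in the token stream for indexing.
Field f = new Field(Name, StringValue, Store, Analyzed, TV);
f.SetTokenStream(TokenStreamValue);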

Because the reader/string value is only indexed if the token stream value is null, the token stream value will be indexed. The store methods look at string/reader regardless of token stream, so it will be this value which is stored.
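
To make that precedence rule concrete, here is a minimal toy model in plain Java. It is only a sketch of the behaviour described above, not Lucene code: `FieldSketch`, `tokensToIndex` and `valueToStore` are hypothetical names invented for illustration, and the whitespace "analyzer" is a stand-in assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: FieldSketch is a hypothetical stand-in, not a Lucene class.
final class FieldSketch {
    private final String stringValue;      // what a stored field would keep
    private List<String> tokenStreamValue; // pre-analyzed tokens, may be null

    FieldSketch(String stringValue) {
        this.stringValue = stringValue;
    }

    // Plays the role that SetTokenStream plays in the answer.
    void setTokenStream(List<String> tokens) {
        this.tokenStreamValue = tokens;
    }

    // Indexing: an explicit token stream wins; otherwise the string is analyzed.
    List<String> tokensToIndex() {
        if (tokenStreamValue != null) {
            return tokenStreamValue;
        }
        // Stand-in analyzer (assumption): lowercase and split on whitespace.
        return Arrays.asList(stringValue.toLowerCase().split("\\s+"));
    }

    // Storing: only the string value is consulted, never the token stream.
    String valueToStore() {
        return stringValue;
    }

    public static void main(String[] args) {
        FieldSketch f = new FieldSketch("Smith, John | 2009-11-03");
        f.setTokenStream(new ArrayList<>(Arrays.asList("smith", "john", "20091103")));
        System.out.println("indexed: " + f.tokensToIndex());
        System.out.println("stored:  " + f.valueToStore());
    }
}
```

In this model the indexed tokens and the stored text are fully decoupled once a token stream has been set, which is the effect the question asks for without a second "myfield__stored" field.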
