Get the start and end index of a highlighted fragment in a searched field

Posted 2024-09-02 02:11:37

"My search returns a highlighted fragment from a field. For a particular matched document, I want to know where in that field the fragment starts and ends."

For instance, suppose I search for "highlighted fragment" in the lines above (treat the paragraph above as a single document).

I set up my fragmenter as:

SimpleFragmenter fragmenter = 
            new SimpleFragmenter(30);

Now the output of GetBestFragment is something like: "returns a highlighted fragment from"

Is it possible to get the start and end index of this fragment within the text above (say the start is 10 and the end is 45)?


つ可否回来 2024-09-09 02:11:37

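The Highlighter does not return that information when you use the getBestFragment method. Behind the scenes, the Highlighter uses the TokenGroup class to get the start and end index of each fragment. You could probably use that class.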
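Since getBestFragment hands back only the fragment text, a cruder fallback that avoids TokenGroup altogether is to strip the highlight tags from the fragment and search for the remaining text in the original field value. The sketch below is only an illustration of that idea, not something the Highlighter provides: the class name `FragmentOffsets` and the method `locate` are made up for this example, the `<B>`/`</B>` tags are assumed to be what the formatter in use emits, and it assumes the encoder leaves the text unchanged. If the same text occurs more than once in the field, it reports the first occurrence only.

```java
// Hypothetical helper, not part of the highlighter API.
final class FragmentOffsets {

    /**
     * Returns {start, end} (end exclusive) of the fragment inside fieldText,
     * or null when the fragment text cannot be found.
     */
    static int[] locate(String fieldText, String highlightedFragment,
                        String preTag, String postTag) {
        // Remove the highlight markup so the fragment matches the stored text again.
        String plain = highlightedFragment.replace(preTag, "").replace(postTag, "");
        int start = fieldText.indexOf(plain);
        if (start < 0) {
            return null; // e.g. the encoder escaped characters, or the text differs
        }
        return new int[] { start, start + plain.length() };
    }

    public static void main(String[] args) {
        String field = "My search returns a highlighted fragment from a field.";
        // Assumed shape of the highlighted output for the query in the question.
        String fragment = "returns a <B>highlighted</B> <B>fragment</B> from";
        int[] span = locate(field, fragment, "<B>", "</B>");
        System.out.println(span[0] + " " + span[1]); // prints "10 45"
    }
}
```

This works on plain strings, so it says nothing about which tokens were scored; when the exact token offsets matter, the TokenGroup route is the more reliable one.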


海拔太高太耀眼 2024-09-09 02:11:37

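I did exactly this a few months ago. You have to build a custom Formatter and a custom Encoder.
Basically, inside the Highlighter, the formatter handles the tokens that were selected for highlighting, and the encoder handles the remaining tokens. In your case, you need the encoder to emit an empty value on every call, and the formatter to emit the start and end index. Both are indeed stored in the TokenGroup for the highlighted part. Your Highlighter should be constructed with this custom formatter and encoder.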
