How to remove empty strings in a custom ArangoSearch Analyzer

Posted on 2025-02-10 17:52:45

I've got a custom analyzer that is like text_en but doesn't include the hyphen as a delimiter:

{pipeline:[
 {type:"norm",properties:{
  locale: "en.utf-8", accent: false, case: "lower", stemming: false}},
 {type:"delimiter",properties:{delimiter:" "}},
 {type:"delimiter",properties:{delimiter:"!"}},
 {type:"delimiter",properties:{delimiter:"."}},
 {type:"delimiter",properties:{delimiter:","}},
 {type:"delimiter",properties:{delimiter:";"}},
 {type:"delimiter",properties:{delimiter:"?"}},
 {type:"delimiter",properties:{delimiter:"["}},
 {type:"delimiter",properties:{delimiter:"]"}},
 {type:"delimiter",properties:{delimiter:"{"}},
 {type:"delimiter",properties:{delimiter:"}"}},
 {type:"delimiter",properties:{delimiter:"("}},
 {type:"delimiter",properties:{delimiter:")"}},
 {type:"delimiter",properties:{delimiter:"<"}},
 {type:"delimiter",properties:{delimiter:">"}},
 {type:"delimiter",properties:{delimiter:"~"}},
 {type:"delimiter",properties:{delimiter:"@"}},
 {type:"delimiter",properties:{delimiter:"="}},
 {type:"delimiter",properties:{delimiter:"&"}},
 {type:"delimiter",properties:{delimiter:"|"}},
 {type:"delimiter",properties:{delimiter:"\n"}},
 {type:"stem",properties:{locale:"en.utf-8"}}]}

The issue is that chaining delimiters like this produces empty-string tokens. For the string "HYPNOS2, Aphrodite and other Microcontrollers.", the tokens are:

[
 "hypnos2",
 "",
 "aphrodit",
 "and",
 "other",
 "microcontrol",
 ""
]
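
For illustration, the empty tokens can be reproduced outside ArangoDB with a rough model of the chain. This is only a sketch: it assumes each delimiter stage splits every incoming token on its one delimiter and passes all pieces on, including empty ones, and it skips accent removal and stemming. The function name chainDelimiters is made up for this example.

```javascript
// Rough model of the chained delimiter stages (an assumption, not
// ArangoSearch's actual implementation): every stage splits each incoming
// token on one delimiter and forwards all pieces, empty ones included.
const delimiters = [" ", "!", ".", ",", ";", "?", "[", "]", "{", "}", "(", ")",
                    "<", ">", "~", "@", "=", "&", "|", "\n"];

function chainDelimiters(text, delims) {
  return delims.reduce(
    (tokens, d) => tokens.flatMap(token => token.split(d)),
    [text.toLowerCase()] // stands in for the "norm" stage with case: "lower"
  );
}

const tokens = chainDelimiters(
  "HYPNOS2, Aphrodite and other Microcontrollers.",
  delimiters
);
console.log(tokens);
// tokens: ["hypnos2", "", "aphrodite", "and", "other", "microcontrollers", ""]
```

In this model, "hypnos2," splits on "," into "hypnos2" and "", and "microcontrollers." splits on "." into "microcontrollers" and "", which matches the positions of the empty tokens above.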

How can I remove the empty-string tokens from this analyzer?

Comments (1)

羁客 2025-02-17 17:52:45

I added another stage to the pipeline after the delimiters: an aql analyzer with keepNull set to false and a queryString of RETURN @param == "" ? null : @param. If someone has a simpler suggestion, I'm interested.
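
A minimal sketch of what that extra stage might look like, assuming it is placed directly after the last delimiter stage and before the stem stage. The helper withEmptyTokenFilter and the analyzer name text_en_no_hyphen are made up for this example, and the pipeline is shortened to three delimiters.

```javascript
// "aql" stage: map empty tokens to null; with keepNull: false the null
// results are discarded instead of being indexed.
const dropEmptyTokens = {
  type: "aql",
  properties: {
    queryString: 'RETURN @param == "" ? null : @param',
    keepNull: false
  }
};

// Hypothetical helper: returns a copy of the pipeline with the filter
// inserted right after the last "delimiter" stage (or at the front if
// the pipeline has no delimiter stage).
function withEmptyTokenFilter(pipeline) {
  const lastDelimiter = pipeline.map(stage => stage.type).lastIndexOf("delimiter");
  const result = pipeline.slice();
  result.splice(lastDelimiter + 1, 0, dropEmptyTokens);
  return result;
}

const pipeline = withEmptyTokenFilter([
  { type: "norm", properties: {
    locale: "en.utf-8", accent: false, case: "lower", stemming: false } },
  { type: "delimiter", properties: { delimiter: " " } },
  { type: "delimiter", properties: { delimiter: "." } },
  { type: "delimiter", properties: { delimiter: "," } },
  // ... the remaining delimiter stages from the question go here ...
  { type: "stem", properties: { locale: "en.utf-8" } }
]);

// In arangosh (not executed here), the definition could then be saved with:
//   require("@arangodb/analyzers").save("text_en_no_hyphen", "pipeline", { pipeline });
```

Running the filter once after all delimiters, rather than after each one, keeps the pipeline short; empty tokens that appear at an early stage stay empty through the later splits and are dropped at the end.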
