How to remove empty strings in a custom ArangoSearch analyzer
I've got a custom analyzer that is like text_en but doesn't include the hyphen as a delimiter:
{pipeline:[
{type:"norm",properties:{
locale: "en.utf-8", accent: false, case: "lower", stemming: false}},
{type:"delimiter",properties:{delimiter:" "}},
{type:"delimiter",properties:{delimiter:"!"}},
{type:"delimiter",properties:{delimiter:"."}},
{type:"delimiter",properties:{delimiter:","}},
{type:"delimiter",properties:{delimiter:";"}},
{type:"delimiter",properties:{delimiter:"?"}},
{type:"delimiter",properties:{delimiter:"["}},
{type:"delimiter",properties:{delimiter:"]"}},
{type:"delimiter",properties:{delimiter:"{"}},
{type:"delimiter",properties:{delimiter:"}"}},
{type:"delimiter",properties:{delimiter:"("}},
{type:"delimiter",properties:{delimiter:")"}},
{type:"delimiter",properties:{delimiter:"<"}},
{type:"delimiter",properties:{delimiter:">"}},
{type:"delimiter",properties:{delimiter:"~"}},
{type:"delimiter",properties:{delimiter:"@"}},
{type:"delimiter",properties:{delimiter:"="}},
{type:"delimiter",properties:{delimiter:"&"}},
{type:"delimiter",properties:{delimiter:"|"}},
{type:"delimiter",properties:{delimiter:"\n"}},
{type:"stem",properties:{locale:"en.utf-8"}}]}
The issue is that chaining delimiters like this produces empty strings. The tokens for the string "HYPNOS2, Aphrodite and other Microcontrollers." are:
[
"hypnos2",
"",
"aphrodit",
"and",
"other",
"microcontrol",
""
]
What do I do to remove the empty string tokens in this analyzer?
Comments (1)
I added another stage to the pipeline after the delimiters: an aql stage with keepNull set to false and a queryString of RETURN @param == "" ? null : @param. If someone has a simpler suggestion, I'm interested.
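For reference, here is a rough sketch of how that extra stage might look inside the pipeline definition. It assumes an ArangoDB version that provides the aql analyzer type. The helper buildAnalyzerProperties and the analyzer name in the trailing comment are made up for illustration, and the position of the aql stage (after the delimiters, before stem) follows the description above.

```javascript
// Delimiters taken from the question's pipeline (hyphen deliberately left out).
const delimiters = [" ", "!", ".", ",", ";", "?", "[", "]", "{", "}",
                    "(", ")", "<", ">", "~", "@", "=", "&", "|", "\n"];

// Hypothetical helper: assembles the properties object for a "pipeline" analyzer.
function buildAnalyzerProperties(delims) {
  return {
    pipeline: [
      // Normalization stage, copied from the question.
      { type: "norm", properties: { locale: "en.utf-8", accent: false, case: "lower", stemming: false } },
      // One delimiter stage per character, as in the question.
      ...delims.map((d) => ({ type: "delimiter", properties: { delimiter: d } })),
      // Extra stage: map empty tokens to null; keepNull: false then discards them.
      { type: "aql", properties: { queryString: 'RETURN @param == "" ? null : @param', keepNull: false } },
      // Stemming stage, copied from the question.
      { type: "stem", properties: { locale: "en.utf-8" } },
    ],
  };
}

const properties = buildAnalyzerProperties(delimiters);
console.log(JSON.stringify(properties, null, 2));

// In arangosh the definition could then be registered along these lines
// (the analyzer name is an arbitrary placeholder):
//   const analyzers = require("@arangodb/analyzers");
//   analyzers.save("text_en_keep_hyphen", "pipeline", properties, ["frequency", "norm", "position"]);
```

Building the delimiter stages from a list keeps the definition short and makes it easy to add or remove a delimiter. The filtering itself is still done by the single aql stage.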