Is it possible to get matching documents and all of their ancestors in a single query?

Posted on 2024-09-18 07:00:23

To illustrate my requirements consider the following directory structure:

C:\Dev
C:\Dev\Projects
C:\Dev\Projects\Test Project
C:\Dev\Projects\Test Project\Test.cs
C:\Dev\Projects\Foo
C:\Dev\Projects\Foo\foo.cs (containing the word test)

The basic document will have id, type, name and content fields, where type will be file or folder and name will be either the file name or the folder name.

When searching for "test" I should get:

C:\Dev (ancestor of a result)
C:\Dev\Projects (ancestor of a result)
C:\Dev\Projects\Test Project (result)
C:\Dev (ancestor of a result)
C:\Dev\Projects (ancestor of a result)
C:\Dev\Projects\Test Project (ancestor of a result)
C:\Dev\Projects\Test Project\Test.cs (result)
C:\Dev (ancestor of a result)
C:\Dev\Projects (ancestor of a result)
C:\Dev\Projects\Foo (ancestor of a result)
C:\Dev\Projects\Foo\foo.cs (result)

Even better if it is possible to avoid duplicates:

C:\Dev (ancestor of a result)
C:\Dev\Projects (ancestor of a result)
C:\Dev\Projects\Test Project (result)
C:\Dev\Projects\Test Project\Test.cs (result)
C:\Dev\Projects\Foo (ancestor of a result)
C:\Dev\Projects\Foo\foo.cs (result)

When searching for "project" I should get:

C:\Dev (ancestor of a result)
C:\Dev\Projects (ancestor of a result)
C:\Dev\Projects\Test Project (result)

When searching for "foo" I should get:

C:\Dev (ancestor of a result)
C:\Dev\Projects (ancestor of a result)
C:\Dev\Projects\Foo (result)
C:\Dev\Projects\Foo\foo.cs (result)

Thanks for any help

Comments (1)

如歌彻婉言 2024-09-25 07:00:25

If you build your index once, or write to it only rarely, you could solve this at indexing time.

So for each document you would save another field called "path" and have it hold a tokenized list of all the words from the sub-elements of the path:

name: C:\Dev\Projects
path: C:, Dev, Projects, Test, Test Project, Test.cs, Foo, Foo.cs (use whatever tokenizer you want)

Then index the field as INDEXED:true STORED:false and use it to search for matches:

query: +path:"Foo"

This should return all the documents that have Foo as a child element.
Keep in mind that this solution is very costly for writes and may be impractical for a very large tree structure with many thousands of leaves.
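
To make the idea concrete, here is a minimal in-memory sketch in Python. It is not tied to any particular search library; it only imitates an indexed, non-stored "path" field with a plain inverted index. The names `tokenize`, `build_path_index` and `search` are made up for illustration, and the tokenizer (lowercase, split on non-alphanumerics) is an assumption. One deliberate variation: each document's path field here holds only its own name and the names of its descendants, leaving out the ancestor components (C:, Dev) shown in the example list above, so that a match does not also pull in every descendant.

```python
import re
from collections import defaultdict
from pathlib import PureWindowsPath

# Directory tree from the question.
PATHS = [
    r"C:\Dev",
    r"C:\Dev\Projects",
    r"C:\Dev\Projects\Test Project",
    r"C:\Dev\Projects\Test Project\Test.cs",
    r"C:\Dev\Projects\Foo",
    r"C:\Dev\Projects\Foo\foo.cs",
]


def tokenize(name):
    """Hypothetical tokenizer: lowercase, split on anything but letters and digits."""
    return [t for t in re.split(r"[^a-z0-9]+", name.lower()) if t]


def build_path_index(paths):
    """Map each token to the documents whose 'path' field contains it.

    Assumption: a document's 'path' field holds the tokens of its own name
    plus the names of all of its descendants.
    """
    known = set(paths)
    index = defaultdict(set)
    for p in paths:
        node = PureWindowsPath(p)
        # The node itself plus every ancestor that is also a document.
        holders = [p] + [str(a) for a in node.parents if str(a) in known]
        for token in tokenize(node.name):
            index[token].update(holders)
    return index


def search(index, term):
    """Return the matching documents together with their ancestors, without duplicates."""
    return sorted(index.get(term.lower(), set()))


if __name__ == "__main__":
    idx = build_path_index(PATHS)
    for term in ("foo", "project", "test"):
        print(term, "->", search(idx, term))
```

Because every name is copied into each ancestor's field, adding or renaming one file means updating all of its ancestors as well, which is where the write cost comes from. Note also that this sketch only looks at names; a match in a file's content field (such as foo.cs containing "test") would need the content tokens propagated to the ancestors in the same way.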
