Unable to restrict a custom Sitecore Lucene index to /sitecore/content/Home

Posted on 2024-10-25 05:10:39

I am trying to create a new Lucene index on a site running Sitecore 6.3.1. I used the existing "system" index as a guide, and I was successfully able to create a new index on web and master to index all items in the Sitecore content tree.

Where I am running into difficulty, however, is limiting which part of the content tree the database crawler indexes. Currently, the search index contains items from everywhere in the content tree (content items, media library items, layouts, templates, etc.). I would like to limit the index to only items in /sitecore/content/Home.

I have created a file at ~/App_Config/Include/Search Indexes/website.config, and I have pasted relevant sections below:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <sitecore>

    <!-- This works as expected.... -->
    <databases>
      <database id="web">
        <indexes hint="list:AddIndex">
          <index path="indexes/index[@id='website']" />
        </indexes>
      </database>

      <!-- ... similar entry for master database goes here ... -->
    </databases>

    <!-- So does this.... -->
    <indexes>
      <index id="website" singleInstance="true" type="Sitecore.Data.Indexing.Index, Sitecore.Kernel">
        <param desc="name">$(id)</param>
        <fields hint="raw:AddField">
          <!-- ... field descriptions go here ... -->
        </fields>
      </index>
    </indexes>

    <!-- This works... mostly.  The "__website" directory does get created,
          but the Root directive is getting ignored.
    -->
    <search>
      <configuration type="Sitecore.Search.SearchConfiguration, Sitecore.Kernel" singleInstance="true">
        <indexes hint="list:AddIndex">
          <index id="website" singleInstance="true" type="Sitecore.Search.Index, Sitecore.Kernel">
            <param desc="name">$(id)</param>
            <param desc="folder">__$(id)</param>

            <Analyzer ref="search/analyzer" />

            <locations hint="list:AddCrawler">
              <web type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
                <Database>web</Database>
                <Root>/sitecore/content/home</Root>
                <Tags>content</Tags>
              </web>

              <!-- ... similar entry for master database goes here ... -->
            </locations>
          </index>
        </indexes>
      </configuration>
    </search>
  </sitecore>
</configuration>

A couple of notes:

  • This is not from my web.config file; I created a separate file so that I could distribute config changes via Sitecore packages.

  • The index was added to both master and web; I omitted the references to master for brevity.

  • Sitecore is definitely processing the entries for configuration/sitecore/search/configuration. I can see them when I go to http://localhost/sitecore/admin/showconfig.aspx, and if I change one of the tag values to something invalid (e.g., <Root>/nothere</Root>), Sitecore throws an Exception on the next page load.

  • I have reviewed the index contents in IndexViewer, and the wrong items are definitely getting indexed (for example, document #0 in the index is the /sitecore node).

Where am I going wrong? What changes do I need to make to my configuration file to get the search indexer to ignore items outside /sitecore/content/Home?
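As a quick sanity check on what the include file itself declares, the crawler entries can be listed with a short script. This is only a sketch: `SAMPLE_CONFIG` is a trimmed copy of the search block above, and `list_crawlers` is a hypothetical helper name, not part of Sitecore. It reads the raw XML only, so it says nothing about the merged configuration Sitecore builds at runtime, and it does not explain why the Root setting is ignored.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <search> block from website.config (assumption: only the
# parts needed to locate the crawler entries are kept).
SAMPLE_CONFIG = """<configuration>
  <sitecore>
    <search>
      <configuration type="Sitecore.Search.SearchConfiguration, Sitecore.Kernel" singleInstance="true">
        <indexes hint="list:AddIndex">
          <index id="website" singleInstance="true" type="Sitecore.Search.Index, Sitecore.Kernel">
            <locations hint="list:AddCrawler">
              <web type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
                <Database>web</Database>
                <Root>/sitecore/content/home</Root>
                <Tags>content</Tags>
              </web>
              <!-- ... similar entry for master database goes here ... -->
            </locations>
          </index>
        </indexes>
      </configuration>
    </search>
  </sitecore>
</configuration>"""


def list_crawlers(config_xml):
    """Return one dict per crawler element declared under each search index."""
    root = ET.fromstring(config_xml)  # root is the <configuration> element
    crawlers = []
    for index in root.findall("./sitecore/search/configuration/indexes/index"):
        # Each child element of <locations> is one crawler; XML comments are
        # dropped by the default parser, so only real elements are visited.
        for crawler in index.findall("./locations/*"):
            crawlers.append({
                "index": index.get("id"),
                "crawler": crawler.tag,
                "database": crawler.findtext("Database"),
                "root": crawler.findtext("Root"),
            })
    return crawlers


if __name__ == "__main__":
    for entry in list_crawlers(SAMPLE_CONFIG):
        print(entry)
```

Running the same function over the full website.config would show whether every crawler entry, including the omitted master one, declares the intended Root.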

Comments (1)

赠我空喜 2024-11-01 05:10:39

I was able to solve the problem using the Advanced Database Crawler. Replacing the configuration/search/configuration block with the code provided in Alex's presentation (see above link) made everything start working, more or less automagically.
