How can I improve the performance of a Zend_Search_Lucene query?

Posted on 2024-10-22 02:13:26

How can I improve the query shown below?

My index is fully optimized, and all fields are unstored except for item_id, which is a keyword field.

The problem is in the "if ($auth) {" section. Without this section, search times are always under 1 second; with it, search times are 5 seconds or more. Obviously it's a more complex query, but I can't live without it: I need the logic in that section to return only the search results that the user is authorized to view. I know the slowdown is in the search itself, because if I remove the line "if ($authQuery) { $query->addSubquery($authQuery, true); }" the search is quite fast.

In the "if ($auth) {" section I'm basically trying to implement the following logic:

The lucene fields gi_aro, gc_aro, i_access and c_access each consist of nothing more than a single integer.

if ((in_array({gi_aro}, $gmid) OR {i_access} <= $gid)
    AND (in_array({gc_aro}, $gmid) OR {c_access} <= $gid)) {
  include in search results
}

{} = lucene fields
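
To make the intended rule concrete, here is a minimal sketch of the same predicate in plain PHP, applied to one document's field values. The function name isAuthorized and its parameter names are hypothetical (they do not appear in the original code), and the sketch assumes each access field holds a single integer as described above. It only illustrates the logic; it does not touch the index.

```php
<?php
/**
 * Hypothetical helper: decides whether one document passes the auth rule.
 *
 * $giAro, $iAccess, $gcAro, $cAccess stand for the document's lucene fields;
 * $gmid is the user's list of group ids, $gid the user's access level.
 */
function isAuthorized(int $giAro, int $iAccess, int $gcAro, int $cAccess, array $gmid, int $gid): bool
{
    // explode() yields strings, so normalise the group ids to integers first.
    $gmid = array_map('intval', $gmid);

    // Item-level check: group match OR access level low enough.
    $itemOk = in_array($giAro, $gmid, true) || $iAccess <= $gid;

    // Category-level check: same rule on the category fields.
    $catOk = in_array($gcAro, $gmid, true) || $cAccess <= $gid;

    // Both levels must pass.
    return $itemOk && $catOk;
}

$gmid = explode(',', '3,7');
var_dump(isAuthorized(7, 2, 5, 0, $gmid, 1)); // bool(true): item passes via group 7, category via access 0 <= 1
var_dump(isAuthorized(9, 2, 3, 0, $gmid, 1)); // bool(false): item fails both the group and the access check
```

A function like this can also serve as a reference when checking that the boolean query built below returns the same documents.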

// keywords query
$keywords = explode(' ', $keyword);

$keywordQuery = new Zend_Search_Lucene_Search_Query_MultiTerm();
foreach ($keywords as $term) {
  $keywordQuery->addTerm(new Zend_Search_Lucene_Index_Term($term, 'content'));
  $keywordQuery->addTerm(new Zend_Search_Lucene_Index_Term($term, 'search_display_name'));
}

// topcat query
if (!empty($topcat)) {
  $term = new Zend_Search_Lucene_Index_Term($topcat, 'topcats');
  $topcatQuery = new Zend_Search_Lucene_Search_Query_Term($term);
}

// cat query
if (!empty($cat)) {
  $term = new Zend_Search_Lucene_Index_Term($cat, 'cats');
  $catQuery = new Zend_Search_Lucene_Search_Query_Term($term);
}

// only authorized items query
if ($auth) {
  $user = JFactory::getUser();
  $gid = (int)$user->get('aid');
  $gmid = explode(',', $user->gmid);

  // flexicontent cat auth
  $gcQuery = new Zend_Search_Lucene_Search_Query_MultiTerm();
  foreach ($gmid as $g) {
    $gcQuery->addTerm(new Zend_Search_Lucene_Index_Term($g, 'gc_aro'));
  }

  // stock joomla cat auth
  $lowCAccessTerm = new Zend_Search_Lucene_Index_Term(0, 'c_access');
  $highCAccessTerm = new Zend_Search_Lucene_Index_Term($gid, 'c_access');
  $cAccessQuery = new Zend_Search_Lucene_Search_Query_Range($lowCAccessTerm, $highCAccessTerm, true);

  // ORed flexicontent cat auth & stock joomla cat auth
  $catAuthQuery = new Zend_Search_Lucene_Search_Query_Boolean();
  $catAuthQuery->addSubquery($gcQuery);
  $catAuthQuery->addSubquery($cAccessQuery);

  // flexicontent itm auth
  $giQuery = new Zend_Search_Lucene_Search_Query_MultiTerm();
  foreach ($gmid as $g) {
    $giQuery->addTerm(new Zend_Search_Lucene_Index_Term($g, 'gi_aro'));
  }

  // stock joomla itm auth
  $lowIAccessTerm = new Zend_Search_Lucene_Index_Term(0, 'i_access');
  $highIAccessTerm = new Zend_Search_Lucene_Index_Term($gid, 'i_access');
  $iAccessQuery = new Zend_Search_Lucene_Search_Query_Range($lowIAccessTerm, $highIAccessTerm, true);

  // ORed flexicontent itm auth & stock joomla itm auth
  $itmAuthQuery = new Zend_Search_Lucene_Search_Query_Boolean();
  $itmAuthQuery->addSubquery($giQuery);
  $itmAuthQuery->addSubquery($iAccessQuery);

  // ANDed itmAuthQuery & catAuthQuery
  $authQuery = new Zend_Search_Lucene_Search_Query_Boolean();
  $authQuery->addSubquery($catAuthQuery, true);
  $authQuery->addSubquery($itmAuthQuery, true);
}

// composite query
$query = new Zend_Search_Lucene_Search_Query_Boolean();
$query->addSubquery($keywordQuery, true);
// if cat query is set we don't need topcat to restrict result set
// isset() avoids undefined-variable notices when a sub-query was never built
if (isset($catQuery)) {
  $query->addSubquery($catQuery, true);
} elseif (isset($topcatQuery)) {
  $query->addSubquery($topcatQuery, true);
}
if (isset($authQuery)) { $query->addSubquery($authQuery, true); }

// search
$execTime = new JProfiler();
$this->hits = $index->find($query);
echo $execTime->mark('executed');

Comments (1)

揽清风入怀 2024-10-29 02:13:26

What about doing the permission check outside the index, on the results only? Save some sort of key in the index (DB primary key, GUID, practically anything). Then fetch the results and remove those which the user is not allowed to see.

$allowedArray = $acl->getAllowedIds();
// The check happens while echoing the content, so the results are looped over
// only once (instead of once to filter and again to echo in the view).
foreach ($result as $item) {
    if (in_array($item->keyValue, $allowedArray)) {
        // echo the item here
    }
}

Edit: Note that this depends on how many results most queries return. If a regular query returns, say, fewer than 50 results, then doing the filtering in PHP is OK. But if a common query returns something like 10,000 results, it might not be a good idea ;)
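
The post-filtering idea above can be sketched as follows. It assumes, as in the question, that item_id is the key kept in the index. It uses array_flip so that each membership check is a hash lookup rather than a linear in_array scan, which may help as the result count grows. The function name filterAllowedHits and the sample data are hypothetical, and plain objects stand in for search hits.

```php
<?php
/**
 * Hypothetical helper: keeps only the hits whose item_id is in the allowed list.
 */
function filterAllowedHits(array $hits, array $allowedIds): array
{
    // Flip the id list so ids become array keys; isset() is then a hash lookup.
    $allowed = array_flip($allowedIds);

    $visible = array();
    foreach ($hits as $hit) {
        if (isset($allowed[$hit->item_id])) {
            $visible[] = $hit;
        }
    }
    return $visible;
}

// Stand-in hits; real hits would come from $index->find($query).
$hits = array(
    (object) array('item_id' => '12'),
    (object) array('item_id' => '15'),
    (object) array('item_id' => '20'),
);

$visible = filterAllowedHits($hits, array(12, 20, 33));
foreach ($visible as $hit) {
    echo $hit->item_id, "\n"; // prints 12, then 20
}
```

With real Zend_Search_Lucene hits, reading the stored key as $hit->item_id should work the same way, though that is not tested here. As the edit note says, this approach only pays off when the unfiltered result set is small.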
