Zend Search Lucene

Posted 2024-07-25 10:23:01

I have a database that I would like to leverage with Zend_Search_Lucene. However, I am having difficulty creating a "fully searchable" document for Lucene.

Each Zend_Search_Lucene document pulls information from two relational database tables (Table_One and Table_Two). Table_One has basic information (id, owner_id, title, description, location, etc.), Table_Two has a 1:N relationship to Table_One (meaning, for each entry in Table_One, there could be one or more entries in Table_Two). Table_Two contains: id, listing_id, bedrooms, bathrooms, price_min, price_max, date_available. See Figure 1.

Figure 1

Table_One
    id (Primary Key)
    owner_id
    title
    description
    location
    etc...

Table_Two
    id (Primary Key)
    listing_id (Foreign Key to Table_One)
    bedrooms (int)
    bathrooms (int)
    price_min (int)
    price_max (int)
    date_available (datetime)

The problem is that there are multiple Table_Two entries for each Table_One entry. [Question 1] How do I create a Zend_Search_Lucene document where each field is unique? (See Figure 2)

Figure 2

Lucene Document
    id:Keyword
    owner_id:Keyword
    title:UnStored
    description:UnStored
    location: UnStored
    date_registered:Keyword
    ... (other Table_One information)
    bedrooms: UnStored
    bathrooms: UnStored
    price_min: UnStored
    price_max: UnStored
    date_available: Keyword
    bedrooms_1: <- Would prefer not to have to do this, as it makes the bedrooms harder to search.

Next, I need to be able to do a Range Query on the bedrooms, bathrooms, price_min and price_max fields (example: finding documents that have between 1 and 3 bedrooms). Zend_Search_Lucene will only allow ranged searches on the same field. From my understanding, this means each field I want to do a ranged query on can only contain one value (example: bedrooms:"1 bedroom").

What I have now within the Lucene document is the bedrooms, bathrooms, price_min, price_max and date_available fields, each holding space-delimited values.

Example:

Sample Table_One Entry: 
    | 5 | 2 | "Sample Title" | "Sample Description" | "Sample Location" | 2008-01-12

Sample Table_Two Entries:
    | 10 | 5 | 3 | 1 | 900 | 1000 | 2009-10-01
    | 11 | 5 | 2 | 1 | 800 | 850 | 2009-08-11
    | 12 | 5 | 1 | 1 | 650 | 650 | 2009-09-15 

Sample Lucene Document

id:5
owner_id:2
title: "Sample Title"
description: "Sample Description"
location: "Sample Location"
date_registered: [datetime stamp YYYY-MM-DD]
bedrooms: "3 bedroom 2 bedroom 1 bedroom" 
bathrooms: "1 bathroom 1 bathroom 1 bathroom"
price_min: "900 800 650"
price_max: "1000 850 650"
date_available: "2009-10-01 2009-08-11 2009-09-15"

[Question 2] Can you do a Range Query search on the bedrooms, bathrooms, price_min, price_max and date_available fields as they are shown above, or does each range query field have to contain only one value (e.g. "1 bedroom")? I have not been able to get the Range Query to work in its current form. I am at a loss here.

Thanks in advance.

Comments (1)

猥︴琐丶欲为 2024-08-01 10:23:01
  1. I suggest you create a separate Lucene document for each entry in Table_Two. This will cause some duplication of the Table_One information common to these entries, but that is not a high price to pay for a much simpler index structure in Lucene.
  2. Use a boolean query to combine several range queries. The number-valued fields should be something like this:

bedrooms: 3

price_min: 900

and a sample query in Lucene syntax will be:

date_available:[20100101 TO 20100301] AND price_min:[600 TO 1000]
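
The two suggestions above can be illustrated with a small, language-neutral sketch in Python. It is not Zend_Search_Lucene code: the dictionaries stand in for Lucene documents, and `pad`, `build_documents` and `in_range` are hypothetical helpers invented for this example. It assumes that range queries compare terms as strings, as classic Lucene range queries do, which is why the numeric values are zero-padded and the dates are written as YYYYMMDD.

```python
from datetime import date

# Sample rows copied from the question (Table_One and Table_Two).
listing = {"id": 5, "owner_id": 2, "title": "Sample Title",
           "description": "Sample Description", "location": "Sample Location"}
units = [
    {"id": 10, "listing_id": 5, "bedrooms": 3, "bathrooms": 1,
     "price_min": 900, "price_max": 1000, "date_available": date(2009, 10, 1)},
    {"id": 11, "listing_id": 5, "bedrooms": 2, "bathrooms": 1,
     "price_min": 800, "price_max": 850, "date_available": date(2009, 8, 11)},
    {"id": 12, "listing_id": 5, "bedrooms": 1, "bathrooms": 1,
     "price_min": 650, "price_max": 650, "date_available": date(2009, 9, 15)},
]


def pad(number, width=6):
    """Zero-pad a number so that string order matches numeric order."""
    return str(number).zfill(width)


def build_documents(listing, units):
    """Build one flat document per Table_Two row, repeating the Table_One data."""
    docs = []
    for unit in units:
        docs.append({
            "unit_id": str(unit["id"]),
            "listing_id": str(listing["id"]),
            "owner_id": str(listing["owner_id"]),
            "title": listing["title"],
            "description": listing["description"],
            "location": listing["location"],
            # Each range-searchable field holds exactly one value.
            "bedrooms": pad(unit["bedrooms"]),
            "bathrooms": pad(unit["bathrooms"]),
            "price_min": pad(unit["price_min"]),
            "price_max": pad(unit["price_max"]),
            "date_available": unit["date_available"].strftime("%Y%m%d"),
        })
    return docs


def in_range(term, lower, upper):
    """Inclusive string comparison, mimicking field:[lower TO upper]."""
    return lower <= term <= upper


docs = build_documents(listing, units)

# Equivalent of: date_available:[20090801 TO 20090930] AND price_min:[600 TO 1000]
hits = [d["unit_id"] for d in docs
        if in_range(d["date_available"], "20090801", "20090930")
        and in_range(d["price_min"], pad(600), pad(1000))]
print(hits)
```

Without padding, the string "900" does not fall between "600" and "1000", because "9" sorts after "1". That is one plausible reason a range query over raw numbers, or over a space-delimited multi-value field, returns unexpected results. To return listings rather than individual Table_Two rows, the hits can be grouped by `listing_id` after the search.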