Zend: Index generation and the pros and cons of Zend_Search_Lucene

Posted on 2024-08-14 17:35:11

I've never come across an app/class like Zend Search Lucene before, as I've always queried my database.

Zend_Search_Lucene operates with documents as atomic objects for indexing. A document is divided into named fields, and fields have content that can be searched.

A document is represented by the Zend_Search_Lucene_Document class, and objects of this class contain instances of Zend_Search_Lucene_Field that represent the fields on the document.

It is important to note that any information can be added to the index. Application-specific information or metadata can be stored in the document fields, and later retrieved with the document during search.

So this is basically saying that I can apply this to anything, including databases. The key thing here is making indexes for searching.

What I'm trying to grasp is where exactly I should store the indexes in my application. Say, for example, we have phones stored in a database, with manufacturers and models: how should I categorize the indexes?

If I'm making indexes of users with, say, addresses, I obviously wouldn't want them to be publicly viewable. I'm just confused about how it all works together, and whether there are known disadvantages or gotchas I should know about while using it.

Comments (1)

2024-08-21 17:35:11

A Lucene index is stored outside the database. I'd store it in a "data" directory as a sister to your controllers, models, and views. But you can store it anywhere; you just need to specify the path when you open the index for querying.

It's basically a redundant copy of the documents stored in your database, and you have to keep them in sync yourself. That's one of the disadvantages: you have to write code to populate the Lucene index based on results of a query against your database. As you add data to the database, you have to update your Lucene index as well.
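To make the sync burden concrete, here is a minimal sketch in Python. It is not the Zend_Search_Lucene API: `ToyIndex`, `save_phone`, and `rebuild_index` are hypothetical names, the in-memory index is a stand-in for a real Lucene index on disk, and the `phones` table is assumed from the question. It only shows the shape of the code you end up writing: every write goes to the database and then to the index, and a full rebuild repopulates the index from a query.

```python
import sqlite3
from collections import defaultdict


class ToyIndex:
    """Hypothetical stand-in for an external full-text index (not the Lucene API)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> primary keys containing it
        self.terms_by_pk = {}             # primary key -> terms indexed for it

    def remove(self, pk):
        # Drop every posting that belongs to this primary key.
        for term in self.terms_by_pk.pop(pk, set()):
            self.postings[term].discard(pk)

    def add(self, pk, text):
        self.remove(pk)  # replace any stale copy before re-adding
        terms = set(text.lower().split())
        self.terms_by_pk[pk] = terms
        for term in terms:
            self.postings[term].add(pk)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


def save_phone(db, index, pk, manufacturer, model):
    """Write the row, then refresh its redundant copy in the index."""
    db.execute(
        "INSERT OR REPLACE INTO phones (id, manufacturer, model) VALUES (?, ?, ?)",
        (pk, manufacturer, model),
    )
    db.commit()
    index.add(pk, f"{manufacturer} {model}")


def rebuild_index(db):
    """Populate a fresh index from the results of a database query."""
    index = ToyIndex()
    rows = db.execute("SELECT id, manufacturer, model FROM phones")
    for pk, manufacturer, model in rows:
        index.add(pk, f"{manufacturer} {model}")
    return index


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE phones (id INTEGER PRIMARY KEY, manufacturer TEXT, model TEXT)"
)
index = ToyIndex()
save_phone(db, index, 1, "Nokia", "N95")
save_phone(db, index, 2, "Apple", "iPhone 3G")
save_phone(db, index, 1, "Nokia", "E71")  # update: the old "N95" entry must go
```

Routing every write through one function like `save_phone` is one way to keep the two copies from drifting apart; a periodic rebuild is a fallback if they do.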

An advantage of using an external full-text index solution is that you can reduce the workload on your RDBMS. To find a document, you execute a search using the Lucene API. Each indexed document should include a field containing the primary key value; the field is stored with the document, but there's no need to analyze it for full-text search. You get this field back when you do a Lucene search, so you can look up the corresponding row in the database.
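The search-then-lookup flow can be sketched as follows, again in Python rather than with the real Lucene API. The `users` table, the `documents` list, and the `search` and `find_users` helpers are all assumptions made for illustration. Each index document carries searchable terms plus a stored primary key that is never tokenized; the search returns keys, and the database supplies the full rows.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
db.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "Alice Smith", "12 Oak Street"),
        (2, "Bob Jones", "34 Elm Road"),
    ],
)

# Toy index: one document per row. "terms" is the analyzed (searchable) part;
# "pk" is stored as-is and only handed back with each hit.
documents = []
for pk, name in db.execute("SELECT id, name FROM users"):
    documents.append({"pk": pk, "terms": set(name.lower().split())})


def search(query):
    """Return the primary keys of documents containing every query word."""
    words = set(query.lower().split())
    return [doc["pk"] for doc in documents if words <= doc["terms"]]


def find_users(query):
    """Search the index first, then fetch the matching rows from the database."""
    pks = search(query)
    if not pks:
        return []
    placeholders = ",".join("?" * len(pks))
    sql = (
        "SELECT id, name, address FROM users "
        f"WHERE id IN ({placeholders}) ORDER BY id"
    )
    return db.execute(sql, pks).fetchall()
```

In this sketch only the name is indexed. The address never leaves the database, so whatever access checks already guard the `users` table still apply to it.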

Does that help answer your question?

I gave a presentation recently for MySQL University comparing full-text search solutions:
http://forge.mysql.com/wiki/Practical_Full-Text_Search_in_MySQL

I also publish my slides at http://www.SlideShare.net/billkarwin.
