Zend: Index generation and the pros and cons of Zend_Search_Lucene
I've never come across an app/class like Zend_Search_Lucene before, as I've always queried my database directly.
Zend_Search_Lucene operates with documents as atomic objects for indexing. A document is divided into named fields, and fields have content that can be searched.
A document is represented by the Zend_Search_Lucene_Document class, and objects of this class contain instances of Zend_Search_Lucene_Field that represent the fields on the document.
It is important to note that any information can be added to the index. Application-specific information or metadata can be stored in the document fields, and later retrieved with the document during search.
So this is basically saying that I can apply it to anything, including database content; the key thing here is building indexes for searching.
What I'm trying to grasp is where exactly I should store the indexes in my application. Say, for example, we have phones stored in a database, with manufacturers and models: how should I categorize the indexes?
If I'm indexing users along with, say, their addresses, I obviously wouldn't want those to be publicly viewable. I'm just confused about how it all fits together, and whether there are known disadvantages or gotchas I should be aware of when using it.
Comments (1)
A Lucene index is stored outside the database. I'd store it in a "data" directory as a sister to your controllers, models, and views. But you can store it anywhere; you just need to specify the path when you open the index for querying.
It's basically a redundant copy of the documents stored in your database, and you have to keep them in sync yourself. That's one of the disadvantages: you have to write code to populate the Lucene index based on results of a query against your database. As you add data to the database, you have to update your Lucene index as well.
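To make the sync requirement concrete, here is a minimal, library-neutral sketch in Python. It is not the Zend_Search_Lucene API: `TinyIndex` is a toy in-memory inverted index standing in for a real Lucene index, and the `phones` table, its columns, and the helper names `rebuild_index` and `add_phone` are all hypothetical. It only illustrates the two code paths you end up writing: a full rebuild from a database query, and an incremental update whenever a row is added.

```python
import re
import sqlite3
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (term -> set of primary keys); a stand-in for a real Lucene index."""

    def __init__(self):
        self.postings = defaultdict(set)

    @staticmethod
    def tokenize(text):
        # Lowercase and split into alphanumeric terms.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, pk, text):
        for term in self.tokenize(text):
            self.postings[term].add(pk)

    def remove(self, pk):
        # Drop the primary key from every posting list.
        for pks in self.postings.values():
            pks.discard(pk)


def rebuild_index(conn, index):
    """Populate the index from scratch using a query against the database."""
    index.postings.clear()
    query = "SELECT id, manufacturer, model FROM phones ORDER BY id"
    for pk, manufacturer, model in conn.execute(query):
        index.add(pk, f"{manufacturer} {model}")


def add_phone(conn, index, manufacturer, model):
    """Insert a row and update the index in the same code path, so the two stay in sync."""
    cur = conn.execute(
        "INSERT INTO phones (manufacturer, model) VALUES (?, ?)",
        (manufacturer, model),
    )
    conn.commit()
    index.add(cur.lastrowid, f"{manufacturer} {model}")
    return cur.lastrowid


# Hypothetical schema and sample data (assumptions for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phones (id INTEGER PRIMARY KEY, manufacturer TEXT, model TEXT)"
)
conn.execute("INSERT INTO phones (manufacturer, model) VALUES ('Nokia', 'N95')")
conn.commit()

index = TinyIndex()
rebuild_index(conn, index)                              # initial population
new_id = add_phone(conn, index, "Motorola", "Razr V3")  # incremental update
```

Updates and deletes need the same treatment: remove the old entry from the index (here `index.remove(pk)`) and, for an update, add the new version back.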
An advantage of using an external full-text index solution is that you can reduce the workload on your RDBMS. To find a document, you execute a search using the Lucene API. The result should include a field containing the primary key value (stored as part of the document, but there is no need to analyze it for full-text search). You get this field back when you do a Lucene search, so you can look up the respective row in the database.
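The search-then-lookup flow can be sketched as follows, again in plain Python rather than the Zend_Search_Lucene API. The `documents` list is a toy stand-in for the index, where each entry carries searchable text plus a stored primary key that is returned as-is and never tokenized; the `phones` table and the helper names `search` and `fetch_rows` are hypothetical.

```python
import sqlite3

# Hypothetical schema and sample data (assumptions for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phones (id INTEGER PRIMARY KEY, manufacturer TEXT, model TEXT)"
)
conn.executemany(
    "INSERT INTO phones (id, manufacturer, model) VALUES (?, ?, ?)",
    [(1, "Nokia", "N95"), (2, "Motorola", "Razr V3"), (3, "Nokia", "E71")],
)
conn.commit()

# Toy "index": each document has analyzed (lowercased) text for searching
# and a stored primary key that is only carried along, not searched.
documents = [
    {"pk": pk, "text": f"{manufacturer} {model}".lower()}
    for pk, manufacturer, model in conn.execute(
        "SELECT id, manufacturer, model FROM phones ORDER BY id"
    )
]


def search(term):
    """Return the stored primary keys of documents containing the term as a whole word."""
    term = term.lower()
    return [doc["pk"] for doc in documents if term in doc["text"].split()]


def fetch_rows(conn, pks):
    """Look up the authoritative rows in the database by primary key."""
    if not pks:
        return []
    placeholders = ",".join("?" for _ in pks)
    sql = (
        "SELECT id, manufacturer, model FROM phones "
        f"WHERE id IN ({placeholders}) ORDER BY id"
    )
    return conn.execute(sql, pks).fetchall()


hits = search("nokia")         # primary keys from the index
rows = fetch_rows(conn, hits)  # full rows from the database
```

Because only the primary key needs to come back from the index, sensitive columns can stay in the database, where your normal access checks apply when the rows are fetched.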
Does that help answer your question?
I gave a presentation recently for MySQL University comparing full-text search solutions:
http://forge.mysql.com/wiki/Practical_Full-Text_Search_in_MySQL
I also publish my slides at http://www.SlideShare.net/billkarwin.