Complex queries in Solr: is it possible?
Hey guys, I am new to Solr and want to accomplish the scenario below, but I am not sure whether Solr can handle cases like this:
The problem is very straightforward: I want to build a price comparison search. Here are my relational DB tables:
t_company:
company_id
company_name
t_product:
product_id
product_price
t_company_product:
company_product_id
company_id
product_id
In Solr, I want to perform the following search: get all companies that offer one or more specific products, ordered by the lowest TOTAL price (so if you select screws, nails, and sheet rock, I want to show the lowest total purchase price).
When I set up my schema, I made the business the main entity and product_ids and product_prices two multivalued fields.
Can I query like that? How would I do the sum?
Here are my data-config.xml (the <document> block) and schema.xml (the <fields> block):
<document name="companies">
<entity name="company" dataSource="dsCompany"
query="select
newid() as row_id,
company_id,
company_name
from
t_company WITH (NOLOCK)">
<field column="row_id" name="row_id" />
<field column="company_id" name="company_id" />
<field column="company_name" name="company_name" />
<entity name="products" query="select
company_product_id,
product_id,
price
from
t_company_product WITH (NOLOCK)
where
company_id='${company.company_id}'"
dataSource="dsCompany">
<field name="company_product_id" column="company_product_id" />
<field name="product_id" column="product_id" />
<field name="price" column="price" />
</entity>
</entity>
</document>
<fields>
<field name="row_id" type="string" indexed="true" stored="true" required="true"/>
<field name="company_id" type="integer" indexed="true" stored="true" required="true" />
<field name="company_name" type="text" indexed="true" stored="true"/>
<field name="service_id" type="integer" indexed="true" stored="true" required="true" />
<field name="price" type="tfloat" indexed="true" stored="true" required="true" />
</fields>
Any feedback will be greatly appreciated!!!
Comments (4)
You can use a function query to sort the results by a sum.
In my last project we used a nightly build of 4.0 and it is working fine. It contains so much more functionality than 1.4 that it is worth the small risk you take by using an unreleased version.
Update:
To use the sum, you could try adding a dynamic field for each product price (I don't know how to use sum with multivalued fields, or whether it is possible).
Add to data-config
<field name="price_${products.product_id}" column="price" />
Add to schema.xml
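<!-- indexed="true" so the field can be used in function queries; tfloat matches the price field in the question's schema -->
<dynamicField name="price_*" type="tfloat" indexed="true" stored="true" />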
If I understand it correctly, you should then be able to use a query like the following, where each quoted part stands for the actual product id:
q=*:*&sort=sum(price_"id for nails",price_"id for screws",price_"id for ...") asc
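As a rough sketch of how a client could assemble that request, the helper below builds the parameters for a list of selected product ids. The function name `build_price_sort_params` and the ids 12, 34 and 56 are hypothetical. The `fq` filters are an addition that is not in the answer above: as far as I know, a document without one of the `price_*` fields would contribute 0 to the sum and so look artificially cheap, so the sketch keeps only companies that carry every selected product. It assumes the `price_*` fields are indexed and that the Solr build supports sorting by a function.

```python
from urllib.parse import urlencode


def build_price_sort_params(product_ids):
    """Build Solr request parameters that sort companies by the summed
    price of the selected products (hypothetical helper)."""
    if not product_ids:
        raise ValueError("at least one product id is required")

    # One dynamic field per product, following the price_<product_id> naming.
    fields = ["price_%d" % int(pid) for pid in product_ids]

    params = [("q", "*:*")]

    # Assumption: keep only companies that have a price for every selected
    # product, so a missing field cannot lower the total.
    for field in fields:
        params.append(("fq", "%s:[* TO *]" % field))

    if len(fields) == 1:
        # A single product needs no sum; sort on the field itself.
        sort = "%s asc" % fields[0]
    else:
        sort = "sum(%s) asc" % ",".join(fields)
    params.append(("sort", sort))
    return params


params = build_price_sort_params([12, 34, 56])
query_string = urlencode(params)  # ready to append to the /select URL
```

Building the parameters as a list of pairs keeps the repeated `fq` entries and leaves the URL escaping to `urlencode`.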
In 1.4.1 probably yes; in current trunk (4.0) no, or at least not easily.
In Solr 1.4 there is field collapsing, which can perform aggregates over the records returned. In trunk (Solr 4.0) this has turned into a grouping option that can perform only min / max type queries (as far as I'm aware).
The documentation can be found here:
http://wiki.apache.org/solr/FieldCollapsing
Remember that you'll have to expand out the relationships (think of it as one big denormalised view over the tables involved).
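To make the "one big denormalised view" concrete, here is a minimal sketch that flattens the tables into one document per (company, product) pair. The function name `denormalize`, the sample rows, and the composite `row_id` format are assumptions for illustration; the field names follow the question's data-config.

```python
def denormalize(companies, company_products):
    """Flatten company rows and company-product rows into one flat
    document per (company, product) pair (hypothetical helper)."""
    names = {c["company_id"]: c["company_name"] for c in companies}
    docs = []
    for row in company_products:
        docs.append({
            # Assumed id scheme: company id and product id joined by a dash.
            "row_id": "%s-%s" % (row["company_id"], row["product_id"]),
            "company_id": row["company_id"],
            # The company name is repeated on every document (redundant data).
            "company_name": names[row["company_id"]],
            "product_id": row["product_id"],
            "price": row["price"],
        })
    return docs


# Hypothetical sample rows.
companies = [
    {"company_id": 1, "company_name": "Acme Hardware"},
    {"company_id": 2, "company_name": "Bolt Bros"},
]
company_products = [
    {"company_id": 1, "product_id": 10, "price": 2.5},
    {"company_id": 1, "product_id": 20, "price": 1.5},
    {"company_id": 2, "product_id": 10, "price": 1.0},
]
docs = denormalize(companies, company_products)
```

In this layout `company_id` becomes the natural field to collapse or group on.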
Solr is not intended to replace a relational database. If you still want to index relational content, it has to be denormalized and will therefore contain redundant data. As a result, the result count will be off for most queries: for example, a search on just the company name will return a higher total than expected. Field collapsing lets you work around that. With faceting, however, there is no way to eliminate the duplicates, as far as I know.
If you put all the data you mentioned into a single schema, you can perform relational queries to a certain extent. Google "solr issue 2272" for the details. It is currently possible only within a single schema.
I believe performing a summation within a search engine is not possible at this time. I might be wrong, and if someone knows a way to do it, I would be very interested too.
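If the sum cannot be done inside the engine, one fallback is to add up the prices on the client after fetching the matching documents. The sketch below assumes flat documents with `company_id`, `product_id` and `price` fields (one per company and product); the function name `cheapest_companies` and the sample data are hypothetical.

```python
from collections import defaultdict


def cheapest_companies(docs, wanted_product_ids):
    """Return (total_price, company_id) pairs, cheapest first, for the
    companies that offer every wanted product (hypothetical helper)."""
    wanted = set(wanted_product_ids)

    # company_id -> {product_id: lowest price seen for that product}
    prices = defaultdict(dict)
    for doc in docs:
        pid = doc["product_id"]
        if pid not in wanted:
            continue
        current = prices[doc["company_id"]].get(pid)
        if current is None or doc["price"] < current:
            prices[doc["company_id"]][pid] = doc["price"]

    # Keep only companies that cover the whole basket, then sort by total.
    totals = [
        (sum(by_product.values()), company_id)
        for company_id, by_product in prices.items()
        if set(by_product) == wanted
    ]
    return sorted(totals)


# Hypothetical documents as they might come back from a search.
docs = [
    {"company_id": 1, "product_id": 10, "price": 2.5},
    {"company_id": 1, "product_id": 20, "price": 1.5},
    {"company_id": 2, "product_id": 10, "price": 1.0},
    {"company_id": 2, "product_id": 20, "price": 2.0},
    {"company_id": 3, "product_id": 10, "price": 0.5},  # lacks product 20
]
ranking = cheapest_companies(docs, [10, 20])
```

This only scales to result sets small enough to pull back in full, so it is a stopgap, not a replacement for sorting inside Solr.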
I think you might be asking how to customize scoring. Here's an example in Lucene:
http://sujitpal.blogspot.com/2010/10/custom-scoring-with-lucene-payloads.html
From LucidImagination
http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/