SQL扁平化& 动态查询问题

发布于 2024-07-30 13:12:08 字数 871 浏览 2 评论 0原文

首先,我不确定这是否是处理此问题的最佳方法……完全接受替代解决方案。

其次,我觉得我错过了明显的......但我仍然错过它,所以不要骄傲地问!

更新:.NET 3.5环境与SQL 2005,所以像动态linq这样的事情是可能的,尽管我总是倾向于认为任何类型的动态(动态构建)查询都有点笨拙。 PITA 维持。

更新2:针对northpole,伪代码/书面文字逻辑/sql/linq/C#都可以接受(!)...更多的是概念上的“什么是好的方法”而不是我需要的代码类似的问题。

给定一个看起来像这样的“鞋子”表:

  ShoeID   PropertyName    PropertyValue
  1        COLOR           RED   
  2        COLOR           RED   
  2        SIZE            11
  3        COLOR           RED   
  3        SIZE            11   
  3        MANUFACTURER    GUCCI

我需要一种方法来查询鞋子,使得

COLOR=RED 返回

  1
  2
  3

COLOR=RED,SIZE=11 返回

  2
  3

COLOR=RED 和 SIZE=11,并且 MANUFACTURER=GUCCI 返回

  3

At设计时,我不知道可能有多少个不同的属性,也不知道可能有多少个查询参数......

希望这是有道理的......如果没有,请相应地评论我会再试一次。

Firstly, I am not certain this is the best way to handle this AT ALL ... totally open to alternative solutions.

Secondly, I feel like I'm missing the obvious ... but I'm still missing it so not to proud to ask!

UPDATE: .NET 3.5 environment w/ SQL 2005, so things like dynamic linq possible, although I always tend to think of dynamic (build on fly) queries of any kind as kind of clunky. PITA to maintain.

UPDATE 2: In response to northpole, pseudo code / written word logic / sql / linq / C# all acceptable (!) ... more of a notional "what is a good approach" than I need code kind of question.

Given a table of "shoes" that looks like this:

  ShoeID   PropertyName    PropertyValue
  1        COLOR           RED   
  2        COLOR           RED   
  2        SIZE            11
  3        COLOR           RED   
  3        SIZE            11   
  3        MANUFACTURER    GUCCI

I need a way(s) to query for shoes such that

COLOR=RED returns

  1
  2
  3

COLOR=RED and SIZE=11 returns

  2
  3

COLOR=RED and SIZE=11 and MANUFACTURER=GUCCI returns

  3

At design time, I do not know how many different properties that there might be, nor do I know how many query parameters there might be ...

Hopefully this makes sense ... if not, please comment accordingly and I'll try again.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

初见 2024-08-06 13:12:08

因此,这是否是最好的方法取决于很多因素。 例如,您是否需要支持可能具有不同(不兼容)属性的不同类别的实体(例如鞋子与连衣裙)? 或者您预计将拥有的实体数量是多少(对于 10K 来说相当有效的实体对于 100M 来说就不起作用)? 或者您需要多久处理一次此类查询以及它们需要执行得如何?

对此最常见的两种思想流派是 EAV 模型,该模型或多或少减少您拥有的内容和基于列的方法,其中实体的属性(颜色、大小、等等...)每个都映射到一个单独的列。 每种方法都有其优点和缺点,其中最大的是前者的灵活性/性能以及后者动态更改表结构的必要性。

如果您确实使用现有模型,我建议将您的属性名称移至单独的表中,并更改“鞋子”表以将 FK 移至该表。 然后,您可以在 (property_id, shoe_id) 上创建索引并生成查询,如下所示:

SELECT shoe_id FROM shoes S_1 [, shoes S_2, ..., shoes S_X]
 WHERE S_1.property_id = 3 /* FK for 'color' */
  /* the following 3 lines will be repeated for each 'property' you need to query on */
   AND S_X.property_id = 4 /* FK for 'size' */
   AND S_X.shoe_id = S_1.shoe_id
   AND S_X.property_value = 'RED'

只要您有更多信息,该查询应该可以很好地执行合理或者属性分布不太均匀并且鞋子数量不是巨大。

So whether or not this is a best approach depends on many things. For example, do you need to support different classes of entities (e.g. shoes vs dresses) that may have different (incompatible) attributes? Or what's the estimated number of entities you'll have (something that will work reasonably well for 10K will not work for 100M)? Or how often will you have to handle such queries and how well do they need to perform?

Two most common schools of thought on this are EAV model, which is more or less what you have and column-based approach where your entity's properties (color, size, etc...) are each mapped to a separate column. Each has its advantages and disadvantages, the biggest of them being flexibility / performance of the former and the necessity to dynamically alter table structure for the latter.

If you do go with your existing model, I would recommend to move your property names into a separate table and change 'shoes' table to have FK to that table. You can then create an index on (property_id, shoe_id) and generate your queries as follows:

SELECT shoe_id FROM shoes S_1 [, shoes S_2, ..., shoes S_X]
 WHERE S_1.property_id = 3 /* FK for 'color' */
  /* the following 3 lines will be repeated for each 'property' you need to query on */
   AND S_X.property_id = 4 /* FK for 'size' */
   AND S_X.shoe_id = S_1.shoe_id
   AND S_X.property_value = 'RED'

which should perform reasonably well provided you have a more or less uniform distribution of attributes and not a huge number of shoes.

阳光下的泡沫是彩色的 2024-08-06 13:12:08

如果我理解正确的话,您希望能够创建一个查询,该查询可以具有基于任意数量字段的条件,而这些字段是预先未知的。 这很难……

虽然大多数 RDBMS 确实提供某种“动态 SQL”功能,但这些方法往往要么使用起来很麻烦,要么很慢,或者两者兼而有之。

当然,您也可以在客户端代码中将 SQL 语句连接在一起,无论是 C#、Java、PHP 还是其他语言 - 但这同样会很麻烦,容易受到 SQL 注入攻击,而且通常相当笨拙。

此外,如果您的查询与一个请求不同,则 RDBMS 将无法缓存您的任何查询计划,并且正确建立索引以获得良好的查询性能往好里说是具有挑战性,往坏了说是不可能的。

因此,虽然我完全理解这个要求(并且几乎每天都必须与自己抗争),但它确实效果不佳。 我宁愿尝试确定最常用的搜索及其标准,并确保它们工作正常且快速。 限制用户的灵活性以获得不错的查询性能。 一个经典的权衡——你可以完全灵活,但性能却很糟糕,或者你可以让频繁的查询运行得很快——但失去一些灵活性。

您可能会有所收获的一个领域是在文本字段中搜索自由文本时 - SQL Server 等 RDBMS 支持全文搜索的概念,这为您提供了一定的灵活性和良好的性能。 如果您有很多文本字段,请检查一下。

马克

If I understand you correctly, you want to be able to create a query that could have criteria based on any number of fields, which are unknown upfront. That's tough....

While most RDBMS do offer some kind of "dynamic SQL" capability, those approachs tend to be either cumbersome to use, or slow, or both.

Of course, you could also concatenate together a SQL statement in your client code, be it C#, Java, PHP or whatever - but that again will tend to be cumbersome, prone to SQL injection attacks, and in general rather clumsy.

Also, if you have queries that are different from one request to another, the RDBMS won't be able to cache any of your query plans, and getting indexing right to get a decent query performance will be challenging at best, impossible at worst.

So while I totally understand the requirement (and have to fight it off myself almost daily), it's really something that doesn't work well. I would rather try to identify the most frequently used searches and their criteria and make sure those work fine and are fast. Limit the user's flexibility in order to achieve a decent query performance. A classic trade-off - you can be totally flexible but then you have horrible performance, or you can make your frequent queries run fast - but loose some flexibility.

One area where you could possibly gain something is when searching free text in text fields - RDBMS like SQL Server support the concept of a fulltext search, which gives you some flexibility and good performance. Check into that if you have lots of text fields.

Marc

我也只是我 2024-08-06 13:12:08

动态sql会帮助你。 比如:

ALTER PROCEDURE [dbo].[sp_GetFilteredRecords]
(
...
@ConditionList nvarchar(MAX) = null
)
AS
BEGIN
  DECLARE @sql        nvarchar(MAX), 
          @paramlist  nvarchar(MAX)

  SELECT @sql = 'select ... '
  ...
  SELECT @sql = @sql + ' where 1=1 '
  ...
  SELECT @sql = @sql + ' ' + @ConditionList + ' '


  SELECT @paramlist = ...

  EXEC sp_executesql @sql, @paramlist

当然,由于工具的不同,它可能会略有不同。

鲍里斯。

Dynamic sql will help you. Something like:

ALTER PROCEDURE [dbo].[sp_GetFilteredRecords]
(
...
@ConditionList nvarchar(MAX) = null
)
AS
BEGIN
  DECLARE @sql        nvarchar(MAX), 
          @paramlist  nvarchar(MAX)

  SELECT @sql = 'select ... '
  ...
  SELECT @sql = @sql + ' where 1=1 '
  ...
  SELECT @sql = @sql + ' ' + @ConditionList + ' '


  SELECT @paramlist = ...

  EXEC sp_executesql @sql, @paramlist

Sure it may slight differ due to tools.

Boris.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文