重复数据还是更好的性能?
我们需要用不同的语言显示我们的产品名称,但只有其中一些产品的名称不是英文的。当我们查询某种语言的产品时,如果缺少某种语言名称,我们希望显示默认的英文名称。
为了获得更好的查询性能,当缺少某种语言的名称时,我们必须将默认的英文名称填充到语言相关产品名称表中(languageid + Productid 为主键)。它在此语言相关表中产生了大量重复名称,并且当默认英文名称更改时更新此表有点困难。
目前,该表中有约30万个产品,约30种语言,超过8,000,000行,至少90%以上的数据是重复的,并填充了默认的英文名称。但如果我们在查询中使用left join和isnull检查,查询性能会慢很多。
谁可以给我推荐一个更好的数据库设计,可以避免填充重复数据并具有更好的查询性能?
当前的表架构如下所示,
Table1 (about 300,000 rows)
ProductId | Country | Currency | others fields
------------|----------------|-----------|---------------
Product A | US | USD | ...
Product B | GB | GBP | ...
Table2 (about 9,000,000 rows)
LanguageId | ProductId | Product Name
------------|----------------|--------------------------
English | Product A | Product A Name
English | Product B | Product B Name
German | Product A | Produkt A Name
German | Product B | Product B Name (it's filled by English name)
我尝试过下面的查询以避免重复数据,但性能有点差。
SELECT
A.ProductId,
A.Country,
ISNULL(B1.ProductName, B2.ProductName) as ProductName
FROM
Table1 A (NOLOCK)
LEFT JOIN Table2 B1 (NOLOCK) on A.ProductId = B1.ProductId
LEFT JOIN Table2 B2 (NOLOCK) on A.ProductId = B2.ProductId and B2.LanguageId = 'ENGLISH'
WHERE
B1.LanguageId = 'German'
ORDER BY
ISNULL(B1.ProductName, B2.ProductName)
We need to display our product name in different languages, but only some of the them have name in different language than English. When we query products with the certain language, we want to show the default name in English if the certain language name is missing.
To get the better query performance, we have to fill the default English name to the language dependent product name table (languageid + productid is the primary key) when the name for the certain language is missing. It made lots of duplicate name in this language dependent table and it's a bit difficult to update this table when the default English name changed.
Currently, we have about 300,000 products with about 30 languages and more than 8,000,000 rows in this table, at least more than 90% data is duplicate and fill with default English name. But if we use left join and isnull check in the query, the query performance will be much slower.
Who can recommend me a better database design that I can avoid to fill the duplicate data and have a better query performance?
The current tables schema like below
Table1 (about 300,000 rows)
ProductId | Country | Currency | others fields
------------|----------------|-----------|---------------
Product A | US | USD | ...
Product B | GB | GBP | ...
Table2 (about 9,000,000 rows)
LanguageId | ProductId | Product Name
------------|----------------|--------------------------
English | Product A | Product A Name
English | Product B | Product B Name
German | Product A | Produkt A Name
German | Product B | Product B Name (it's filled by English name)
I have tried below query to avoid duplicate data, but the performance was a bit worst.
SELECT
A.ProductId,
A.Country,
ISNULL(B1.ProductName, B2.ProductName) as ProductName
FROM
Table1 A (NOLOCK)
LEFT JOIN Table2 B1 (NOLOCK) on A.ProductId = B1.ProductId
LEFT JOIN Table2 B2 (NOLOCK) on A.ProductId = B2.ProductId and B2.LanguageId = 'ENGLISH'
WHERE
B1.LanguageId = 'German'
ORDER BY
ISNULL(B1.ProductName, B2.ProductName)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您测试过 LEFT JOIN 和 ISNULL 吗?或者这只是一个猜测?考虑到您正在转移大量数据,我想说可选语言行会快得多,除非您有一些非常糟糕的索引
Have you tested the LEFT JOIN and ISNULL? or is this just a guess? Given you are shifting a lot of data around, I'd say the optional language row would be far quicker unless you have some really bad indexing
我不确定这对于您的特定情况是否可行,但为什么不让 UI 界面层或应用程序层通过通用本地化模式处理翻译呢?
I'm not sure if this is feasible for your particular situation, but why not let the UI interface layer or Application layer handle the translation via a common localization pattern?
如果该语言记录中没有您的产品名称,请在该字段中输入
NULL
。当您执行查询时,请使用COALESCE
将NULL
替换为您的英文产品名称。我建议的设计如下所示:
由于西班牙语有一个
NULL
条目,因此COALESCE
会跳过 null 并放入默认产品名称。If there is no name for your product in that language record, put a
NULL
in that field. When you do your query, useCOALESCE
to replace theNULL
with your English product name.My proposed design would look something like this:
Since Spanish has a
NULL
entry, theCOALESCE
skips the null and puts in the default product name.