重复数据还是更好的性能？

发布于 2024-10-20 00:11:38 字数 1424 浏览 3 评论 0原文

我们需要用不同的语言显示我们的产品名称，但只有其中一些产品的名称不是英文的。当我们查询某种语言的产品时，如果缺少某种语言名称，我们希望显示默认的英文名称。

为了获得更好的查询性能，当缺少某种语言的名称时，我们必须将默认的英文名称填充到语言相关产品名称表中（languageid + Productid 为主键）。它在此语言相关表中产生了大量重复名称，并且当默认英文名称更改时更新此表有点困难。

目前，该表中有约30万个产品，约30种语言，超过8,000,000行，至少90%以上的数据是重复的，并填充了默认的英文名称。但如果我们在查询中使用left join和isnull检查，查询性能会慢很多。

谁可以给我推荐一个更好的数据库设计，可以避免填充重复数据并具有更好的查询性能？

当前的表架构如下所示，

Table1 (about 300,000 rows)
ProductId   | Country        | Currency  | others fields
------------|----------------|-----------|---------------
Product A   | US             | USD       | ...
Product B   | GB             | GBP       | ...

Table2 (about 9,000,000 rows)
LanguageId  | ProductId      | Product Name
------------|----------------|--------------------------
English     | Product A      | Product A Name
English     | Product B      | Product B Name
German      | Product A      | Produkt A Name
German      | Product B      | Product B Name (it's filled by English name)

我尝试过下面的查询以避免重复数据，但性能有点差。

SELECT
    A.ProductId,
    A.Country,
    ISNULL(B1.ProductName, B2.ProductName) as ProductName
FROM
    Table1 A (NOLOCK)
    LEFT JOIN Table2 B1 (NOLOCK) on A.ProductId = B1.ProductId
    LEFT JOIN Table2 B2 (NOLOCK) on A.ProductId = B2.ProductId and B2.LanguageId = 'ENGLISH'
WHERE
    B1.LanguageId = 'German'
ORDER BY
    ISNULL(B1.ProductName, B2.ProductName)

原文

We need to display our product name in different languages, but only some of the them have name in different language than English. When we query products with the certain language, we want to show the default name in English if the certain language name is missing.

To get the better query performance, we have to fill the default English name to the language dependent product name table (languageid + productid is the primary key) when the name for the certain language is missing. It made lots of duplicate name in this language dependent table and it's a bit difficult to update this table when the default English name changed.

Currently, we have about 300,000 products with about 30 languages and more than 8,000,000 rows in this table, at least more than 90% data is duplicate and fill with default English name. But if we use left join and isnull check in the query, the query performance will be much slower.

Who can recommend me a better database design that I can avoid to fill the duplicate data and have a better query performance?

The current tables schema like below

Table1 (about 300,000 rows)
ProductId   | Country        | Currency  | others fields
------------|----------------|-----------|---------------
Product A   | US             | USD       | ...
Product B   | GB             | GBP       | ...

Table2 (about 9,000,000 rows)
LanguageId  | ProductId      | Product Name
------------|----------------|--------------------------
English     | Product A      | Product A Name
English     | Product B      | Product B Name
German      | Product A      | Produkt A Name
German      | Product B      | Product B Name (it's filled by English name)

I have tried below query to avoid duplicate data, but the performance was a bit worst.

SELECT
    A.ProductId,
    A.Country,
    ISNULL(B1.ProductName, B2.ProductName) as ProductName
FROM
    Table1 A (NOLOCK)
    LEFT JOIN Table2 B1 (NOLOCK) on A.ProductId = B1.ProductId
    LEFT JOIN Table2 B2 (NOLOCK) on A.ProductId = B2.ProductId and B2.LanguageId = 'ENGLISH'
WHERE
    B1.LanguageId = 'German'
ORDER BY
    ISNULL(B1.ProductName, B2.ProductName)

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

桃酥萝莉 2024-10-27 00:11:38

您测试过 LEFT JOIN 和 ISNULL 吗？或者这只是一个猜测？考虑到您正在转移大量数据，我想说可选语言行会快得多，除非您有一些非常糟糕的索引

SELECT
   ...,
   ISNULL(L.languageproductName, P.productname)
FROM
   Product P
   LEFT JOIN
   LangaugeStuff L ON P.productID = L.productID AND L.languageID = @Mylanguage

Have you tested the LEFT JOIN and ISNULL? or is this just a guess? Given you are shifting a lot of data around, I'd say the optional language row would be far quicker unless you have some really bad indexing

SELECT
   ...,
   ISNULL(L.languageproductName, P.productname)
FROM
   Product P
   LEFT JOIN
   LangaugeStuff L ON P.productID = L.productID AND L.languageID = @Mylanguage

回复收藏 0 原文

无人接听 2024-10-27 00:11:38

我不确定这对于您的特定情况是否可行，但为什么不让 UI 界面层或应用程序层通过通用本地化模式处理翻译呢？

回复收藏 0 原文

伏妖词 2024-10-27 00:11:38

如果该语言记录中没有您的产品名称，请在该字段中输入 NULL。当您执行查询时，请使用 COALESCE 将 NULL 替换为您的英文产品名称。

SELECT COALESCE(l.ProductName, 'Product Name')
FROM Language l

我建议的设计如下所示：

Language | ProductName | TitleMenu
----------------------------------
English  | Widgetizer  | Title
French   | La Widgette | La Title
Spanish  |             | El Title

由于西班牙语有一个 NULL 条目，因此 COALESCE 会跳过 null 并放入默认产品名称。

If there is no name for your product in that language record, put a NULL in that field. When you do your query, use COALESCE to replace the NULL with your English product name.

SELECT COALESCE(l.ProductName, 'Product Name')
FROM Language l

My proposed design would look something like this:

Language | ProductName | TitleMenu
----------------------------------
English  | Widgetizer  | Title
French   | La Widgette | La Title
Spanish  |             | El Title

Since Spanish has a NULL entry, the COALESCE skips the null and puts in the default product name.

回复收藏 0 原文

~没有更多了~