SQL Data Normalisation / Performance

Posted on 2024-07-12 06:35:51

I am working on a web API for the insurance industry and trying to work out a suitable data structure for the quoting of insurance.

The database already contains a "ratings" table which is basically:

sysID (PK, INT IDENTITY)
goods_type (VARCHAR(16))
suminsured_min (DECIMAL(9,2))
suminsured_max (DECIMAL(9,2))
percent_premium (DECIMAL(9,6))
[Unique Index on goods_type, suminsured_min and suminsured_max]

[edit]
Each type of goods typically has 3 - 4 ranges for suminsured
[/edit]
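
For reference, a minimal T-SQL sketch of the ratings table as described above (the constraint name is an assumption, not taken from the real schema):

CREATE TABLE ratings (
    sysID           INT IDENTITY(1,1) PRIMARY KEY,
    goods_type      VARCHAR(16)  NOT NULL,
    suminsured_min  DECIMAL(9,2) NOT NULL,
    suminsured_max  DECIMAL(9,2) NOT NULL,
    percent_premium DECIMAL(9,6) NOT NULL,
    -- unique index on goods_type plus the sum-insured range, as described
    CONSTRAINT UQ_ratings_goods_range UNIQUE (goods_type, suminsured_min, suminsured_max)
);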

The list of goods_types rarely changes and most queries for insurance will involve goods worth less than $100. Because of this, I was considering de-normalising using tables in the following format (for all values from $0.00 through to $100.00):

Table Name: tblRates[goodstype]
suminsured (DECIMAL(9,2)) Primary Key
premium (DECIMAL(9,2))

The de-normalised data should be easy to maintain, as the rates are generally updated at most once per month. All requests for values over $100 will always be looked up in the primary tables and calculated.
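
To make the proposal concrete, here is a sketch of one such lookup table and one way it could be regenerated from the main ratings table each month. The table name, the 'LAPTOPS' goods type and the assumption that percent_premium is a percentage of the sum insured are all illustrative; it also assumes the rate bands for a goods type do not overlap, including at their boundaries:

CREATE TABLE tblRatesLAPTOPS (
    suminsured DECIMAL(9,2) PRIMARY KEY,
    premium    DECIMAL(9,2) NOT NULL
);

-- Monthly refresh: rebuild the 10,001 pre-calculated rows ($0.00 to $100.00 in $0.01 steps).
TRUNCATE TABLE tblRatesLAPTOPS;

WITH nums AS (
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS n
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO tblRatesLAPTOPS (suminsured, premium)
SELECT CAST(n / 100.0 AS DECIMAL(9,2)),
       CAST(n / 100.0 * r.percent_premium / 100.0 AS DECIMAL(9,2))
FROM nums
JOIN ratings AS r
  ON r.goods_type = 'LAPTOPS'
 AND n / 100.0 BETWEEN r.suminsured_min AND r.suminsured_max
WHERE n <= 10000;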

My question(s) are:
1. Am I better off storing the suminsured values as DECIMAL(9,2) or as a value in cents stored in a BIGINT? (See the sketch below.)
2. This de-normalisation method involves storing 10,001 values ($0.00 to $100.00 in $0.01 increments) in possibly 20 tables. Is this likely to be more efficient than looking up the percent_premium and performing a calculation? - Or should I stick with the main tables and do the calculation?
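
For question 1, a sketch of what the cents-in-BIGINT alternative could look like for one of the proposed lookup tables; the table and column names are hypothetical:

CREATE TABLE tblRatesLAPTOPS_cents (
    suminsured_cents BIGINT PRIMARY KEY,  -- $100.00 stored as 10000
    premium_cents    BIGINT NOT NULL
);

-- Converting back to dollars when returning a quote:
SELECT CAST(suminsured_cents / 100.0 AS DECIMAL(9,2)) AS suminsured,
       CAST(premium_cents    / 100.0 AS DECIMAL(9,2)) AS premium
FROM tblRatesLAPTOPS_cents;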

Comments (3)

半城柳色半声笛 2024-07-19 06:35:51

Don't create new tables. You already have a unique index on goods_type, suminsured_min and suminsured_max, so this SQL (for a known goods type and its value):

SELECT percent_premium
FROM ratings
WHERE goods_type = 'PRECIOUST' AND :PREC_VALUE BETWEEN suminsured_min AND suminsured_max

will use your index efficiently.

The data type you are looking for is smallmoney. Use it.
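
Building on that, a sketch of the full quote calculation as a single statement against the same index; it assumes percent_premium is a percentage of the sum insured, and the variable names and values are only examples:

DECLARE @goods_type VARCHAR(16)  = 'PRECIOUST';
DECLARE @suminsured DECIMAL(9,2) = 75.00;

-- Same seekable predicate as above, with the premium computed in the same statement.
SELECT CAST(@suminsured * percent_premium / 100.0 AS smallmoney) AS premium
FROM ratings
WHERE goods_type = @goods_type
  AND @suminsured BETWEEN suminsured_min AND suminsured_max;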

我们的影子 2024-07-19 06:35:51

The plan you suggest will use a binary search on 10001 rows instead of 3 or 4.

It's hardly a performance improvement; don't do that.

As for the arithmetic, BIGINT will be slightly faster, though I think you will hardly notice the difference.

零度° 2024-07-19 06:35:51

I am not entirely sure what calculations we are talking about, but unless they are obnoxiously complicated, they will more than likely be much quicker than looking up data in several different tables. If possible, perform the calculations in the DB (i.e. use stored procedures) to minimise the data traffic between your application layers too.

And even if the data loading were quicker, I think the idea of having to update de-normalised data as often as once a month (or even once a quarter) is pretty scary. You can probably do the job quickly enough, but what about the next person handling the system? Would you require them to learn the DB structure, remember which of the 20-some tables need to be updated each time, and do it correctly? I would say the possible performance gain from de-normalising is not worth the risk of contaminating the data with incorrect information.
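
A minimal sketch of such a stored procedure, again assuming percent_premium is a percentage of the sum insured (the procedure name and parameters are hypothetical):

CREATE PROCEDURE dbo.usp_GetQuote
    @goods_type VARCHAR(16),
    @suminsured DECIMAL(9,2)
AS
BEGIN
    SET NOCOUNT ON;

    -- One indexed lookup plus the premium calculation, kept inside the database.
    SELECT CAST(@suminsured * r.percent_premium / 100.0 AS DECIMAL(9,2)) AS premium
    FROM ratings AS r
    WHERE r.goods_type = @goods_type
      AND @suminsured BETWEEN r.suminsured_min AND r.suminsured_max;
END;

The API would then just call EXEC dbo.usp_GetQuote @goods_type = 'PRECIOUST', @suminsured = 75.00; and return the single premium value.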
