SQL Data Normalisation / Performance

Posted on 2024-07-12 06:35:51

I am working on a web API for the insurance industry and trying to work out a suitable data structure for the quoting of insurance.

The database already contains a "ratings" table which is basically:

sysID (PK, INT IDENTITY)
goods_type (VARCHAR(16))
suminsured_min (DECIMAL(9,2))
suminsured_max (DECIMAL(9,2))
percent_premium (DECIMAL(9,6))
[Unique Index on goods_type, suminsured_min and suminsured_max]

[edit]
Each type of goods typically has 3 - 4 ranges for suminsured
[/edit]
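
For reference, a minimal T-SQL sketch of the ratings table as described above (the constraint name is an assumption, not taken from the real schema):

CREATE TABLE ratings (
    sysID           INT IDENTITY(1,1) PRIMARY KEY,
    goods_type      VARCHAR(16)  NOT NULL,
    suminsured_min  DECIMAL(9,2) NOT NULL,
    suminsured_max  DECIMAL(9,2) NOT NULL,
    percent_premium DECIMAL(9,6) NOT NULL,
    -- unique index on goods_type plus the sum-insured range, as described
    CONSTRAINT UQ_ratings_goods_range UNIQUE (goods_type, suminsured_min, suminsured_max)
);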

The list of goods_types rarely changes and most queries for insurance will involve goods worth less than $100. Because of this, I was considering de-normalising using tables in the following format (for all values from $0.00 through to $100.00):

Table Name: tblRates[goodstype]
suminsured (DECIMAL(9,2)) Primary Key
premium (DECIMAL(9,2))

The de-normalised data should be easy to maintain, as the rates are generally updated at most once per month. All requests for values over $100 will always be looked up in the primary tables and calculated.
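
To make the proposal concrete, here is a sketch of one such lookup table and one way it could be regenerated from the main ratings table each month. The table name, the 'LAPTOPS' goods type and the assumption that percent_premium is a percentage of the sum insured are all illustrative; it also assumes the rate bands for a goods type do not overlap, including at their boundaries:

CREATE TABLE tblRatesLAPTOPS (
    suminsured DECIMAL(9,2) PRIMARY KEY,
    premium    DECIMAL(9,2) NOT NULL
);

-- Monthly refresh: rebuild the 10,001 pre-calculated rows ($0.00 to $100.00 in $0.01 steps).
TRUNCATE TABLE tblRatesLAPTOPS;

WITH nums AS (
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS n
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO tblRatesLAPTOPS (suminsured, premium)
SELECT CAST(n / 100.0 AS DECIMAL(9,2)),
       CAST(n / 100.0 * r.percent_premium / 100.0 AS DECIMAL(9,2))
FROM nums
JOIN ratings AS r
  ON r.goods_type = 'LAPTOPS'
 AND n / 100.0 BETWEEN r.suminsured_min AND r.suminsured_max
WHERE n <= 10000;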

My question(s) are:
1. Am I better off storing the suminsured values as DECIMAL(9,2) or as a value in cents stored in a BIGINT? (See the sketch below.)
2. This de-normalisation method involves storing 10,001 values ($0.00 to $100.00 in $0.01 increments) in possibly 20 tables. Is this likely to be more efficient than looking up the percent_premium and performing a calculation? - Or should I stick with the main tables and do the calculation?
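
For question 1, a sketch of what the cents-in-BIGINT alternative could look like for one of the proposed lookup tables; the table and column names are hypothetical:

CREATE TABLE tblRatesLAPTOPS_cents (
    suminsured_cents BIGINT PRIMARY KEY,  -- $100.00 stored as 10000
    premium_cents    BIGINT NOT NULL
);

-- Converting back to dollars when returning a quote:
SELECT CAST(suminsured_cents / 100.0 AS DECIMAL(9,2)) AS suminsured,
       CAST(premium_cents    / 100.0 AS DECIMAL(9,2)) AS premium
FROM tblRatesLAPTOPS_cents;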

Comments (3)

半城柳色半声笛 2024-07-19 06:35:51

Don't create new tables. You already have a unique index on goods_type, suminsured_min and suminsured_max, so this SQL (for a known goods type and its value):

SELECT percent_premium
FROM ratings
WHERE goods_type = 'PRECIOUST' AND :PREC_VALUE BETWEEN suminsured_min AND suminsured_max

will use your index efficiently.

The data type you are looking for is smallmoney. Use it.
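
Building on that, a sketch of the full quote calculation as a single statement against the same index; it assumes percent_premium is a percentage of the sum insured, and the variable names and values are only examples:

DECLARE @goods_type VARCHAR(16)  = 'PRECIOUST';
DECLARE @suminsured DECIMAL(9,2) = 75.00;

-- Same seekable predicate as above, with the premium computed in the same statement.
SELECT CAST(@suminsured * percent_premium / 100.0 AS smallmoney) AS premium
FROM ratings
WHERE goods_type = @goods_type
  AND @suminsured BETWEEN suminsured_min AND suminsured_max;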

我们的影子 2024-07-19 06:35:51

The plan you suggest will use a binary search on 10001 rows instead of 3 or 4.

It's hardly a performance improvement; don't do that.

As for the arithmetic, BIGINT will be slightly faster, though I think you will hardly notice the difference.

零度° 2024-07-19 06:35:51

I am not entirely sure what calculations we are talking about, but unless they are obnoxiously complicated, they will more than likely be much quicker than looking up data in several different tables. If possible, perform the calculations in the DB (i.e. use stored procedures) to minimise the data traffic between your application layers too.

And even if the data loading were quicker, I think the idea of having to update de-normalised data as often as once a month (or even once a quarter) is pretty scary. You can probably do the job quickly enough, but what about the next person handling the system? Would you require them to learn the DB structure, remember which of the 20-some tables need to be updated each time, and do it correctly? I would say the possible performance gain from de-normalising is not worth the risk of contaminating the data with incorrect information.
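
A minimal sketch of such a stored procedure, again assuming percent_premium is a percentage of the sum insured (the procedure name and parameters are hypothetical):

CREATE PROCEDURE dbo.usp_GetQuote
    @goods_type VARCHAR(16),
    @suminsured DECIMAL(9,2)
AS
BEGIN
    SET NOCOUNT ON;

    -- One indexed lookup plus the premium calculation, kept inside the database.
    SELECT CAST(@suminsured * r.percent_premium / 100.0 AS DECIMAL(9,2)) AS premium
    FROM ratings AS r
    WHERE r.goods_type = @goods_type
      AND @suminsured BETWEEN r.suminsured_min AND r.suminsured_max;
END;

The API would then just call EXEC dbo.usp_GetQuote @goods_type = 'PRECIOUST', @suminsured = 75.00; and return the single premium value.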
