MySQL: Using char(n) vs. decimal(n) with zerofill
I was asked to use a database in which most of the primary keys, and other fields as well, use char(n) to store numeric values with padding, for example:
product_id: char(8) [00005677]
user_id: char(6) [000043]
category_id: char(2) [05]
The reason they want it that way is to be able to use characters (in the far future) if they wish. However, they have many rules based on numbers. For example, a category_id from 01 to 79 corresponds to a general category, 80 to 89 is a special category, and 90 to 99 is a user-defined category.
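The range rules above can be sketched in a few lines. This is only an illustration in Python, not code from the system in question: the function name `classify_category` and the returned labels are assumptions, and the code 00, which the rules do not mention, is treated as invalid.

```python
def classify_category(category_id: str) -> str:
    """Map a zero-padded, two-character category_id to its rule group."""
    # Reject anything that is not exactly two ASCII digits, e.g. "5", " 5", "AB".
    if len(category_id) != 2 or not (category_id.isascii() and category_id.isdigit()):
        raise ValueError(f"not a two-digit category id: {category_id!r}")
    number = int(category_id)
    if 1 <= number <= 79:
        return "general"
    if 80 <= number <= 89:
        return "special"
    if 90 <= number <= 99:
        return "user-defined"
    # "00" is not covered by the stated rules (assumption: treat it as invalid).
    raise ValueError(f"category id outside the documented ranges: {category_id!r}")
```

Note that the check has to convert to a number before applying the ranges, which is exactly the kind of rule that stops working once letters are allowed in the column.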
I personally think that using char(n) to store numbers is a bad practice. My reasons are:
- Using char, " " != 0, 0 != 00, 05 != 5, 00043 != 000043, and so on. For that reason, the values have to be constantly checked (to prevent data corruption).
- If I pad a number (0 -> 00), then I have to pay attention not to pad a character (A -> 0A).
- If characters are used, then ranges become strange, something like: from 01 to 79 and AB and RX and TZ and S, etc.
- Indexing numbers instead of chars results in a performance gain.
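The first point can be illustrated with plain string comparison. This is a minimal sketch in Python; Python's string semantics stand in for a character-by-character comparison here, and MySQL's actual behaviour also depends on the column's collation and on implicit type conversion, so treat it as an analogy rather than a description of the server.

```python
# String comparison is character by character, so padding changes identity.
assert "05" != "5"
assert "00043" != "000043"
assert " " != "0"

# Numeric comparison ignores padding.
assert int("05") == int("5")

# Equal-width, zero-padded digit strings sort the same way as their numbers...
codes = ["79", "05", "80", "01"]
assert sorted(codes) == sorted(codes, key=int)

# ...but a single unpadded value breaks the ordering: "5" sorts after "10".
mixed = ["10", "5", "79"]
sorted_as_text = sorted(mixed)              # ['10', '5', '79']
sorted_as_numbers = sorted(mixed, key=int)  # ['5', '10', '79']
```

As long as every stored value is padded to the full column width, text order and numeric order agree; the risk lies in the one value that slips through unpadded.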
I'm proposing to change it to decimal(n) with zerofill to make it more "error-proof", as this information is modified by different sources (web, Windows client, CSV upload). If they want to add more categories, for example, then updating from decimal(2) to decimal(3) will be easier.
My question then is: Am I wrong? Can char(n) be trusted for this task? If chars are "evil" for numbers, then which other disadvantages am I missing in the above list (I may need better reasons if I want to win my case)?
TIA (any comment/answer will be appreciated).
Comments (2)
If this were SQL Server or Oracle or any other RDBMS, I would recommend enforcing a check constraint on those columns so that the data always matches the full capacity of the column. This would ensure your identifiers are uniform.
Unfortunately, MySQL didn't enforce check constraints at the time of writing: versions before 8.0.16 parse a CHECK clause but ignore it.
While it wouldn't stop the annoyance of having to pad things coming into the database or in search routines, on the client or in procs in the database, it would guarantee you that the fields were clean at the lowest level.
I find that using constraints like this helps avoid things getting badly out of hand.
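Where the database itself cannot enforce such a rule, the same check can be applied in application code before a value is written. The following is a sketch in Python under the assumption that identifiers are digits only for now; `is_full_width` is a hypothetical helper, not part of any library.

```python
import re


def is_full_width(value: str, width: int) -> bool:
    """Return True if value is exactly `width` ASCII digits, e.g. '000043' for width 6."""
    # The pattern becomes "[0-9]{6}" for width 6; fullmatch must consume the whole string.
    return re.fullmatch(rf"[0-9]{{{width}}}", value) is not None
```

For example, `is_full_width("000043", 6)` accepts the value, while `"43"`, `" 00043"` and `"0000A3"` are rejected. If letters are allowed later, only the character class in the pattern would need to change.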
As far as the optimization by using numbers, if they have to accommodate non-numeric characters in the future, that's not going to be an option.
It is very common to have natural keys (which could be candidates for a primary key) with varchar/char data, yet to enforce referential integrity on surrogate keys instead (usually some kind of autonumbering integer which is simply an internal reference, and often the clustered index and primary key).
Quoting your question:
You did not show any examples of numeric data, only character data that happens to consist of numbers. If you had said that their OrderTotal column was a char(10), then I'd start to worry. Just treat this as character data and you will be fine. I can see no business or technical case to change the database (unless you are beginning a near-total rewrite).
Regarding performance... If this is actually a concern, then you most likely have far bigger issues to deal with. MySQL is fast and accurate.
--
Write a function somewhere that will zerofill user-entered IDs for the purpose of querying. Use this function everywhere you need to accept user input. NEVER EVER use a numeric data type to store your data (if PHP, never use "+", always use "." to concat, etc.). Remember, this is no different than Item_Number = "SHIRT123" or any other string ID you may encounter.
Take care
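Such a zerofill function might look like the sketch below. It is written in Python rather than PHP for illustration; the name `zero_fill_id` and the choice to leave non-numeric IDs unpadded (so that A does not become 0A, the concern raised in the question) are assumptions, not something either poster specified.

```python
def zero_fill_id(user_input: str, width: int) -> str:
    """Left-pad a digit-only ID with zeros to `width`; return other IDs unchanged."""
    cleaned = user_input.strip()
    if not cleaned:
        raise ValueError("empty id")
    if len(cleaned) > width:
        raise ValueError(f"id {cleaned!r} is longer than {width} characters")
    # Only pad purely numeric input, so "A" does not become "0A".
    if cleaned.isascii() and cleaned.isdigit():
        return cleaned.zfill(width)
    return cleaned
```

With this, `zero_fill_id("43", 6)` gives `"000043"`, and the result is always a string, so it can be passed to a parameterized query without ever being treated as a number.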