It's hard to understand what you are describing without seeing an example. But it sounds like you're talking about a database design antipattern that's often called One True Lookup Table.
-- One table holds every parameter of every type as a string.
CREATE TABLE Parameters (
  key   VARCHAR(20)  PRIMARY KEY,
  value VARCHAR(255) NOT NULL
);
The problems with this design include:
You can't use the SQL data type that fits best, because the value column must accommodate integers, dates, strings -- all values for all possible parameter types.
You can't use constraints to limit the values for a given parameter type. For example you might want a CHECK constraint that makes sure postal codes fit a certain pattern. But you can't, because then all parameters of all types would have to conform to the same pattern.
You can't use this table as a lookup table to constrain other tables. For example, you might want a country column in another table to reference the list of countries and permit no other value. But because all lookup data is stored in the same column, your country column could reference any value of any parameter type. There are no countries 'M' or 'F', but your database would permit country to reference them.
You can't create a real FOREIGN KEY constraint anyway, unless the Parameters.value column has a UNIQUE constraint on it, which it couldn't because 'M' could be either "male" or "medium shirt size" and therefore appear twice in the Parameters table.
You can't guarantee that a given key exists at all, because there's no way to force a table to contain a row with a certain primary key. This is what NOT NULL is for in a conventional database design, where parameters belong in columns, not rows.
The table may grow large, so that it becomes inefficient to look up values that rightly belong in a small set. Your list of application configuration parameters is stored with lists of countries and postal codes and so on.
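As a rough illustration of what you get back when each lookup has its own table, here is a minimal sketch using Python's built-in sqlite3 module. The Countries and Addresses tables, their column names, and the five-digit postal code pattern are assumptions made up for this example, not part of your schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Countries (
    country_code CHAR(2) PRIMARY KEY
);
CREATE TABLE Addresses (
    address_id   INTEGER PRIMARY KEY,
    country_code CHAR(2) NOT NULL REFERENCES Countries (country_code),
    -- Assumed pattern: exactly five digits.
    postal_code  VARCHAR(10) NOT NULL
        CHECK (postal_code GLOB '[0-9][0-9][0-9][0-9][0-9]')
);
INSERT INTO Countries (country_code) VALUES ('US'), ('FR');
""")


def try_insert(country_code, postal_code):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Addresses (country_code, postal_code) VALUES (?, ?)",
                (country_code, postal_code),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("US", "94105")          # valid country, valid postal code
rejected_country = try_insert("M", "94105")   # 'M' is not a country
rejected_postal = try_insert("US", "ABCDE")   # fails the CHECK constraint
```

With a single Parameters table, neither the foreign key nor the CHECK constraint could be declared this way, because the same column would have to serve every parameter type.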
Do you think it is a bad practice to let the HTML contain the PKs of a web app? I think it is, and I propose to use GUIDs instead of PKs.
The most common warning about revealing PKs in the user-visible layer of a web app is that it can give attackers information about how to address parts of your data. It sounds like you're still showing a value that identifies each row in your web app; you've simply traded integers for GUIDs. I don't see how this is better.
update: To be clear, if your code allows users to do something illicit by knowing the PK value, then it would also allow the same illicit action if the user knows some surrogate value that maps to a PK value. You haven't added any protection by using a surrogate value.
Is it then necessary to "take the risk"? What can be done instead?
Defensive programming. Don't assume that users will only click on links provided for them. They could edit the link to specify some other PK value and submit that. Attackers are adept at this sort of thing.
Your PHP script should check that the current user's session is logged into an account that has privilege to change the password for account 12345. Don't just assume that it's okay because the web app wouldn't have shown the user a link to do something they don't have privilege to do. Even if the app is correct, an attacker can change the values to anything they want, and submit that.
You must write code in your app to check that the user has privilege to use the data they request, assuming they can request data even if they don't have privilege to see it. If you can do this, you reduce the risk of exposing pk values.
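Here is a minimal sketch of that kind of check. It is written in Python rather than PHP, and the "owner or administrator" rule, the ADMIN_ACCOUNTS set, and the function names are all hypothetical; your own privilege model will differ.

```python
# Hypothetical stand-in for a privilege table: accounts allowed to act on any account.
ADMIN_ACCOUNTS = {1}


class AccessDenied(Exception):
    """Raised when the logged-in account may not act on the requested record."""


def authorize_password_change(session_account_id, target_account_id):
    """Allow the change only for the account's owner or an administrator (assumed policy)."""
    if session_account_id is None:
        raise AccessDenied("not logged in")
    is_owner = session_account_id == target_account_id
    if not is_owner and session_account_id not in ADMIN_ACCOUNTS:
        raise AccessDenied("no privilege on account %d" % target_account_id)


def change_password(session_account_id, target_account_id, new_password_hash, store):
    # target_account_id comes from the request, so treat it as attacker-controlled.
    authorize_password_change(session_account_id, target_account_id)
    store[target_account_id] = new_password_hash
```

The point is that the check runs on every request, using the session's identity, no matter how the account number reached the script.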
update: Hiding PK values is security by obscurity, which is not an effective security strategy. Your code needs to check that the user has privilege to see or change the records for that PK. If you do this correctly, an attacker should get an "access denied" error if they try to do something they shouldn't.
If you have programmers who make mistakes, then structure your application to assume a user has no privileges, and require the programmer to write code to establish authorization before every type of action.
Also, use code tests to verify that a user can invoke a given task only when they have the right privileges, and a non-privileged user cannot invoke that task, and that they receive an appropriate error message. Require programmers to write tests for any functionality they touch.
I do prefer only one table with all parameters and a column with keys to reference them from upper layers.
No. OTLT is a bad design. See above.
What do you think about having a source file with the same information as the parametric tables? I've seen some projects that have source code listing every PK related to a parameter... is this a good practice?
No. The point of storing parameters in the database is so that you can update them by accessing the database, and they'll automatically take effect in other pages that use the data. If you have to update your code anyway to work with new values, then there's no advantage to having stored the parameters in a database. If that's true, you might as well store the values only in the code.
About the source code with parameter data: how then do I reference a specific parameter from client code? Suppose I have some logic in the business layer regarding gender that uses parameter data to work... how do I relate the two (I used to create constant tables)? I think I need at least a key hard-coded in the BL...
I'm guessing you use a surrogate key for everything...
Do you know how to use JOINs in SQL? You can join your table to the lookup table and search for the value instead of its surrogate key.
SELECT ... FROM People JOIN Genders USING (gender_id) WHERE gender = 'M'
For lookup tables, I like to use natural keys. Then you can search on the value 'M' instead of the surrogate key for that value.
SELECT ... FROM People WHERE gender = 'M'
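To make the natural-key version concrete, here is a small sketch run through Python's sqlite3 module. The table layouts and sample rows are invented for the example; only the People and Genders names and the gender column come from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Genders (
    gender CHAR(1) PRIMARY KEY      -- natural key: the code is the value
);
CREATE TABLE People (
    person_id INTEGER PRIMARY KEY,
    name      VARCHAR(50) NOT NULL,
    gender    CHAR(1) NOT NULL REFERENCES Genders (gender)
);
INSERT INTO Genders (gender) VALUES ('M'), ('F');
INSERT INTO People (name, gender) VALUES ('Alice', 'F'), ('Bob', 'M');
""")

# No JOIN and no hard-coded surrogate id: the business layer filters on 'M' directly.
names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM People WHERE gender = ? ORDER BY name", ("M",)
    )
]
```

The lookup table still constrains People.gender through the foreign key, so you keep the integrity check without needing a constant for each surrogate id in your code.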
Do you think it is relevant to create a caching scheme to keep parameter data?
Yes. Parameter data probably changes infrequently, and your performance can benefit from reducing the number of queries your app uses to fetch them. When you update the values, invalidate the respective entry in the cache.
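A minimal sketch of that idea in Python, assuming a single process and an in-memory dictionary as the cache. The ParameterCache class and the load/save callbacks are hypothetical; in a real app they would wrap your SELECT and UPDATE statements, and a shared cache would need its own invalidation mechanism.

```python
class ParameterCache:
    """Read-through cache: fetch from the backing store once, then serve from memory."""

    def __init__(self, load, save):
        self._load = load      # function(key) -> value, e.g. a SELECT
        self._save = save      # function(key, value), e.g. an UPDATE
        self._entries = {}

    def get(self, key):
        if key not in self._entries:
            self._entries[key] = self._load(key)
        return self._entries[key]

    def update(self, key, value):
        self._save(key, value)
        # Invalidate rather than overwrite, so the next read reloads the stored value.
        self._entries.pop(key, None)


# Usage, with a dictionary standing in for the database.
db = {"default_country": "US"}
queries = []


def load(key):
    queries.append(key)   # record each trip to the "database"
    return db[key]


cache = ParameterCache(load, db.__setitem__)
cache.get("default_country")
cache.get("default_country")          # served from the cache, no second query
cache.update("default_country", "FR")
value_after_update = cache.get("default_country")
```

Only the entry that changed is dropped, so the other cached parameters keep saving queries.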
Despite what Richard Harrison wrote in his answer to your other question, he's wrong. OTLT is a bad design and you'll regret using it.
See also OTLT and EAV: the two big design mistakes all beginners make