Can good object-oriented design be formalised like good relational database design?
In the database world, we have normalisation. You can start with a design, crank the steps and end up with a normal form of the database. This is done on the basis of the semantics of the data and can be thought of as a series of design refactorings.
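To make "crank the steps" concrete, here is a minimal sketch of one normalisation step treated as a design refactoring. The table, the column names and the `decompose`/`rejoin` helpers are all made up for illustration; the assumed data semantics are that `customer_city` depends on `customer_id` rather than on `order_id`.

```python
# Hypothetical flat relation: customer_city is repeated on every order
# because it depends on customer_id, not on the key order_id.
orders_flat = [
    {"order_id": 1, "customer_id": "c1", "customer_city": "Leeds", "total": 40},
    {"order_id": 2, "customer_id": "c1", "customer_city": "Leeds", "total": 15},
    {"order_id": 3, "customer_id": "c2", "customer_city": "Perth", "total": 99},
]


def decompose(rows):
    """Split the flat relation into an orders relation and a customers relation."""
    customers = {}
    orders = []
    for row in rows:
        cid = row["customer_id"]
        city = row["customer_city"]
        # Two different cities for one customer would contradict the
        # assumed dependency customer_id -> customer_city.
        if customers.setdefault(cid, city) != city:
            raise ValueError(f"customer_id {cid} maps to two cities")
        orders.append(
            {"order_id": row["order_id"], "customer_id": cid, "total": row["total"]}
        )
    return orders, customers


def rejoin(orders, customers):
    """Join the two relations back together on customer_id."""
    return [
        {
            "order_id": o["order_id"],
            "customer_id": o["customer_id"],
            "customer_city": customers[o["customer_id"]],
            "total": o["total"],
        }
        for o in orders
    ]


orders, customers = decompose(orders_flat)
# The step loses no information: the join reproduces the original rows.
assert rejoin(orders, customers) == orders_flat
```

The point of the sketch is that the step is driven by a stated dependency in the data and can be checked mechanically, which is the property I am asking about for OO.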
In object-oriented design, we have the SOLID principles and various other ad hoc guidelines towards good design.
Do you think it is possible to define the equivalent of normal forms for OO, such that a series of refactoring steps could move a procedural piece of code (or poorly factored OO design) into a correct (in some well-defined sense) formulation with the same functionality?
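As a rough illustration of the kind of step I have in mind, here is a sketch of one well-known refactoring (replacing a type-tag conditional with polymorphism) applied to procedural code. The shapes, the `area_procedural` function and the `to_object` helper are hypothetical; this is not a proposed normal form, only an example of a step whose "same functionality" claim can be checked.

```python
import math


# Procedural starting point: behaviour is selected by inspecting a type tag.
def area_procedural(shape):
    if shape["kind"] == "circle":
        return math.pi * shape["r"] ** 2
    if shape["kind"] == "rect":
        return shape["w"] * shape["h"]
    raise ValueError(f"unknown kind: {shape['kind']}")


# After the refactoring: each kind owns its own behaviour.
class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return math.pi * self.r ** 2


class Rect:
    def __init__(self, w, h):
        self.w = w
        self.h = h

    def area(self):
        return self.w * self.h


def to_object(shape):
    """Map an old tagged record onto the new class hierarchy."""
    if shape["kind"] == "circle":
        return Circle(shape["r"])
    if shape["kind"] == "rect":
        return Rect(shape["w"], shape["h"])
    raise ValueError(f"unknown kind: {shape['kind']}")


shapes = [
    {"kind": "circle", "r": 1.5},
    {"kind": "rect", "w": 2, "h": 3},
]
# Both formulations must agree on every input we try.
assert all(area_procedural(s) == to_object(s).area() for s in shapes)
```

What is missing, compared with the database case, is a definition of the target form that tells you when to stop refactoring.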
(NB. Happy to make this community wiki)
It is a possibility, but highly unlikely.
Context
First, in the days when the Relational Model came out, people who worked in IT were more educated and they esteemed standards. Computer resources were expensive, and people were always looking for the best way to use those resources. People like Codd and Date were giants in an industry where people were high tech.
Codd did not invent Normalisation; we were Normalising our non-relational databases long before Relational came along. Normalisation is a theory and practice, published as the Principle of Full Normalisation. We were Normalising our programs; we considered accidental duplication of a subroutine (method) a serious error. Nowadays it is known as Never Duplicate Anything or Don't Repeat Yourself, but the recent versions do not acknowledge the sound academic theory behind it, and therefore its power is unrealised.
What Codd did (among many things) was define formal Normal Forms specifically for Relational Databases. And these have progressed and been refined since then. But they have also been hijacked by non-academics for the purpose of selling their gear.
The database modelling that was invented by Codd and Chen, and finished by Brown, had a solid grounding. In the last 25 years, it has achieved Standardisation and been further refined and progressed by many others who had solid grounding.
The World Before OO
Let's take the programming world before OO. We had many standards and conventions, for modelling our programs, as well as for language- and platform-specific implementation. Your question simply would not apply in those days. The entire industry understood deeply that database design and program design were two different sciences, and used different modelling methodologies for them, plus whatever standards applied. People did not discuss if they implemented standards, they discussed the extent to which they complied with standards; they did not discuss if they modelled their data and programs, they discussed the extent to which they modelled their data and programs. That is how we put men on the Moon, notably in 1969.
Dawn of OO
OO came along and presented itself as if no other programming language or design methodology existed before it. Instead of using existing methodologies and extending or changing them, it denied their existence. So, not surprisingly, it has taken 20 years to formulate the new methodologies from scratch and slowly progress them to the point of SOLID and Agile, which are still not mature; that is the reason for your question. It is telling that more than twenty such methodologies have flashed up and died during that time.
Even UML, which could have been an outright winner, applicable to any programming language, suffered the same disease. It tried to be everything to everyone while denying that mature methodologies existed.
Demise of the Industry
With the advent of MS and the attitude of "anyone can do anything" (implication: you do not need formal education or qualifications), that quality and pride of profession has been lost. People now invent things from scratch as if no one on the planet has ever done it before. The IT industry today is very low tech. You know, but most people reading these pages do not know, that there is one Relational Modelling methodology, and one Standard. They do not model, they implement. Then re-implement. And re-implement. Re-factoring, as you say.
OO Proponents
The problem was that the people who came up with these OO methods were not giants among professionals; they were simply the most vocal of an un-academic lot. Famous due to publishing books, not due to peer acknowledgement. Unskilled and unaware. They had One Hammer in their toolkit, and every problem looked like a nail. Since they were not formally educated they did not know that actually database design and program design are two different sciences; that database design was quite mature, had strongly established methodologies and standards, and they simply applied their shiny new hammer to every problem, including databases.
Therefore, since they were ignoring both programming methodologies and database methodologies, inventing the wheel from scratch, those new methodologies have progressed very slowly. And with assistance from a similar crowd, without sound academic basis.
Programs today have hundreds of methods that are not used. We now have programs to detect that. Whereas with the mature methodologies, we prevent that. Thin client was not a goal to be achieved, we had a science that produced it. We now have programs to detect "dirty" data and to "clean" it. Whereas in the upper end of the database market, we simply do not allow "dirty" data into the database in the first place.
I accept that you see database design as a series of re-factorings; I understand what you mean. To me it is a science (methodology, standards) that eliminates ever having to re-factor. Even the acceptance of re-factoring is a loud signal that the older programming methodologies are unknown, and that the current OO methodologies are immature. The danger, and what makes it annoying to work with OO people, is that the methodology itself fosters a confidence in the One Hammer mentality, and when the code breaks, they have not one leg to stand on; when the system breaks, the whole system breaks, it is not one small piece that can be repaired or replaced.
Take Scott Ambler and Agile. Ambler spent 20 years publicly and vociferously arguing with the giants of the database industry, against Normalisation. Now he has Agile, which, although immature, has promise. But the secret behind it is Normalisation. He has switched tracks. And because of his past wars, he cannot come out and declare that honestly, and give others due credit, so it remains a secret, and you are left to figure out Agile without its fundaments being declared.
Prognosis
That is why I say, given the evidenced small progress in the OO world over the last 20 years, the 20 or so OO methodologies that have failed, and the shallowness of the approach, it is highly unlikely that the current OO methodologies will achieve the maturity and acceptance of the (singular) database design methodology. It will take at least another 10 years, more likely 20, and it will be overtaken by some replacement for OO.
For it to be a possibility two things need to happen:
The OO proponents need formal tertiary education. A good grounding in the science of programming. Sure, anyone can do anything, but to do great things, we need a great grounding. That will lead to the understanding that re-factoring is not necessary, that it can be eliminated by science.
They need to break their denial of other programming methodologies and standards. That will open the door to either building OO on top of that, or taking the fundaments of that and merging it into OO. That will lead to a solid and complete OO methodology.
Real World OO
Obviously I speak from experience. On our large projects we use the mature analysis and design methodologies, one for database and another for function. When we get to the code-cutting stage, we let the OO team use whatever they like, for their objects only, which usually means UML. No problems with architecture or structure or performance or bloatware or One Hammer or hundreds of unused objects, because all that was taken care of outside OO. And later, during UAT, no problems with finding the source of bugs or making the required changes quickly, because the entire structure is documented; the blocks can be changed.
I think this is an interesting question, because it presumes that Codd's Normal Forms are actually the definition of "correct" design. Not trying to start a flame war with that statement, but I guess my point is that there are very good reasons why many DBs aren't fully normalized (e.g. join performance), which leads me to think that the real-world equivalent of normalization in OO space is probably design patterns or (as you said) SOLID. In both cases you're talking about idealized guidelines that have to be applied with a suitably critical eye, rather than slavishly followed as dogma.
Not only do I fully agree with Paul, but I will go a step further.
Models are just that - only models. The Normalization models used by Relational Databases are only one approach to storing and managing data. In fact, note that while RDBMS's are common for Data Manipulation operations (the standard CRUD), we have now evolved the DataWarehouse for consolidation, analysis, and reporting. And it most definitely does NOT adhere to the normalization models found in DML land.
Now we also have Google with their BigTable architecture, and Apache with Hadoop. These newer modeling systems reflect a change in the landscape, driven by the idea of the DISTRIBUTED database. Normalization need not apply for this club either.
We can apply a successful model only to the point at which it becomes not-so-successful, or is supplanted by a model which better suits the needs of the designer. Note the many ways we humans have modelled our universe through physics, astronomy, what have you. Modelling attempts to describe a system in discrete terms, but as the system, or the needs of the system, change, so must the model.
OOP is and has been a very, very successful way to model computer applications. However, the needs of the application designer are different from those of database designers. MOST of the time, there is a point at which the designer of an application must consider that his program will be interacted with by humans. Unlike the database designer, whose work will (mostly) be expected to interact with other code, the programmer's job is to take the machine and make it accessible to a much more random human being. This art does not map quite so well to standards like normalization.
All that said, n-tier, MVC, MVVC, and other paradigms DO establish some guidelines. But in the end, the problem-space of application design is usually not as easy to fit into such discrete modelling steps as a relational database.
Wow. Apologies for the length. If this is a breach of etiquette here, do let me know . . .