Web development: object databases vs. relational databases
What are the pros and cons of using an object database or a relational database for regular web development that involves a lot of CRUD?
UPDATE: I reopened the bounty in order to award it to Neville.
Comments (8)
The concept of an OODBMS is quite broken, and the various commercial and free offerings that have emerged over the last few decades have barely made a dint in the marketplace.
The relational model is more powerful than object models in terms of the kinds of questions you can ask of your data. Unfortunately, SQL threw out much of the expressive power that the relational model is capable of, but even in this diluted form, it is still easier to express queries in SQL than in a typical OO database (be it ORM or OODBMS).
Queries in an OODBMS are predominantly driven by navigational operators, which means that if your sales database has sales people owning their sales, then querying for the monthly sales for a given SKU is not only likely to be cripplingly slow, but very awkward to express. Consider also a security model that grants employees access to buildings. Which is the correct way to express this? Should employees hold a collection of buildings they can access, or should buildings hold a collection of employees that have access to them? More to the point, why should either class have to have a collection of the other baked into its design? And, whichever one you choose, how would you ask which pairs of employees have more than one building they can access in common? There is no straightforward navigational pattern that can answer such a question. The sensible solution — an "Access" object — is essentially a reversion back to a properly normalised relational schema, and it requires some kind of query language that borrows heavily from the relational algebra in order to answer the question without a massive over-the-wire data transfer.
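The building-access question above can be sketched relationally. This is a minimal illustration using Python's built-in sqlite3 module; the access table, its column names, and the sample rows are assumptions made up for the example, not part of the original answer. The "Access" object becomes a plain two-column table, and the "pairs of employees sharing more than one building" question becomes a self-join with a GROUP BY.

```python
import sqlite3

# In-memory database; the schema and rows below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE access (
        employee TEXT NOT NULL,
        building TEXT NOT NULL,
        PRIMARY KEY (employee, building)
    );
    INSERT INTO access VALUES
        ('alice', 'HQ'), ('alice', 'Lab'), ('alice', 'Depot'),
        ('bob',   'HQ'), ('bob',   'Lab'),
        ('carol', 'HQ'), ('carol', 'Depot'),
        ('dave',  'Lab');
""")

# Join the table to itself on building; a.employee < b.employee keeps each
# unordered pair once and excludes pairing an employee with themselves.
shared_access_pairs = conn.execute("""
    SELECT a.employee, b.employee, COUNT(*) AS shared_buildings
    FROM access AS a
    JOIN access AS b
      ON a.building = b.building
     AND a.employee < b.employee
    GROUP BY a.employee, b.employee
    HAVING COUNT(*) > 1
    ORDER BY a.employee, b.employee
""").fetchall()

print(shared_access_pairs)
```

Neither the employee nor the building needs to hold a collection of the other, and only the question and the matching pairs cross the wire.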
Also consider another major strength touted for the OODBMS: methods, especially inheritance with virtual methods. A sports clinic might have different risk-of-injury metrics for different kinds of athlete. In the ORM world, this would be automatically expressed as a class hierarchy, with Athlete at the root, and a virtual method, int InjuryRiskScore(), implemented by each derived class. The problem is that this method is invariably implemented on the client, not at the back end, so if you want to find the 10 highest risk athletes across all sports at your clinic, the only way to do it is to fetch all athletes across the wire and pass them through a client-side priority queue. I don't know the OODBMS world as well, but I think the same problem occurs, since the storage engines generally only store enough data to rehydrate objects in the client's programming language. In the relational model or SQL, you would express risk-of-injury scoring as a view, which could be simply the union of per-athlete-type views. Then, you just ask the question. Or you can ask more complicated questions like, "Who had the greatest increase in their risk-of-injury since last month's checkup?" or even, "Which risk score has proven to be the best predictor of injury over the last year?". Most importantly, these questions can all be answered inside the DBMS with nothing more than the question and the answer travelling across the wire.
The relational model allows the DBMS to express knowledge in a highly distilled manner based on predicate logic, which allows the various dimensions of the facts you store therein to be joined, projected, filtered, grouped, summarised, and otherwise rearranged in a completely ad hoc manner. It allows you to easily cook up the data in ways that weren't anticipated when the system was originally designed. The relational model thus permits the purest expression of knowledge that we know of. In short, the relational model holds pure facts: nothing more, nothing less (and certainly not objects, or proxies thereof).
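The "risk-of-injury as a view" idea above might look roughly like this. It is a sketch only, again using Python's built-in sqlite3 module; the runner and swimmer tables, their columns, and both scoring formulas are invented for illustration and carry no clinical meaning. Each athlete type gets its own view, the views are unioned, and the top-10 question is answered inside the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical per-sport tables with made-up attributes.
    CREATE TABLE runner  (name TEXT PRIMARY KEY, weekly_km REAL, prior_injuries INTEGER);
    CREATE TABLE swimmer (name TEXT PRIMARY KEY, weekly_hours REAL, shoulder_pain INTEGER);

    INSERT INTO runner  VALUES ('ann', 80, 2), ('ben', 30, 0);
    INSERT INTO swimmer VALUES ('cat', 12, 1), ('dan', 5, 0);

    -- One view per athlete type, each with its own (invented) formula.
    CREATE VIEW runner_risk AS
        SELECT name, 'runner' AS sport,
               CAST(weekly_km / 10 + prior_injuries * 5 AS INTEGER) AS injury_risk_score
        FROM runner;
    CREATE VIEW swimmer_risk AS
        SELECT name, 'swimmer' AS sport,
               CAST(weekly_hours + shoulder_pain * 10 AS INTEGER) AS injury_risk_score
        FROM swimmer;

    -- The clinic-wide score is simply the union of the per-type views.
    CREATE VIEW injury_risk AS
        SELECT * FROM runner_risk
        UNION ALL
        SELECT * FROM swimmer_risk;
""")

# "Who are the 10 highest-risk athletes across all sports?"
top_risk = conn.execute("""
    SELECT name, sport, injury_risk_score
    FROM injury_risk
    ORDER BY injury_risk_score DESC, name
    LIMIT 10
""").fetchall()

print(top_risk)
```

The scoring logic lives in the database rather than in client-side virtual methods, so the sort and the LIMIT run where the data is, and only the ten rows come back.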
On a historical note, the relational model emerged in response to a disastrous state of affairs with the existing network and hierarchical DBMSs of the time, and largely (and rightly) displaced them for all but a small niche of application areas (and even these probably remained largely because SQL failed to deliver on the RM's power). It is deeply ironic that much of the industry is now essentially yearning for the "good old days" of network-theoretical databases, which is essentially what OODBMSs and the current crop of NoSQL databases are going back to. These efforts rightly criticise SQL for its failure to deliver on today's needs, but unfortunately they assume (wrongly, and probably out of pure ignorance) that SQL is a high-fidelity expression of the relational model. Hence they neglect to even consider the relational model itself, which has virtually none of the limitations that have driven so many away from SQL, often towards OODBMSs.
Relational database:
Pros:
tools, developers, resources
products
sites, and very high throughput
Cons:
OODBMS
Pros:
Cons:
I can answer your question with respect to one Object database I know well: ZODB.
The ZODB allows you to persist your data models almost completely transparently. Its usage amounts to something like:
You'll have to look a long time to find that kind of readability with an RDBMS. That is the big pro to using ZODB in a web application.
The big downside, as Marcelo outlines, is lack of powerful querying. That's partly a side-effect of the convenience of the idiom above. The following is completely OK, and everything will get persisted to the database:
However, this kind of flexibility makes it hard to optimize complex queries across different models. Just making a list of all yellow cars with neighbours will require O(n) time unless you roll your own index.
So, it depends what you mean by "regular web development". Many websites don't actually require complex multi-dimensional queries, and searches in linear time are no problem at all. In those cases, using an RDBMS can in my opinion over-complicate your code. I've written many CMS-type applications using solely an object database. Lots of CRUD doesn't particularly come into it; ZODB is very mature, and scales and caches pretty well.
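The O(n) point about yellow cars can be illustrated without any database at all. The sketch below uses plain Python objects rather than ZODB; the Car class, its fields, and the sample data are hypothetical stand-ins, since the answer's original snippets are not available. It contrasts a linear scan over every object with a hand-rolled index keyed by colour.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Car:
    # Hypothetical model: a plate, a colour, and the plates of neighbouring cars.
    plate: str
    colour: str
    neighbours: list = field(default_factory=list)


cars = [
    Car("A1", "yellow", ["B2"]),
    Car("B2", "red", ["A1"]),
    Car("C3", "yellow"),
    Car("D4", "yellow", ["C3"]),
]


def yellow_with_neighbours_scan(all_cars):
    """Linear scan: every stored car is inspected, so the cost is O(n)."""
    return [c.plate for c in all_cars if c.colour == "yellow" and c.neighbours]


# A hand-rolled index: colour -> cars of that colour. In a real application
# it would have to be updated on every insert, delete, and colour change.
by_colour = defaultdict(list)
for car in cars:
    by_colour[car.colour].append(car)


def with_neighbours_indexed(colour):
    """Indexed lookup: only cars of the requested colour are inspected."""
    return [c.plate for c in by_colour.get(colour, []) if c.neighbours]


print(yellow_with_neighbours_scan(cars))
print(with_neighbours_indexed("yellow"))
```

Keeping such an index consistent is the maintenance cost the answer alludes to; an RDBMS takes that work on when you declare an index.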
However, if you're writing a web application that needs to do complex business reporting along the lines of Google Analytics, or some kind of warehouse inventory management system with many terabytes of data, then you're pretty definitely going to want an RDBMS.
To summarise, an object database can give you readability and maintainability at the cost of complex query performance. Of course, readability is a matter of opinion, and you can't ignore the fact that very many more developers know SQL than the various object database dialects.
In regular web development I use Seaside on Gemstone. For most applications, that means I write zero database connection code. It performs, it scales, development is about five times faster.
The only time I will ever use a relational database again for web development is when I have to connect to an existing one.
The advantages:
The disadvantages:
Relational db
Cons
Object DB
Cons
Object-Relational databases (You might have seen UDTs!)
Different approaches (OO, Relational DB or OODB) may be necessary for different applications
References
The advantage of using relational databases for large corpora
Relational Database
OODBMS manifesto
ODMG
Benefits of a relational database
The Object-Oriented Database System Manifesto
Object Oriented Database Systems
Object Relational Databases in DBMS
Completeness Criteria for Object-Relational Database Systems
Comparisons
http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems
http://en.wikipedia.org/wiki/Comparison_of_object_database_management_systems
http://en.wikipedia.org/wiki/Comparison_of_object-relational_database_management_systems
I think that everything depends on the specifics of your issue. (I'm really going out on a limb here, I know.)
All we know is that you want to use the DB for web development, and you'll be doing a lot of operations on the data.
One of the relevant questions to ask yourself is, how important is it that the DB be tightly integrated with the objects you manipulate? The more it's necessary, the more an object-oriented DB recommends itself.
On the other hand, if your data easily lends itself to the relational model, a relational DB might be better.
Think about the operations you'll need to do. Will you need analysis of all sorts of items with different attributes? How much will you need to future-proof your DB?
I should add that if your DB is likely to be fairly small, performance will not be a major issue. But if performance is, in fact, an issue, you have lots of things to worry about beyond just OO vs. relational DBs. (Just to pick one example from the relational DB world, what normalization form should you use? This is an exceedingly important, and complex, question. Are you maintaining an operational system or data warehouse? Do you know ahead of time that certain queries are of paramount importance, or of negligible importance? &c.)
Beyond the question of DB performance and integration with your object model, there are other real-world questions to ask. Do you have disk space, server, or bandwidth limitations? Will you offer only a small number of operations to web users, or might people you don't even know be creating their own queries and edits?
For other, more important, real-world questions, whom will you be working with? What do they already know (or prefer)? And if you don't have domain knowledge yet, maybe you have personal curiosity pushing you in one direction? If you're starting on a personal project, following your own preferences is a better guide to success than worrying over performance before you even start.
If you can answer these and similar questions, even if the answer is "I don't know," you will be able to get much better direction in how to proceed.
In contrast to Marcelo's in-depth and well-thought-out response, I'd say that, based on the phrasing of your question ("regular web development"), my off-the-cuff response is that you'd be hard pressed to find enough pros to justify using an object DB over a traditional relational DB, for the simple reason that there are more resources, developers, tutorials, etc. available for the traditional relational model and for how to use it to achieve "regular web development".
That said, I think that with some of the modern ORMs you get a little of the best of both worlds, in that your underlying data is stored in a well-understood RDBMS (that is likely stable, supported, etc.), but you can still get, through the abstraction, some of the object modeling capabilities that can (arguably) be more suited to developing CRUD applications.
I'll admit that I'm not well versed in the current capabilities of modern OODBMSs; however, unless you are in a field that is completely suited to achieving a perfect object representation of your domain (and you have the object modeling talent to take advantage), I'd stick with an RDBMS for your persistent storage.
Hope that helps!
This pretty much explains the pros and cons:
http://en.wikipedia.org/wiki/Object_database