多值数据库的优点和缺点
我刚刚开始一份新工作,我将不得不使用多值数据库(UniVerse)做大量的工作。我在关系数据库(SqlServer)方面拥有的数据库经验很少,我正在寻找一些有关 MVD 与关系数据库相比的优缺点的公正信息。
办公室里的每个人要么有关系数据库背景(并且讨厌 UniVerse),要么已经在这里工作多年并喜欢它。
I've just started a new job where I'm going to have to do a fair amount of work with a multi-value database (UniVerse). What little database experience I have is in relational databases (SqlServer) and I'm looking for some un-biased information about what the pros and cons of a MVD are compared to relational databases.
Everyone in the office either comes from a relational database background (and hates UniVerse) or has been here for years and loves it.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
首先,免责声明。我与 UniData(UniVerse 的姊妹数据库)合作,偶尔关于它的博客,所以我不能声称完全是公正;不过,我会尝试。
以下是您需要考虑的一些要点:
SQL DB 和多值 DB 之间的一个很大区别是 MVDB 不遵守 1NF。这有优点也有缺点。它可能(而且通常是)被滥用,但有时它也可能非常有用。最大的好处是,这意味着您并不总是需要连接表,它可以使某些查询更快。
与常规 SQL 数据库相比,它以一种全新的方式存储元数据。每个文件/表没有具体的模式。相反,它有 1 个或多个“字典”文件,这些文件由告诉您如何解释数据的记录组成。这不仅允许您存储数据的多种解释(原始/大写/小写、组合字段等),还允许您执行枚举和连接的等效操作。如果做得好,它可以非常强大。
遗憾的是,虽然这个概念有很大的潜力,但 DBMS 的工具集却很缺乏。发展是被驱动的,但极少数的商业案例似乎是由现有和未来的“保持光明”心态驱动的。在此基础上构建的老化软件系统。尽管它有集成工具(例如 .NET 连接器、用于 SQL 查询的 ODBC 接口等),但它们确实存在问题。例如,UniObjects .NET 接口在安全性方面缺乏任何粒度(基本上是全有或全无)。
它不仅仅是一个 DBMS,而且本质上是一个完整的应用程序平台。尽管 UniBasic 不像基于 .NET 的语言那么强大,但它确实胜过 T-SQL,并且可以快速生成业务规则。
First, a disclaimer. I work with UniData (the sister DB of UniVerse) and occasionally blog on it, so I cannot claim to be completely unbiased; I will try, however.
Here are some points of consideration for you:
A big difference between an SQL DB and a Multivalue DB is that the MVDB does not adhere to 1NF. This has pros and cons to it. It can be (and commonly is) abused, but there are times when it can be extremely functional. The biggest benefit is that it means you don't always need a join table which can make certain queries much faster.
It stores meta-data in a completely novel manner when compared to regular SQL DBs. Each file/table does not have a concrete schema. Instead, it has 1 or more 'dictionary' files which is made up of records that tell you how the data should be interpreted. This allows you to have not only store multiple interpretations of the data (raw/uppercase/lowercase, combined fields, etc) but also allows you to perform the equivalent of enums and joins. It can be extremely powerful if done right.
Sadly, although the concept has much potential, the DBMS's toolset is lacking. Development is driven but a extremely small set of businesses cases that appear to be driven by a 'keep-the-lights-on' mentality of existing & aging software systems that were built on it. Although it has tools for integration (such as .NET connectors, ODBC interface for SQL queries, etc) they do have issues. For example, the UniObjects .NET interface lacks any granulation in security (basically all or nothing).
It isn't just a DBMS, but is essentially an entire application platform. Even though UniBasic isn't as powerful as say a .NET based language, it sure beats the pants off T-SQL and has a fast turn around for pumping out business rules.
正如 Dave 所建议的,当您知道要检索的记录的密钥时,MV 数据库的设计效果确实最佳。有些人将它们称为基于记录的数据库系统,而不是 SQL,后者是基于集合的数据库系统。
这实际上取决于您想要做什么、需要如何构建数据以及您可以使用哪些其他工具。我大部分时间都花在 MV 上(主要是 Revelation 产品),我们定期处理超过 10,000,000 个记录集,而且速度还不错。
MV 数据库的优势在于数据是流动的。我们发现大多数客户将其用于法律、医疗和金融产品等应用;关系复杂并且会随着时间的推移而快速且剧烈变化的应用程序。
您可能想了解一下 no SQL 运动,它共享许多相同的概念,尽管 MV 和 no SQL 实际上不是同一件事。
MV 的主要缺点不在于它的结构,而在于它的工具。您通常会发现,由于开发人员基础较小,可用的工具包和帮助也较小。您可能还会发现大多数产品为您提供的嵌入式基本语言缺乏您习惯的对象样式编码。有时,甚至 JavaScript 作为一门语言似乎也具有更多功能。
话虽如此,由于 MV 数据库主要是巨型字符串,因此语言的字符串处理非常出色。它们非常适合直接操作 HTML 和 XML 字符串。
我想我最大的问题是,你有具体的问题吗?我不会说这就像从 Windows 迁移到 Linux 或 Mac,甚至从 Debian 迁移到 Red Hat,但结构和系统不同,因此它们有不同的概念、优势、限制和目的。如果您尝试像 SQL 一样处理 MV 数据库(您可以),您会发现它不是最合适的。设计不当的 MV 数据库可能会令人沮丧。一个设计良好的 MV 数据库可以是一件美丽的事情。
As Dave suggested, MV databases are really designed to work best when you know the key of the record you are trying to retrieve. Some people refer to them as record based database systems, as opposed to SQL, which is a set based database system.
It really depends on what you are trying to do, how the data needs to be structured, and what other tools you have available. I spend most of my time working in MV (Revelation products, mostly) and we handle records sets in the 10,000,000+ regularly, and the speed is fine.
The MV database strength is when the data is fluid. We find that most of our clients use it for applications like legal, medical and financial products; applications where the relationships are complex and can change rapidly and drastically over time.
You might want to look over the no SQL movement, which shares much of the same concepts, even though MV and no SQL really aren't the same thing.
The main downside of MV is less in it's structure, than it's tools. You'll generally find that since the developer base is smaller, the toolkit and help available is smaller. You might also find that the embedded basic language that most of the offerings give you lacks the object style coding you're used to. There are times even JavaScript seems like it has more functionality as a language.
Having said that, since the MV databases are primarily giant strings, the string handling of the languages are excellent. They're great for manipulating HTML and XML string directly.
I suppose the big question I have, is do you have specific questions? I'm not going to open a war saying it's like moving from Windows to Linux or a Mac, or even moving from Debian to Red Hat, but the structures and systems are different, so they have different concepts, strengths, limitations, and purposes. If you try handling a MV database like SQL (which you can), you'll find it's not the best fit. A poorly designed MV database can be an exercise in frustation. A well designed MV database can be a thing of beauty.
MV 数据库以从相对低功耗的服务器中获得出色的性能而闻名。
他们使用链接哈希文件系统,将大多数文件访问操作减少为数学运算,并在记录密钥已知时读取单个磁盘。在正确配置的系统中,只要记录密钥已知,从具有 1,000,000,000 条记录的文件文件中读取所需的时间不会比从具有 1,000 条记录的文件中读取的时间长。
记录键需要是唯一的,并且在可以以算法或编程方式确定记录键的方式设置记录键的应用程序中,数据库访问所涉及的开销可以是最小的。当然,这通常涉及以可能不被视为“关系”的方式访问数据库。
MV databases are know for squeezing awesome performance out of relatively low-powered servers.
They use a link-hash filing system that reduces most file access operations to a mathematical operation and a single disk read when the record key is known. In a properly configured system, reads from a file file with 1,000,000,000 records take no longer than those from a file with 1,000 records as long as the record key is known.
Record keys need to be unique, and in applications where a record key can be set up in such a way that it can be determined algorithmically or programatically, the overhead involved in database access can be minimal. But of course, this usually involves accessing the database in ways that would probably not be considered "relational".
本身没有优点和缺点——它们只是使用不同的方法来存储值。 UniVerse 使用分隔符来分隔值(IIRC 它使用 char(254) 和 char(253) 来分隔字段中的多个值,并使用 char(255) 来分隔数据文件中的实际记录。我可能是错的不过——距离我上次使用它已经过去了 10 多年)。有些人喜欢这种存储数据的方法,就像有些人仍然喜欢老式汽车而不是新型汽车,或者有些人更喜欢使用马车而不是现代机动车辆一样。 (当然这只是我的观点)。
在一个字段中存储多个值意味着您没有 SQLServer 会使用的额外表,您实际上具有一定程度的非规范化。如果使用原生支持 UniVerse 的技术(我们过去使用称为 CueBIC 的窗口系统),那么使用这些多值是非常容易和好的,但是当从 C++ 或 VB 等其他语言连接到数据库时,它就变成了 PITA - 然后你就可以使用这些多值。必须自己读取记录并分离出值。这意味着搜索这些多值也很困难。
但话又说回来,也许自从我上次使用它以来事情已经发生了变化,也许有人编写了一个很好的驱动程序,这样您就可以轻松地从 .Net 平台与 UniVerse 进行交互。我希望为了你,他们有。
There are no pros and cons as such - they just simply use different methods to store values. UniVerse uses a delimiter to separate the values (IIRC it uses char(254) and char(253) to split up the multiple values in a field, and char(255) to separate the actual records in the data file. I might be wrong though - it has been more than 10 years since i last used it). Some people love this method of storing data, just like some people still prefer vintage cars over late model ones, or some people prefer to use a horse and cart instead of a modern motor vehicle. (Of course this is just my opinion).
Storing multiple values within a field means that you don't have the extra table that SQLServer would have used, you effectively have a level of denormalisation. Using these multi values is all easy and good if using a technology that natively goes with UniVerse (we used to use a windowing system called CueBIC), but it becaomes a PITA when connecting to the database from another language like C++ or VB - you then have to read a record and separate out the values yourself. This means it was also difficult to search on those multi values.
But then again, maybe things have moved on since i last used it, maybe someone has written a nice driver so you can easily interface with UniVerse from a .Net platform. I hope for your sake they have.
扩展到文件中的大量项目(记录)效果很好。扩展到记录中的大量值或子值会产生性能问题。应用程序设计需要对限制值和低于几千个阈值的子值列表敏感。
字符串处理非常出色。整数处理也是如此。 MV Basic 语言是松散类型的,因此不要期望编译器执行太多操作。也就是说,由于 MV Basic 源项目就像任何其他数据一样,而编译器只是 DB 环境中的另一个动词,因此编写代码生成器和预编译器是一件轻而易举的事。这是在应用程序下面构建工具层的良好环境。
Scaling to lots of items (records) in files works well. Scaling to lots of values or sub-values within records will create performance issues. Application design needs to be sensitive to limiting value and sub values lists below the several 1000's threshold.
String handling is excellent. As is integer handling. The MV Basic languages are loosely typed so don't expect too much enforcement much from the compiler. That said since MV Basic source items are just like any other data and the compiler is just another verb in the DB environment, writing code generators and pre-compilers is a breeze. It is a good environment for building a tools layer below your application.