Do I want an ORM?
We have an object model being used across three applications. Two programs collect data, another reads it and generates reports. The system is very disconnected, so we cannot have a single database all the programs talk to.
Right now, the programs just use a common library to populate an object model and serialize/deserialize to the disk. Specifically, we're using XML serialization.
There are a couple of problems with this model. 1) XML could be considered wasteful. The files could get large and unwieldy. Honestly, file size isn't a huge concern right now. 2) My biggest concern is memory footprint. The entire file is loaded into an object model, operated on, then saved.
Hopefully I've conveyed my worry: at some point we will run into memory issues with this application at runtime. Enough data will get collected into a single "database" (XML file) that it cannot be loaded into memory all at once.
What I would like is access to my object model backed by file storage instead of memory. I want the changes to the object model to be minimal. When an object is accessed, it comes from the disk, and when it is set, it is saved (automatically, if possible).
We have looked into NHibernate with SQLite, SQL Compact 4.0 and EF 4, and LINQ to XML (briefly). I've also used db4o in the past for caching objects to disk, but that was an unrelated project.
Before I dive in and commit time to learning one of these, I'd like to know if my idea makes sense. Can I have an object model that'll "magically" cache to a storage medium instead of just bloating my memory footprint infinitely? What's the shortest path to get this done, even if it isn't the most elegant?
Are there other technologies that could help me? Memory-mapped files, LINQ to SQL, Lazy<T> (possibly for fetching objects from files only when needed)?
I realize this is an open-ended question. I'm looking for a big-picture response, and details if someone out there has real-world experience doing this. Links would be helpful...
Thanks.
Comments (3)
Yes, ORMs map an object model to a relational model. They do a pretty good job of hiding all the plumbing work that goes into reading and writing data to the database, caching data, managing memory, etc. They are also very capable of caching parts of an object graph and can do things to improve performance, such as lazy loading data.
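To make the lazy-loading point concrete, here is a minimal hand-rolled sketch. It is written in Python with the standard `sqlite3` module purely for brevity; it is not NHibernate or EF code, and the `Session` class, the `session`/`reading` tables, and the helper functions are all hypothetical names invented for illustration. An ORM would generate roughly this behavior for you.

```python
import sqlite3


class Session:
    """Hypothetical parent object whose readings are loaded lazily."""

    def __init__(self, conn, session_id, name):
        self._conn = conn
        self.id = session_id
        self.name = name
        self._readings = None  # child data is not loaded until first access

    @property
    def readings(self):
        # Lazy load: query the database only on first access, then cache.
        if self._readings is None:
            rows = self._conn.execute(
                "SELECT value FROM reading WHERE session_id = ? ORDER BY id",
                (self.id,),
            )
            self._readings = [row[0] for row in rows]
        return self._readings


def load_session(conn, session_id):
    # Loads only the parent row; the readings stay on disk for now.
    row = conn.execute(
        "SELECT id, name FROM session WHERE id = ?", (session_id,)
    ).fetchone()
    return Session(conn, row[0], row[1])


def build_demo_db():
    # In-memory database for the demo; a file path would be used in practice.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE session (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE reading "
        "(id INTEGER PRIMARY KEY, session_id INTEGER, value REAL)"
    )
    conn.execute("INSERT INTO session VALUES (1, 'run-a')")
    conn.executemany(
        "INSERT INTO reading (session_id, value) VALUES (?, ?)",
        [(1, 1.5), (1, 2.5)],
    )
    return conn
```

Calling `load_session(build_demo_db(), 1)` reads a single row; the `reading` table is only queried the first time `.readings` is touched, which is what keeps the memory footprint bounded.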
I've just finished migrating a (mostly "inherited") web application backed by XML files to NHibernate, out of exactly the same concerns. I was also in the same situation of having never used NHibernate before and wanting to learn in the process.
The idea does make sense. You will indeed be able to load into memory just the parts you actually need (as opposed to the whole DB), and you get many more benefits besides.
As for the other things you ask for (minimal changes to your object model, an easy migration from the current application to an ORM-based one), I'm not sure you'll get them so easily.
With ORMs like NHibernate and EF4, model classes are very lightweight: they're basically little more than property containers. Applications based on XML files tend to have more logic directly in the model; you will likely have to move that logic to the data access layer. Redesigning the model and data access layer will likely be the most time-consuming task you will face. I know it was for me.
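As a rough illustration of that split, the sketch below keeps the model class as a bare property container and puts all persistence logic in a separate data access class. It uses Python and the standard `sqlite3` module only to stay short; `Sample` and `SampleRepository` are hypothetical names, and with NHibernate or EF4 the mapping layer would take the place of the hand-written SQL.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """Plain property container: no file or database logic lives here."""
    id: Optional[int]
    label: str
    value: float


class SampleRepository:
    """Hypothetical data access layer that owns all persistence logic."""

    def __init__(self, conn):
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS sample "
            "(id INTEGER PRIMARY KEY, label TEXT, value REAL)"
        )

    def add(self, sample):
        # Insert the object and copy the generated key back onto it.
        cur = self._conn.execute(
            "INSERT INTO sample (label, value) VALUES (?, ?)",
            (sample.label, sample.value),
        )
        self._conn.commit()
        sample.id = cur.lastrowid
        return sample

    def get(self, sample_id):
        # Fetch one object by key; returns None when it does not exist.
        row = self._conn.execute(
            "SELECT id, label, value FROM sample WHERE id = ?", (sample_id,)
        ).fetchone()
        return Sample(*row) if row else None
```

Any logic that used to read or write the XML file from inside the model would move into a class like `SampleRepository`, which is where most of the migration effort tends to go.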
Another thing I infer from your question (you say you can't have all three programs talking to the same DB and you mention SQLite and SQL Compact) is that you're replicating your data among your three applications by copying the file physically. How do you detect changes and how aligned do you need the 3 DBs to be? How do you currently merge changes (in case 2 of the 3 apps you have can write data)?
Depending on how you replicate data, this is something an ORM may or may not help you with.
Edit some more points based on your comments
My suggestion is that you use a document database. RavenDB would give you a lot of benefits. You would be able to store the objects without converting them into a relational model, much like db4o would do.
However, with RavenDB you get rich options for querying the data using Map/Reduce. Another benefit is that it is written in .NET for .NET, so you will enjoy querying in LINQ. It can run as a Windows Service or in IIS. It can also run embedded, which is great for testing purposes. It might be a good idea for your data collecting applications.
RavenDB also supports replication, so that you can store data in one instance and replicate it to another. Maybe that could solve the distributed setup you have.
However, depending on the nature of your distributed setup, I think you would be better off using a service bus. Do you need state in the data collecting applications? If not, just publish a message containing the data on the service bus for another part of the system to consume. It might sound complex, but it's actually not. Have a look at NServiceBus.
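The publish-and-consume idea can be sketched in a few lines. This is only an in-process stand-in using Python's standard `queue` module, not the NServiceBus API; a real service bus would add durable, cross-machine delivery. The `publish` and `consume_all` functions and the message shape are assumptions made for illustration.

```python
import json
import queue


def publish(bus, reading):
    # Collector side: serialize the data and hand it off; keep no state.
    bus.put(json.dumps(reading))


def consume_all(bus):
    # Reporting side: drain whatever messages are currently waiting.
    messages = []
    while True:
        try:
            messages.append(json.loads(bus.get_nowait()))
        except queue.Empty:
            return messages
```

With `bus = queue.Queue()`, the collectors would only ever call `publish`, and the reporting application would call `consume_all` and store the results however it likes, so the collectors never need a local database at all.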