Building a scalable server

Posted on 2024-10-01 00:26:36

So I'm developing a server application that has to store hundreds of thousands (up to a few million in some cases) of classes, serialize them to an SQL database, and load them back several times. It appears that storing that many class objects in a List is what's throwing an out-of-memory exception, I think.

So that brings up these questions:

  • How can I avoid such errors while still handling all of my million or so classes?
  • Are there other problems that can come from having this much data?
  • What other things can I do to make sure my server is fully scalable and can ultimately handle and manage as much data as possible?

The point of this question is that I will need this many classes all running in memory, because I will need to update them continually, in a way that would take longer than I'd like if every update were serialized to an SQL database. Right now I'm using even less memory than I'd ultimately need!

Comments (3)

零時差 2024-10-08 00:26:36

You probably mean objects, not classes ;-)

A scalable processing architecture usually involves the following:

At any point in time, have only a limited number of objects in memory (could be one, could be ten, could be a hundred, but if it has to be "however many I'll need" then you must rethink your design). This ensures that you never run out of memory, because the maximum memory usage is fixed.

All objects are stored in a database. When you need an object that's not in memory, load it from the database. Don't keep it around unless it's part of the previously mentioned short list of objects.
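As a rough illustration of the two points above, here is a minimal sketch in Python with the standard-library `sqlite3` module standing in for the SQL database (the question is about .NET, but the idea carries over). The `items` table, the `BATCH_SIZE` value, and the helper names are assumptions made up for this example. At most one batch of rows is in memory at any time, no matter how many rows the table holds.

```python
import sqlite3

BATCH_SIZE = 100  # assumed upper bound on rows held in memory at once


def make_demo_db(n):
    """Create a throwaway in-memory database with n rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany("INSERT INTO items (value) VALUES (?)",
                     [(i,) for i in range(n)])
    conn.commit()
    return conn


def process_in_batches(conn, update, batch_size=BATCH_SIZE):
    """Apply update() to every row, loading at most batch_size rows at a time."""
    last_id = 0      # assumes ids are positive, as with SQLite's default rowids
    processed = 0
    while True:
        # Keyset pagination: fetch the next slice after the last id we handled.
        batch = conn.execute(
            "SELECT id, value FROM items WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            return processed
        conn.executemany(
            "UPDATE items SET value = ? WHERE id = ?",
            [(update(value), item_id) for item_id, value in batch],
        )
        conn.commit()
        last_id = batch[-1][0]
        processed += len(batch)
```

Calling `process_in_batches(make_demo_db(250), lambda v: v + 1)` walks the table in three slices; peak memory depends on `batch_size`, not on the table size.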

To take advantage of memory not used by your short list, insert a caching layer between your code and the database, so that if you end up fetching the same object a lot, the cost for doing so will be reduced. The cache strategy means your software will only trade memory for speed if there's memory available.
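One possible shape for such a caching layer is a small least-recently-used cache in front of whatever function loads an object from the database. This is a sketch in Python rather than a recommendation of a specific library; `LRUCache`, `load`, and `capacity` are hypothetical names, and a fixed capacity is a simplification of "use memory only if it is available".

```python
from collections import OrderedDict


class LRUCache:
    """Keep the most recently used objects; fall back to load() on a miss."""

    def __init__(self, load, capacity):
        self._load = load            # function that fetches one object by key
        self._capacity = capacity    # maximum number of cached objects
        self._items = OrderedDict()  # ordered from least to most recently used
        self.misses = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        self.misses += 1
        value = self._load(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used
        return value
```

The calling code only ever talks to `cache.get(key)`, so the capacity can be tuned (or the cache removed) without touching the processing logic.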

Try to work using small transactions that read some things, write some things back, then start again. This helps your software resume from where it left off, should a crash or outage happen while it's processing. The data in the database should be enough to start over again from where it left off.
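A sketch of what "small transactions" could look like, again in Python with `sqlite3`. The `jobs` table with its `done` flag is an assumed schema for illustration. Each batch is committed together with its `done` markers, so a restarted worker simply picks up the rows that are still pending.

```python
import sqlite3


def make_jobs_db(n):
    """Create an in-memory database with n pending jobs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload INTEGER, "
        "result INTEGER, done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany("INSERT INTO jobs (payload) VALUES (?)",
                     [(i,) for i in range(n)])
    conn.commit()
    return conn


def run_worker(conn, handle, batch_size=10, max_batches=None):
    """Process pending jobs in small transactions; return how many completed."""
    completed = 0
    batches = 0
    while max_batches is None or batches < max_batches:
        rows = conn.execute(
            "SELECT id, payload FROM jobs WHERE done = 0 ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            break
        # One transaction per batch: commits on success, rolls back on error.
        with conn:
            for job_id, payload in rows:
                conn.execute(
                    "UPDATE jobs SET result = ?, done = 1 WHERE id = ?",
                    (handle(payload), job_id),
                )
        completed += len(rows)
        batches += 1
    return completed
```

Here `max_batches` exists only to simulate an interruption: stopping after one batch and calling `run_worker` again continues with the remaining jobs, because the progress lives in the database and not in the process.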

By working with independent transactions, it's possible to have multiple worker processes (either on the same computer, or on a computing grid) working on the same database. If you can, implementing a transactional worker-based model is great for performance, and makes it much easier to just throw more computers at the problem.
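To let several workers share one database, each worker needs a way to claim a job so that no two workers process the same one. The following is a minimal sketch under assumed names (`jobs` table, `owner` column, `claim_next`). The conditional `UPDATE ... AND owner IS NULL` is what makes the claim safe: if another worker got there first, no row is updated and the caller tries the next job.

```python
import sqlite3


def make_queue_db(n):
    """Create an in-memory database with n unclaimed jobs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, owner TEXT)")
    conn.executemany("INSERT INTO jobs (owner) VALUES (?)", [(None,)] * n)
    conn.commit()
    return conn


def claim_next(conn, worker):
    """Claim one unowned job for `worker`; return its id, or None if none remain."""
    while True:
        row = conn.execute(
            "SELECT id FROM jobs WHERE owner IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        with conn:  # the claim is its own small transaction
            cur = conn.execute(
                "UPDATE jobs SET owner = ? WHERE id = ? AND owner IS NULL",
                (worker, row[0]),
            )
        if cur.rowcount == 1:
            return row[0]
        # Another worker claimed this job in the meantime; look for the next one.
```

In a real deployment each worker process would hold its own connection to a shared database server; the single in-memory connection here only keeps the sketch self-contained.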

£冰雨忧蓝° 2024-10-08 00:26:36

Firstly, the obvious: Make sure you have enough RAM. Analyze your code to find out (approximately) how many objects you will have in memory at the same time and then use a memory profiler. See this related question: How much memory does a C#/.NET object use?

Secondly, if you really need millions of objects, it might make sense to rethink your design. In many cases, something simple like a large, multi-dimensional array might be more efficient (and more predictable memory-wise) than a complex tree of .NET classes. Whether this advice applies to your case or not, I cannot say with the data at hand.
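To make the array idea concrete, here is a small Python sketch (the same principle applies to arrays of primitives in .NET). `Particle` and `ParticleColumns` are invented names for illustration: one stores each record as a separate object, the other keeps the same fields in two flat arrays of doubles, whose size is simply element size times element count.

```python
import sys
from array import array


class Particle:
    """One object per record: convenient, but each instance has its own overhead."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class ParticleColumns:
    """The same data stored as two flat arrays of doubles, one per field."""

    def __init__(self, n):
        self.x = array("d", [0.0]) * n
        self.y = array("d", [0.0]) * n

    def move(self, i, dx, dy):
        # Update record i in place; no per-record object is created.
        self.x[i] += dx
        self.y[i] += dy

    def payload_bytes(self):
        # Predictable footprint: element size times element count.
        return self.x.itemsize * len(self.x) + self.y.itemsize * len(self.y)


def object_bytes(particles):
    """Shallow size of the instances only; attribute storage adds more on top."""
    return sum(sys.getsizeof(p) for p in particles)
```

Comparing `object_bytes([Particle(0.0, 0.0) for _ in range(n)])` with `ParticleColumns(n).payload_bytes()` gives a rough feel for the per-object overhead; the exact numbers depend on the runtime, so measure with a profiler before redesigning anything.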

Thirdly, if it's not necessary to have all this data in memory at the same time, then don't do it. SQL databases are quite fast nowadays (and use smart caching mechanisms), so it might make sense to have only the objects in your list that you currently need (rather than loading everything into memory). In addition, searching through an SQL database index might even be faster than traversing a huge in-memory list.
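As an illustration of the index point, the sketch below uses Python's `sqlite3` with an assumed `players` table and an index named `idx_players_name` (both made up for this example). `EXPLAIN QUERY PLAN` is SQLite-specific; other databases have their own ways to show whether a query uses an index.

```python
import sqlite3


def make_players_db(n):
    """Create an in-memory table with an index on name (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO players (id, name) VALUES (?, ?)",
                     [(i, f"player{i}") for i in range(n)])
    conn.execute("CREATE INDEX idx_players_name ON players (name)")
    conn.commit()
    return conn


def find_by_name(conn, name):
    """Look a row up through the index instead of scanning a list in memory."""
    return conn.execute(
        "SELECT id, name FROM players WHERE name = ?", (name,)
    ).fetchall()


def query_plan(conn, name):
    """Return SQLite's description of how it would run the lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id, name FROM players WHERE name = ?",
        (name,),
    ).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text
```

If the text returned by `query_plan` mentions `idx_players_name`, the lookup is an index search rather than a scan over every row, which is the case where the database can beat a linear pass over a large in-memory list.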

§普罗旺斯的薰衣草 2024-10-08 00:26:36

It may be worth caching the database data that you read most often in something like Memcached. http://en.wikipedia.org/wiki/Memcached
