有数据库支持自动/定时关闭数据库吗?

发布于 2024-08-20 02:01:40 字数 408 浏览 5 评论 0原文

我们希望实现一种应用程序架构,其中磁盘上有大量数据库,每个客户都有一个数据库文件。

当用户请求进入时,数据库将被打开(如果尚未打开)。

经过一段时间没有活动后,数据库将被服务器自动关闭,从而释放数据库服务器资源。

通过这种架构,我们应该能够在磁盘上拥有大量数据库,但任何时候只能将其中的一个子集加载到数据库服务器中。

问题是很少有数据库似乎支持自动关闭数据库的概念。看起来也许 Microsoft SQL Server 允许这样做,但我们正在使用所有开源技术,因此 SQL Server 不是一个选项。

我们将考虑任何免费或开源数据库技术,但我看不到任何支持自动关闭功能的技术。

有谁知道有什么不同吗?

更新:寻找基于 Linux 而不是 Windows 的解决方案。

谢谢

We would like to implement an application architecture in which we have large numbers of databases on the disk, one database file for each customer.

When a user request comes in the database is opened (if it isn't already).

After a certain period of time with no activity, the database is closed automatically by the server, thereby freeing up database server resources.

With this architecture we should be able to have large numbers of databases on the disk, but only a subset of them loaded into the database server at any time.

The problem is that few databases seem to support the concept of automatically closing a database. It appears that maybe MIcrosoft SQL server allows this but we're using all open source technologies so SQL server is not an option.

We'll consider any free or open source database technology at all but I can't see any that support auto close functionality.

Anyone know any different?

UPDATE: looking for a solution based on Linux not Windows.

Thanks

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

恍梦境° 2024-08-27 02:01:41

据我所知,您可能有足够的客户来处理文件句柄短缺的流程。数据库连接池怎么样?

当用户请求到达时,查看该用户的数据库是否打开。如果是,则使用该连接并重置上次访问时间标志。

如果该用户的数据库未打开,则打开该连接,设置上次访问时间,然后使用该连接(如果没有可用的连接,则抛出错误)。另外,分叉一个进程/线程/轻量级进程/无论您在环境中如何称呼它,都会检查:

如果池中有足够数量的未使用连接,则线程已完成

如果没有,则扫描最旧的最后访问的 5%-25% ,或在最后一分钟/小时/天未使用的连接(无论适合您的用户请求模式)并关闭它们,移至未使用的池

确保在未使用的池中保留足够的可用连接来处理传入请求。

I understand that you could have enough customers for a process to get short on file handles. How about a pool of DB connections?

When a user request arrives, see if that user's DB is opened. If so, use the connection and reset a time of last access flag

If that user's DB is not open, open the connection, set the last access time, and use the connection (if there's no available connection, throw an error). Also, fork a process/thread/lightweight process/whatever you call it in your environment that checks:

If the pool has an adequate number of unused connections, the thread is done

If not, scan for the oldest last accessed 5%-25%, or those not used in the last minute/hour/day (whatever is appropriate for your user request pattern) and close them, moving to the unused pool

Make sure you keep enough available connections in the unused pool to handle incoming requests.

暮年 2024-08-27 02:01:41

带有 cron 作业的 mySql。

此外,mySql 的占用空间非常小(与 Sql Server 相比)...一个例子是,它不会占用内存(是的,我知道可以限制 Sql Server 内存使用量)。

mySql 还具有非常高效且有用的连接池。

mySql with a cron job.

In addition, mySql has a very small footprint (compared to Sql Server)... one example, is that it does not hog memory (and yes I know one can cap Sql Server memory usage).

mySql also has connection pooling which is very efficient and useful.

如日中天 2024-08-27 02:01:41

我认为“关闭数据库”是指他们会释放缓存?因为“关闭”磁盘上的实际文件确实没有任何好处,所以它们的资源使用量可以忽略不计。

某些数据库引擎使用操作系统的磁盘缓存。 MySQL 的 MyISAM 存储引擎就是一个例子,但不提供完整性保证,因此排除了它的许多用途。然而,MySQL 的其他引擎(例如 InnoDB)不提供此功能。

PostgreSQL原生使用操作系统的缓存作为二级缓存;虽然一级缓存 (shared_buffers) 仍然始终消耗内存,但即使在性能关键型服务器上,通常也将其设置为仅占内存的 10-25%。其余部分免费用于操作系统级缓存,并在需要时分配给数据库,并在其他应用程序需要时可供使用。

I assume that by "closing databases" you mean that they would free their cache memory? Because there really is no benefit to "closing" actual files on disk, their resource usage is negligible.

Some database engines use the operating system's disk cache. MySQL's MyISAM storage engine is an example, but doesn't offer integrity guarantees, so that excludes many of its uses. MySQL's other engines like InnoDB do not offer this, however.

PostgreSQL natively uses the operating system's cache as a second-level cache; while the first-level cache (shared_buffers) still consumes memory all the time, it's common to set it to only 10-25% of your memory even on performance-critical servers. The rest is free for OS-level caching, and will be allocated to the database when needed, and is available to other applications when they need it.

泪是无色的血 2024-08-27 02:01:40

您确定这确实是一个问题吗?我只是提到,因为开放数据库的成本可能非常小,特别是“开放”很可能包括同步等待数据库的任何未完成的事务,并进行基本的一致性检查(特别是加载存储的几页数据)在磁盘上)。

一旦完成,没有活动,服务器上就没有真正需要维护的数据了。

如果您考虑一下,数据库系统最基本的功能就是使用内存管理数据库页面的缓存。当对一段数据发出请求时,系统会定位其所在的实际页面,并检查 RAM 以查看它是否已加载。如果没有,它将从磁盘加载。

如果您还注意到,大量的数据库“元”数据存储在数据库中。这意味着当系统想要了解任何信息时,它会有效地利用自身来定位信息,特别是数据页缓存子系统。

与任何其他缓存一样,当数据过期且不再需要时,它会被刷新回磁盘并在必要时重新获取。

因此,这意味着一旦数据库被“打开”,维护其状态真正必需的任何信息都可能通过数据缓存子系统进行维护,而对于未使用的数据库,则将其释放回磁盘以为当前流量腾出空间。

这就是为什么我很好奇您是否测试过候选数据库,看看是否遇到了与此相关的问题,或者数据库是否具有“打开数据库”的概念。

当我们作为客户端讨论这个问题时,重点往往是与数据库服务器的连接。但是,一旦所有这些都关闭,我认为系统不会保留有关不活动的特定数据库的任何大量内存数据。

毕竟,数据库中的所有(ALL)数据都“相同”存储,表是表是表,索引是索引是索引,特别是在管理所有数据页的中央服务器上作为一个大数据“汤”。

您可能遇到的唯一问题是,您的数据库是否碰巧专门为每个数据库创建了一个文件,并且该文件保持打开状态。最终你可能会用完文件描述符。

但大多数现代系统不会这样做,它们将所有内容存储在一大堆文件中,无论它们位于什么数据库或模式中(当然,除非您进行或服务器允许的特定表空间分配)。

所以,本质上,我不认为这是一个问题,因为我认为现代数据库并没有真正做出你在内部讨论的那种区别。多个数据库或模式是系统内的逻辑工件,而不是技术实现,并且所有数据页最终都位于同一缓存中并使用相同的资源,无论它们来自什么模式、数据库、表或索引。

我会对您选择的数据库进行一些测试,看看这是否是一个问题。例如,您可以创建 1M 数据库,尽可能降低数据库的内存,然后开始循环浏览它们,一次打开您认为合适的数量(10、100、1000,等等),看看是否可以你有任何问题。

最后,我不“知道”任何特定数据库的任何内容,这只是对历史数据库如何实现的直觉。

Have you ascertained that this is really an issue? I only mention that because the cost of an open database is likely quite small, specifically "opening" most likely consists of syncing up any outstanding transactions waiting for the database, and doing a basic consistency check (notably loading a couple of pages of data stored on the disk).

Once that is done, without activity, there isn't really a lot of data to maintain on the server.

If you think about it, the most base functionality of a DB system is to manage the caching of database pages with memory. When a request is made for a piece of data, the system locates the actual page its on, and checks RAM to see if it's loaded. If not, it loads it from disk.

If you also notice, the vast bulk of DB "meta" data is stored -- in the database. That means that when the system wants to know anything, it effectively uses itself to locate the information, specifically the data page caching subsystem.

Like any other cache, as data is expired and no longer needed, it gets flush back out to disk and refetched when necessary.

So, this implies that once a database has been "opened", any information really necessary to maintain its state will likely be maintained via the data cache subsystem, and for unused database, released back to disk to make room for current traffic.

This is why I'm curious if you've tested your candidate DBs to see if you run in to issues about this, or if the database even has the concept of "opening a database".

When, as a client, we discuss this, the focus tends to be on connections to the database server. But once all of those are closed, I do not think that the system is going to retain any significant amount of in memory data about a particular database that is inactive.

After all, all (ALL) of the data in a database is stored "the same", a table is a table is a table, an index is an index is an index, particularly on central server where all of the data pages are managed as a single big "soup" of data.

The only issue you might run in to is if your database happens to create a file specifically for each database, and that file remains open. Eventually you may run out of file descriptors.

But most of the modern systems don't do that, they store everything in a big blob of files irrespective of what database or schema they're in (barring specific table space allocations that you make or the server allows, of course).

So, essentially, I don't think this is a problem, as I don't think the modern databases really make the kind of distinctions that you're talking about internally. That multiple databases or schema are a logical artifact within the system, not a technical implementation, and that all of the data pages end up in the same cache and use the same resources regardless of what schema, database, table, or index they come from.

I would do some testing on your database of choice to see if this is a problem. For example, you could create 1M databases, lower the memory as much as possible for the database, and then just starting cycling through them, opening however many at a time you feel is appropriate (10, 100, 1000, whatever) and see if you have any problems.

Finally, I don't "know" any of this for any specific database, it's just gut instinct on how historically databases are implemented.

挽手叙旧 2024-08-27 02:01:40

我有这个想法,假设您使用的是 Windows:

  1. 您的数据库将作为服务运行,并且每个客户端都有自己唯一的服务名称。
  2. 您编写一个批处理文件,它将启动/停止该服务。
  3. 每当您需要时,都会从您的服务器调用批处理文件。

I have this idea and assuming you are using Windows:

  1. Your database will be running as a service, and each client has his own unique service name.
  2. You write a batch file, that will start/stop that service.
  3. the batch file will be called from your server whenever you want.
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文