强制 PostgreSQL 释放分配的内存
我的 Postgres 在负载下达到了最大允许内存 (500MB) 并运行 14 个进程。一旦加载结束,Postgres 仍然保留分配的内存并运行 14 个进程。由于我有 Apache 和 Tomcat 在同一台机器上运行,我想 Postgresql 释放分配的内存。是否可以?
谢谢!
My Postgres hits the max permitted memory under the load (500MB) and runs 14 processes. Once load is over, Postgres still keeps the allocated memory and runs 14 processes. Since I have Apache and Tomcat running on the same machine, I'd like to Postgresql release the allocated memory. Is it possible?
Thanks!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
基本上,您需要通过两种方式查看系统内存,特别是 Postgres 内存。
首先是运行应用程序所需的内存。这本质上是所有应用程序在生产环境中运行时的静态内存负载。如果您没有足够的内存来满足所需的负载,那么您就没有足够的内存。现代系统可以在“危机”时期利用交换,但仅此而已。简而言之,如果您使用交换,您就会遇到内存危机,您应该希望它很快就会消失。
一旦拥有应用程序所需的基本内存,所有剩余的系统内存基本上都专用于磁盘缓存。
对于托管 Postgres 的系统,您有两种磁盘缓存。您有内核文件系统缓存,并且有 Postgres 内部缓存。
Postgres 内部缓存不会被释放。事情不是这样的。您在配置中告诉它,它可以使用 XXX 数量的 RAM 来实现其目的,并且它将保留它。在这种状态下,Postgres 不关心系统上还有什么。
如果缓存是内核缓存,并且文件系统活动突然激增(不是 Postgres),那么内核将缓存最近的页面并刷新旧页面。内核缓存将看到整个系统,而 Postgres 只能看到数据库活动。
所以。
在 Postgres 的世界观中,内核缓存正在与其缓冲区缓存竞争。考虑这种情况。 Postgres 请求一块磁盘。内核获取该块并缓存它。同时,Postgres 从内核中获取该块,并将其缓存。现在,该块已被冗余缓存。如果内核发现该缓存内存块有更好的用途,它会将其从 Postgres 块中清除,并加载新的。同时,Postgres 会将该块保留在其内部缓存中。
如果您有一台专用的 Postgres 机器,则没有理由拥有太多的内核缓存。由于所有磁盘 I/O 都是 Postgres I/O,因此内核缓存是冗余的,并且效率低于 Postgres 缓存。当 Postgres 缓存一个块时,它必须封送字节,更新其内部结构以及它执行的其他操作。一旦缓存,它就不再需要执行任何这些操作。因此,通过这种方式,Postgres 缓存的块比内核缓存的同一块更有效,因为将块从内核缓存移动到 Postgres 会产生一些费用。
然而,如果你有一台混合用途的机器,那么内核和 Postges 缓存将不得不进行斗争。如果您有一个足够小的数据库,可以将大部分日常操作数据放入 RAM,那么您应该有足够的缓冲区空间来在 Postgres 中处理该数据,以便从内存中处理大部分操作。这样,Postgres 将加载一次正常的“繁忙”页面,并将它们缓存起来,然后再也不会向内核请求它们。一旦完成,内核就可以使用其缓冲区高速缓存来处理所有辅助的其他系统请求。
在另一个极端,您为 Postgres 提供很少的专用缓冲区缓存,并让它仅依赖于内核缓存。这样,每次从内核缓存中读取每个块的成本都会稍高一些,但它比每次从磁盘读取它要便宜得多。这样内核就可以判断哪些进程更值得缓存关注。
在实践中,为 Postgres 确定一个良好的可操作、稳定的状态,然后就这样。任何活动峰值(比如对报告或其他内容进行大表扫描)都将通过内核缓存来缓解,当峰值结束时,内核可以恢复该内存以供其他用途。
因此,最重要的是,Postgres 不会归还任何内存。只给予你所能奉献的一切。
Basically, you need to look at your system memory, particularly Postgres memory, in two ways.
The first, is the memory necessary to run the applications. This is the, essentially, static memory load for all of the applications while they're running in production. If you don't have enough memory at the desired load, then you don't have enough memory. Modern systems can utilize swap in times of "crisis", but that's all. Simply, if you're using swap, you have a memory crisis and you should hope it goes away soon.
Once you have the base memory necessary for the applications, all of the remaining system memory is basically dedicated to disk cache.
With a system hosting Postgres, you have two kinds of disk cache. You have the kernels file system cache, and you have Postgres internal cache.
Postgres internal cache is not going to be released. That's not how it works. You told it in your configuration that it could use XXX amount of RAM for its purposes, and it's going to keep it. In this state, Postgres doesn't care about what else is on the system.
If the cache was kernel cache, and you had some sudden spike in file system activity that is NOT Postgres, then the kernel will cache the recent pages and flush the older pages. The kernel cache will see the entire system, whereas Postgres only sees DB activity.
So.
In the Postgres world view, kernel cache is competing with it's buffer cache. Consider this scenario. Postgres asks for a block of disk. The kernel grabs that block and caches it. At the same time, Postgres grabs takes that block from the kernel, and also caches it. Now, that block is redundantly cached. If the kernel find a better use for that block of cache memory, it will flush it out the Postgres block, and load the new one. Meanwhile, Postgres will retain that block in its internal cache.
If you have a dedicated Postgres machine, there's little reason to have much kernel cache. Since all of the disk I/O will be Postgres I/O, the kernel cache is redundant, and less efficient, than the Postgres cache. When Postgres caches a block, it must marshal the bytes in, update it's internal structures and whatever else it does. Once cached, it no longer has to do any of that. So in that way, a block cached by Postgres is more efficient than the same block cached by the kernel, as there is some expense in moving the block from the kernel cache in to Postgres.
However, if you have a mixed use machine, then the kernel and Postges caches are going have to fight it out. If you have a small enough DB that fits most of the daily operational data into RAM, then you should have enough buffer space to handle that within Postgres so most of its operations from memory. In that way, Postgres will load up its normal, "busy" pages once, and cache them, and then never ask the kernel for them again. Once that is done, the kernel can use its buffer cache to handle all of the ancillary other system requests.
At the other extreme, you give Postgres very little dedicated buffer cache, and have it rely solely on the kernels cache. In this way, each block is a little more expensive coming from the kernel cache each time, but it's far far cheaper than reading it from disk each time. In this way, the kernel can judge which processes are more worthy of cache attention.
In practice, decide on a good operational, steady state for Postgres and leave it at that. Any spikes in activity (say a big table scan for a report or whatever) will be mitigated by the kernels cache, and when that spike is over, the kernel can recover that memory for other uses.
So, bottom line, Postgres isn't going to give back any memory. Only give as much as you afford to dedicate to it.
除了一些常见的进程之外,PostgreSQL 为每个连接运行一个进程,因此您可能应该检查一下您的应用程序(或连接池)设置 - 连接在一段时间内不使用时是否被释放。
但我不认为这会在内存方面对你有很大帮助,因为 PG 分配的内存的主要部分是共享缓冲区,并且这部分内存永远不会被释放。
Except a few common processes, PostgreSQL runs process per connection so you should probably take a look on your application (or connection pooler) settings - if the connections are released if not used for some time.
But I don't expect this will help you a lot in the context of memory cause major part of PG allocated memory are shared buffers and this part of memory will never be released.
Postgres 使用大量内存通常是一件好事,因为它需要更少的磁盘空间。如果它可以将索引和最流行的数据块存储在内存中,您就可以实现非常高的缓存命中率,这可以转化为快速的性能,并减少 Java 应用程序服务器在等待事务完成时阻塞请求的麻烦。
所以一般来说你会希望 postgresql 的缓存能够快速预热并保留在内存中。
话虽如此,您当然还需要内存来运行其他应用程序。您可以通过减少连接池上的最大连接数来减少每个请求的内存消耗。这反过来又会限制 postgres 为处理请求和存储基于会话的临时数据而启动的进程数量。
当然,您可以调低共享缓冲区以获得较小的缓存。
另一种方法是减少 Apache 和 tomcat 允许处理的并发请求数。如果您的响应时间很快,您可能能够通过较少的并发请求并在 apache 中对请求进行排队来获得更高的吞吐量。然后,您可以限制整个堆栈中的内存消耗,并通过对负载设置上限,使请求处理时间保持相对恒定。
Postgres using lots of memory usually is a good thing, because it will have to go to disk less. If it can store the indexes and most popular data block in memory you can achieve very high cache hit ratios which translates in snappy performance and less trouble in the Java app server with requests which block while waiting for a transaction to complete.
So in general you would want for postgresql's cache to be warmed up quickly and stay in memory.
That being said, you of course also need memory for your other apps to run. You can reduce the per request memory consumption by reducing the maximum number of connections on your connection pool. This in turn will limit the number of processes postgres fires up to handle the requests and store the session based temporary data.
You can of course tune down the shared buffers for a smaller cache.
An alternative aproach is to reduce the number of concurrent requests the Apache and tomcat are allowed to handle. If you have snappy response times you might be able to get a higher throughput with less concurrent requests and queuing the requests in the apache. You then limit the memory consumption in the complete stack, and by placing a cap on the load, you'll keep the request handling time relatively constant.
在理想的设置中,您永远不会将数据库放在 Web 服务器和应用程序服务器上。就这一点而言,这些系统中的每一个都可能位于自己的服务器上。
In an ideal setup you would never have your database sitting on your web server and application server. For that matter each one of those systems would probably be on their own server.