PostgreSQL persistent connections consume large amounts of memory

Posted 2024-07-29 06:11:49

I have a C++ application which is making use of PostgreSQL 8.3 on Windows. We use the libpq interface.

We have a multi-threaded app where each thread opens a connection and keeps using it without ever calling PQfinish.

We notice that postgres.exe's memory consumption goes up with every query (especially SELECT statements), as high as 1.3 GB. Eventually, postgres.exe crashes and forces our program to create a new connection.

Has anyone experienced this problem before?

EDIT: shared_buffers is currently set to 128MB in our conf file.

EDIT2: the workaround we have in place right now is to call PQfinish after every transaction. But this slows down our processing a bit, since establishing a connection every time is quite slow.
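
For illustration, the per-transaction workaround looks roughly like this; a minimal sketch using libpq's synchronous API, where the conninfo string and the query are placeholders:

```cpp
// Minimal sketch of the EDIT2 workaround: open a fresh connection, do the
// work, then PQfinish() so the server-side backend exits and releases its
// memory. conninfo and sql are placeholders.
#include <libpq-fe.h>
#include <cstdio>

bool run_once(const char *conninfo, const char *sql)
{
    PGconn *conn = PQconnectdb(conninfo);       // the slow part: full handshake
    if (PQstatus(conn) != CONNECTION_OK) {
        std::fprintf(stderr, "connect failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return false;
    }

    PGresult *res = PQexec(conn, sql);
    bool ok = PQresultStatus(res) == PGRES_TUPLES_OK
           || PQresultStatus(res) == PGRES_COMMAND_OK;
    PQclear(res);

    PQfinish(conn);                             // backend exits here
    return ok;
}
```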

Comments (3)

不必了 2024-08-05 06:11:49

In PostgreSQL, each connection has a dedicated backend. This backend not only holds connection and session state, but is also an execution engine. Backends aren't particularly cheap to leave lying around, and they cost both memory and synchronization overhead even when idle.

There's an optimum number of actively working backends for any given Pg server on any given workload, where adding more working backends slows things down rather than speeding them up. You want to find that point and limit the number of backends to around that level. Unfortunately there's no magic recipe for this; it mostly involves benchmarking on your hardware and with your workload.

If you need more connections than that, you should use a proxy or pooling system that allows you to separate "connection state" from "execution engine". Two popular choices are PgBouncer and PgPool-II. You can maintain lightweight connections from your app to the proxy/pooler and let it schedule the workload to keep the database server working at its optimum load. If too many queries come in, some wait before being executed instead of competing for resources and slowing down all queries on the server.
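
From the application side this is usually a small change: point the libpq conninfo at the pooler instead of at PostgreSQL itself. A hedged sketch, assuming PgBouncer on its default port 6432; host, database name, and credentials are placeholders:

```cpp
// Hypothetical sketch: the only client-side change for PgBouncer is the
// connection target. The pooler multiplexes many lightweight client
// connections onto a small, fixed set of real backends.
#include <libpq-fe.h>

PGconn *connect_via_pooler()
{
    // Direct to PostgreSQL:  "host=dbhost port=5432 dbname=app user=app"
    // Through PgBouncer:     same parameters, pooler's address and port.
    return PQconnectdb("host=poolerhost port=6432 dbname=app user=app");
}
```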

See the PostgreSQL wiki.

Note that if your workload is read-mostly, and especially if it contains items that don't change often and for which you can determine a reliable cache-invalidation scheme, you can also use memcached or Redis to reduce your database workload. This requires application changes. PostgreSQL's LISTEN and NOTIFY will help you do sane cache invalidation.
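
As a rough sketch of what LISTEN/NOTIFY-driven invalidation can look like in libpq; the channel name cache_invalidation and the cache object are assumptions, and note that the NOTIFY payload read via n->extra requires a server newer than 8.3, so on 8.3 the key would have to travel some other way:

```cpp
// Hedged sketch of LISTEN/NOTIFY cache invalidation. Writers would run
// something like: NOTIFY cache_invalidation, 'stale-key' (payloads need
// PostgreSQL >= 9.0; on 8.3 encode the key differently).
#include <libpq-fe.h>
#include <cstdio>

void watch_invalidations(PGconn *conn /*, Cache &cache */)
{
    PQclear(PQexec(conn, "LISTEN cache_invalidation"));  // subscribe once

    for (;;) {
        PQconsumeInput(conn);                  // pull any pending server data
        while (PGnotify *n = PQnotifies(conn)) {
            std::printf("invalidate: channel=%s key=%s\n",
                        n->relname, n->extra);
            // cache.erase(n->extra);          // drop the stale entry
            PQfreemem(n);
        }
        // A real loop would block in select()/poll() on PQsocket(conn)
        // instead of spinning.
    }
}
```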

Many database engines have some separation of execution engine and connection state built into the core database engine's design. Sybase ASE certainly does, and I think Oracle does too, but I'm not too sure about the latter. Unfortunately, because of PostgreSQL's one-process-per-connection model, it's not easy to pass work around between backends, which makes it harder for PostgreSQL to do this natively, so most people use a proxy or pool.

I strongly recommend that you read PostgreSQL High Performance. I don't have any relationship/affiliation with Greg Smith or the publisher*, I just think it's great and will be very useful if you're concerned about your DB's performance.


* ... well, I didn't when I wrote this. I work for the same company now.

找个人就嫁了吧 2024-08-05 06:11:49

The memory usage is not necessarily a problem. PostgreSQL uses shared memory for some caching, and this memory does not count towards the process's memory usage until it's actually used. The more the process is used, the larger the portion of the shared buffers that will be active in its address space.

If you have a large value for shared_buffers, this will happen. If it is too large, the process can run out of address space and crash, yes.

笨死的猪 2024-08-05 06:11:49

The problem is probably that you don't close the transaction.
In PostgreSQL, even if you only run SELECTs with no DML, they still execute inside a transaction, which needs to be rolled back.
Adding a rollback at the end of each transaction will reduce your memory problem.
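
In libpq terms that advice looks something like this; a minimal sketch, assuming the application opens transactions explicitly, with a placeholder query:

```cpp
// Hypothetical sketch: end every transaction explicitly, even read-only
// ones, so the backend releases what it holds for the open transaction.
#include <libpq-fe.h>

void read_only_work(PGconn *conn)
{
    PQclear(PQexec(conn, "BEGIN"));

    PGresult *res = PQexec(conn, "SELECT id, name FROM items");  // placeholder
    // ... consume rows with PQntuples()/PQgetvalue() ...
    PQclear(res);

    PQclear(PQexec(conn, "ROLLBACK"));  // nothing to persist for a SELECT
}
```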
