Postgres query through ODBC an order of magnitude slower?
We have an application which gets some data from a PostgreSQL 9.0.3 database through the psqlodbc 09.00.0200 driver in the following way:
1) SQLExecDirect with START TRANSACTION
2) SQLExecDirect with
DECLARE foo SCROLL CURSOR FOR
SELECT table.alotofcolumns
FROM table
ORDER BY name2, id LIMIT 10000
3) SQLPrepare with
SELECT table.alotofcolumns, l01.languagedescription
FROM fetchcur('foo', ?, ?) table (column definitions)
LEFT OUTER JOIN languagetable l01 ON (l01.lang = 'EN'
AND l01.type = 'some type'
AND l01.grp = 'some group'
AND l01.key = table.somecolumn)
[~20 more LEFT OUTER JOINS in the same style, but for another column]
4) SQLExecute with param1 set to SQL_FETCH_RELATIVE and param2 set to 1
5) SQLExecute with param1 set to SQL_FETCH_RELATIVE and param2 set to -1
6) SQLExecute with param1 set to SQL_FETCH_RELATIVE and param2 set to 0
7) deallocate all, close cursor, end transaction
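For illustration, the sequence above looks roughly like the following ODBC C sketch. It assumes an already connected hdbc, abbreviates the SQL exactly as the question does (the elided column lists and joins are placeholders), and omits error checking and result fetching:

#include <sql.h>
#include <sqlext.h>

void run_sequence(SQLHDBC hdbc)
{
    SQLHSTMT hstmt;
    SQLINTEGER fetch_type = SQL_FETCH_RELATIVE;  /* = 6, passed as a plain integer parameter */
    SQLINTEGER offset = 1;

    SQLAllocHandle(SQL_HANDLE_STMT, hdbc, &hstmt);

    /* 1) and 2): open the transaction and declare the scroll cursor */
    SQLExecDirect(hstmt, (SQLCHAR *) "START TRANSACTION", SQL_NTS);
    SQLExecDirect(hstmt, (SQLCHAR *)
        "DECLARE foo SCROLL CURSOR FOR "
        "SELECT alotofcolumns FROM sometable ORDER BY name2, id LIMIT 10000",
        SQL_NTS);

    /* 3): prepare the join query around fetchcur('foo', ?, ?) */
    SQLPrepare(hstmt, (SQLCHAR *)
        "SELECT t.alotofcolumns, l01.languagedescription "
        "FROM fetchcur('foo', ?, ?) AS t (...) "        /* column definitions elided */
        "LEFT OUTER JOIN languagetable l01 ON (...)",   /* ~20 more joins elided */
        SQL_NTS);
    SQLBindParameter(hstmt, 1, SQL_PARAM_INPUT, SQL_C_SLONG, SQL_INTEGER,
                     0, 0, &fetch_type, 0, NULL);
    SQLBindParameter(hstmt, 2, SQL_PARAM_INPUT, SQL_C_SLONG, SQL_INTEGER,
                     0, 0, &offset, 0, NULL);

    /* 4)-6): re-executed on every user request with a new offset */
    offset = 1;  SQLExecute(hstmt); SQLFreeStmt(hstmt, SQL_CLOSE);
    offset = -1; SQLExecute(hstmt); SQLFreeStmt(hstmt, SQL_CLOSE);
    offset = 0;  SQLExecute(hstmt); SQLFreeStmt(hstmt, SQL_CLOSE);

    /* 7): deallocate, close the cursor, end the transaction */
    SQLExecDirect(hstmt, (SQLCHAR *) "CLOSE foo", SQL_NTS);
    SQLEndTran(SQL_HANDLE_DBC, hdbc, SQL_COMMIT);
    SQLFreeHandle(SQL_HANDLE_STMT, hstmt);
}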
The function fetchcur executes FETCH RELATIVE $3 IN $1 INTO rec, where rec is a record, and returns that record (a hypothetical reconstruction is sketched after the timings below). Steps 4-6 are executed again and again on user request, and many more queries are executed in this transaction in the meantime. It can also take quite some time before another user request is made. Usually the queries take about this long:
4) ~ 130 ms
5) ~ 115 ms
6) ~ 110 ms
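For context, the fetchcur function described above might look roughly like this hypothetical PL/pgSQL reconstruction; its real definition isn't shown in the question, so the cursor-name handling and the unused fetchtype parameter are assumptions:

CREATE OR REPLACE FUNCTION fetchcur(curname text, fetchtype integer, amount integer)
RETURNS record AS $$
DECLARE
    rec record;
BEGIN
    -- Move the already-open cursor and return the row it lands on.
    -- fetchtype (e.g. SQL_FETCH_RELATIVE = 6) is assumed to pick the
    -- FETCH direction; only the RELATIVE case from the question is shown.
    EXECUTE 'FETCH RELATIVE ' || amount || ' IN ' || quote_ident(curname)
        INTO rec;
    RETURN rec;
END;
$$ LANGUAGE plpgsql;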
Those timings are normally too slow for a fast user experience. So I tried the same statements from the psql command line with \timing on. For steps 3-6 I used these statements:
3)
PREPARE p_foo (INTEGER, INTEGER) AS
SELECT table.alotofcolumns, l01.languagedescription
FROM fetchcur('foo', $1, $2) table (column definitions)
LEFT OUTER JOIN languagetable l01 ON (l01.lang = 'EN'
AND l01.type = 'some type'
AND l01.grp = 'some group'
AND l01.key = table.somecolumn)
[~20 more LEFT OUTER JOINS in the same style, but for another column]
4-6)
EXPLAIN ANALYZE EXECUTE p_foo (6, x);
The first EXECUTE took 89 ms, but then it went down to ~7 ms. Even if I wait several minutes between executes, it stays under 10 ms per query. So where could the additional 100 ms have gone? The application and database are on the same system, so network delay shouldn't be an issue. Each LEFT OUTER JOIN only returns one row, and only one column of that result is added to the result set. So the result is one row with ~200 columns, mostly of type VARCHAR and INTEGER. That shouldn't be enough data to take around 100 ms to transfer on the same machine. Any hints would be helpful.
The machine has 2 GB of RAM and parameters are set to:
shared_buffers = 512MB
effective_cache_size = 256MB
work_mem = 16MB
maintenance_work_mem = 256MB
temp_buffers = 8MB
wal_buffers = 1MB
EDIT: I just found out how to create a mylog from psqlodbc, but I can't find timing values in there.
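(For reference, the mylog trace is typically switched on in the DSN section of odbc.ini; Debug and CommLog are the usual psqlodbc options, though the exact names can vary between driver versions. The DSN name below is a placeholder:)

# odbc.ini
[PostgreSQL-DSN]
Driver = PostgreSQL
# write the internal mylog trace
Debug = 1
# write the client/server communication log
CommLog = 1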
EDIT2: I could also add a timestamp to every line. This really shows that it takes >100 ms until a response is received by the psqlodbc driver. So I tried again with psql, adding the option -h 127.0.0.1 to make sure the connection also goes over TCP/IP.
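The repeat test amounted to something like this (the database name is a placeholder, and the cursor and prepared statement from above are assumed to already be set up):

$ psql -h 127.0.0.1 mydb
mydb=# \timing
Timing is on.
mydb=# EXPLAIN ANALYZE EXECUTE p_foo (6, 1);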
The result with psql is <10 ms. How is this possible? Below is a mylog excerpt for one SQLExecute; note the ~106 ms gap before the poll at 00:07:51.133:
00:07:51.026 [3086550720][SQLExecute]
00:07:51.026 [3086550720]PGAPI_Execute: entering...1
00:07:51.026 [3086550720]PGAPI_Execute: clear errors...
00:07:51.026 [3086550720]prepareParameters was not called, prepare state:3
00:07:51.026 [3086550720]SC_recycle_statement: self= 0x943b1e8
00:07:51.026 [3086550720]PDATA_free_params: ENTER, self=0x943b38c
00:07:51.026 [3086550720]PDATA_free_params: EXIT
00:07:51.026 [3086550720]Exec_with_parameters_resolved: copying statement params: trans_status=6, len=10128, stmt='SELECT [..]'
00:07:51.026 [3086550720]ResolveOneParam: from(fcType)=-15, to(fSqlType)=4(23)
00:07:51.026 [3086550720]cvt_null_date_string=0 pgtype=23 buf=(nil)
00:07:51.026 [3086550720]ResolveOneParam: from(fcType)=4, to(fSqlType)=4(23)
00:07:51.026 [3086550720]cvt_null_date_string=0 pgtype=23 buf=(nil)
00:07:51.026 [3086550720] stmt_with_params = 'SELECT [..]'
00:07:51.027 [3086550720]about to begin SC_execute
00:07:51.027 [3086550720] Sending SELECT statement on stmt=0x943b1e8, cursor_name='SQL_CUR0x943b1e8' qflag=0,1
00:07:51.027 [3086550720]CC_send_query: conn=0x9424668, query='SELECT [..]'
00:07:51.027 [3086550720]CC_send_query: conn=0x9424668, query='SAVEPOINT _EXEC_SVP_0x943b1e8'
00:07:51.027 [3086550720]send_query: done sending query 35bytes flushed
00:07:51.027 [3086550720]in QR_Constructor
00:07:51.027 [3086550720]exit QR_Constructor
00:07:51.027 [3086550720]read 21, global_socket_buffersize=4096
00:07:51.027 [3086550720]send_query: got id = 'C'
00:07:51.027 [3086550720]send_query: ok - 'C' - SAVEPOINT
00:07:51.027 [3086550720]send_query: setting cmdbuffer = 'SAVEPOINT'
00:07:51.027 [3086550720]send_query: returning res = 0x8781c90
00:07:51.027 [3086550720]send_query: got id = 'Z'
00:07:51.027 [3086550720]QResult: enter DESTRUCTOR
00:07:51.027 [3086550720]QResult: in QR_close_result
00:07:51.027 [3086550720]QResult: free memory in, fcount=0
00:07:51.027 [3086550720]QResult: free memory out
00:07:51.027 [3086550720]QResult: enter DESTRUCTOR
00:07:51.027 [3086550720]QResult: exit close_result
00:07:51.027 [3086550720]QResult: exit DESTRUCTOR
00:07:51.027 [3086550720]send_query: done sending query 1942bytes flushed
00:07:51.027 [3086550720]in QR_Constructor
00:07:51.027 [3086550720]exit QR_Constructor
00:07:51.027 [3086550720]read -1, global_socket_buffersize=4096
00:07:51.027 [3086550720]Lasterror=11
00:07:51.133 [3086550720]!!! poll ret=1 revents=1
00:07:51.133 [3086550720]read 4096, global_socket_buffersize=4096
00:07:51.133 [3086550720]send_query: got id = 'T'
00:07:51.133 [3086550720]QR_fetch_tuples: cursor = '', self->cursor=(nil)
00:07:51.133 [3086550720]num_fields = 166
00:07:51.133 [3086550720]READING ATTTYPMOD
00:07:51.133 [3086550720]CI_read_fields: fieldname='id', adtid=23, adtsize=4, atttypmod=-1 (rel,att)=(0,0)
[last two lines repeated for the other columns]
00:07:51.138 [3086550720]QR_fetch_tuples: past CI_read_fields: num_fields = 166
00:07:51.138 [3086550720]MALLOC: tuple_size = 100, size = 132800
00:07:51.138 [3086550720]QR_next_tuple: inTuples = true, falling through: fcount = 0, fetch_number = 0
00:07:51.139 [3086550720]qresult: len=3, buffer='282'
[last line repeated for the other columns]
00:07:51.140 [3086550720]end of tuple list -- setting inUse to false: this = 0x87807e8 SELECT 1
00:07:51.140 [3086550720]_QR_next_tuple: 'C' fetch_total = 1 & this_fetch = 1
00:07:51.140 [3086550720]QR_next_tuple: backend_rows < CACHE_SIZE: brows = 0, cache_size = 0
00:07:51.140 [3086550720]QR_next_tuple: reached eof now
00:07:51.140 [3086550720]send_query: got id = 'Z'
00:07:51.140 [3086550720] done sending the query:
00:07:51.140 [3086550720]extend_column_bindings: entering ... self=0x943b270, bindings_allocated=166, num_columns=166
00:07:51.140 [3086550720]exit extend_column_bindings=0x9469500
00:07:51.140 [3086550720]SC_set_Result(943b1e8, 87807e8)
00:07:51.140 [3086550720]QResult: enter DESTRUCTOR
00:07:51.140 [3086550720]retval=0
00:07:51.140 [3086550720]CC_send_query: conn=0x9424668, query='RELEASE _EXEC_SVP_0x943b1e8'
00:07:51.140 [3086550720]send_query: done sending query 33bytes flushed
00:07:51.140 [3086550720]in QR_Constructor
00:07:51.140 [3086550720]exit QR_Constructor
00:07:51.140 [3086550720]read -1, global_socket_buffersize=4096
00:07:51.140 [3086550720]Lasterror=11
00:07:51.140 [3086550720]!!! poll ret=1 revents=1
00:07:51.140 [3086550720]read 19, global_socket_buffersize=4096
00:07:51.140 [3086550720]send_query: got id = 'C'
00:07:51.140 [3086550720]send_query: ok - 'C' - RELEASE
00:07:51.140 [3086550720]send_query: setting cmdbuffer = 'RELEASE'
00:07:51.140 [3086550720]send_query: returning res = 0x877cd30
00:07:51.140 [3086550720]send_query: got id = 'Z'
00:07:51.140 [3086550720]QResult: enter DESTRUCTOR
00:07:51.140 [3086550720]QResult: in QR_close_result
00:07:51.140 [3086550720]QResult: free memory in, fcount=0
00:07:51.140 [3086550720]QResult: free memory out
00:07:51.140 [3086550720]QResult: enter DESTRUCTOR
00:07:51.140 [3086550720]QResult: exit close_result
00:07:51.140 [3086550720]QResult: exit DESTRUCTOR
EDIT3: I realized I didn't use the same query from the mylog in the psql test before. It seems psqlodbc doesn't use a PREPARE for SQLPrepare and SQLExecute: it inserts the parameter values and sends the full query. As araqnid suggested, I set the log_duration parameter to 0 and compared the results from the PostgreSQL log with those from the app and psql. The results are as follows:
                            psql/app    pglog
query executed from app:    110 ms      70 ms
psql with PREPARE/EXECUTE:   10 ms       5 ms
psql with full SELECT:       85 ms      70 ms
So how should those values be interpreted? It seems most of the time is spent sending the full query to the database (10000 chars) and generating an execution plan. If that is true, changing the calls to SQLPrepare and SQLExecute to explicit PREPARE/EXECUTE statements executed over SQLExecDirect could solve the problem (sketched below). Any objections?
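A sketch of that proposed workaround, with the statement text abbreviated as in the question and error handling omitted:

#include <stdio.h>
#include <sql.h>
#include <sqlext.h>

/* prepare once via plain SQL instead of SQLPrepare */
void prepare_once(SQLHSTMT hstmt)
{
    SQLExecDirect(hstmt, (SQLCHAR *)
        "PREPARE p_foo (INTEGER, INTEGER) AS "
        "SELECT t.alotofcolumns, l01.languagedescription "
        "FROM fetchcur('foo', $1, $2) AS t (...) "      /* column definitions elided */
        "LEFT OUTER JOIN languagetable l01 ON (...)",   /* ~20 more joins elided */
        SQL_NTS);
}

/* then, once per user request, execute with the actual offset substituted in */
void fetch_relative(SQLHSTMT hstmt, int amount)
{
    char sql[64];
    snprintf(sql, sizeof(sql), "EXECUTE p_foo (6, %d)", amount); /* 6 = SQL_FETCH_RELATIVE */
    SQLExecDirect(hstmt, (SQLCHAR *) sql, SQL_NTS);
    /* ... read the row ... */
    SQLFreeStmt(hstmt, SQL_CLOSE);
}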
2 Answers
I finally found the problem, and it is that psqlodbc's SQLPrepare/SQLExecute by default doesn't execute a PREPARE/EXECUTE. The driver itself builds the SELECT and sends it.
The solution is to add
UseServerSidePrepare=1
to the odbc.ini or to the ConnectionString for SQLDriverConnect. The total execution time for one query measured from the application dropped from >100 ms to 5-10 ms.
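For anyone applying this, the setting can go either in the DSN entry in odbc.ini or directly in the SQLDriverConnect connection string; a sketch, where the DSN name, server, and database are placeholders:

# odbc.ini
[PostgreSQL-DSN]
Driver = PostgreSQL
UseServerSidePrepare = 1

or, as part of the connection string:

DRIVER={PostgreSQL};SERVER=127.0.0.1;DATABASE=mydb;UseServerSidePrepare=1;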
I don't think the timings between psql and your program are comparable.
Maybe I'm missing something, but in psql you are only preparing the statements, never really fetching the data. EXPLAIN PLAN is not sending data either, so the time difference is most probably the network traffic needed to send all the rows from the server to the client.
The only way to reduce this time is to either get a faster network or to select fewer columns. Do you really need all the columns that are included in "alotofcolumns"?