值得使用临时表来复制行的开销吗?
当面临复制数据库中的记录并仅修改少数值的任务时,我倾向于使用临时表来避免写出所有未更改的列。有谁知道这会如何影响大型系统的性能?
一个简单的例子(这也说明了为什么我更喜欢临时表方法):
假设我有一个包含 50 列 col1
...col50
的表 mytbl
代码>.我想插入一条新记录,该记录与 col5 = 'Some Value'
的行完全相同,但 col45
将设置为 'Some other value '
。
方法 1
CREATE GLOBAL TEMPORARY TABLE tmptbl AS
SELECT * FROM myschema.mytbl;
INSERT INTO tmptbl
(SELECT *
FROM myschema.mytbl
WHERE mytbl.col5 = 'Some Value');
UPDATE tmptbl
SET col45 = 'Some Other Value';
INSERT INTO myschema.mytbl
(SELECT * FROM tmptbl);
DROP TABLE tmptbl;
方法 2
INSERT INTO myschema.mytbl (col1,
col2,
col3,
col4,
col5,
col6,
col7,
col8,
col9,
col10,
col11,
col12,
col13,
col14,
col15,
col16,
col17,
col18,
col19,
col20,
col21,
col22,
col23,
col24,
col25,
col26,
col27,
col28,
col29,
col30,
col31,
col32,
col33,
col34,
col35,
col36,
col37,
col38,
col39,
col40,
col41,
col42,
col43,
col44,
col45,
col46,
col47,
col48,
col49,
col50)
SELECT col1,
col2,
col3,
col4,
col5,
col6,
col7,
col8,
col9,
col10,
col11,
col12,
col13,
col14,
col15,
col16,
col17,
col81,
col19,
col20,
col21,
col22,
col23,
col24,
col25,
col26,
col27,
col28,
col29,
col30,
col31,
col32,
col33,
col34,
col35,
col36,
col37,
col38,
col39,
col40,
col41,
col42,
col43,
col44,
'Some Other Value',
col46,
col47,
col48,
col49,
col50
FROM myschema.mytbl
WHERE col5 = 'Some Value';
创建/删除临时表会带来多少开销?例如,如果这是生产规模系统上日常流程的一部分,那么额外的开销会很明显吗?我意识到这在很大程度上取决于系统的具体情况,但总体思路会很棒。
When faced with the task of copying a record in a database and modifying just a handful of values, I tend to use a temporary table to avoid writing out all of the unchanged columns. Does anyone know how this would affect performance on a large scale system?
A quick example (which also shows why I prefer the temporary table method):
Let's say I have a table mytbl
with 50 columns col1
...col50
. I want to insert a new record that is an exact duplicate of the row where col5 = 'Some Value'
except that col45
will be set to 'Some other value'
.
Method 1
CREATE GLOBAL TEMPORARY TABLE tmptbl AS
SELECT * FROM myschema.mytbl;
INSERT INTO tmptbl
(SELECT *
FROM myschema.mytbl
WHERE mytbl.col5 = 'Some Value');
UPDATE tmptbl
SET col45 = 'Some Other Value';
INSERT INTO myschema.mytbl
(SELECT * FROM tmptbl);
DROP TABLE tmptbl;
Method 2
INSERT INTO myschema.mytbl (col1,
col2,
col3,
col4,
col5,
col6,
col7,
col8,
col9,
col10,
col11,
col12,
col13,
col14,
col15,
col16,
col17,
col18,
col19,
col20,
col21,
col22,
col23,
col24,
col25,
col26,
col27,
col28,
col29,
col30,
col31,
col32,
col33,
col34,
col35,
col36,
col37,
col38,
col39,
col40,
col41,
col42,
col43,
col44,
col45,
col46,
col47,
col48,
col49,
col50)
SELECT col1,
col2,
col3,
col4,
col5,
col6,
col7,
col8,
col9,
col10,
col11,
col12,
col13,
col14,
col15,
col16,
col17,
col81,
col19,
col20,
col21,
col22,
col23,
col24,
col25,
col26,
col27,
col28,
col29,
col30,
col31,
col32,
col33,
col34,
col35,
col36,
col37,
col38,
col39,
col40,
col41,
col42,
col43,
col44,
'Some Other Value',
col46,
col47,
col48,
col49,
col50
FROM myschema.mytbl
WHERE col5 = 'Some Value';
How much overhead does creating/dropping a temporary table introduce? If, for example, this were part of a daily process on a production sized system, would the additional overhead be noticeable? I realize that depends a lot on the specifics of the system, but a general idea would be fantastic.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
首先,这不是临时表在 Oracle 中的工作方式。全局临时表是永久结构,只是其中的数据是临时的。因此不需要为每个操作创建和删除表。这只会增加开销。
其次,临时表合理的情况实际上很少见。在大多数情况下,一个简单的变量就足够了。此语法(插入单行)从 9iR2 开始有效:
请注意,此公式中不需要括号。
First of all, this is not how temporary tables work in Oracle. Global temporary tables are permanent structures, it's just the data in them which is temporary. So there is no need to create and drop the table for each operation. That just adds overhead.
Secondly, the situations where temporary tables are justified are actually quite rare. In most scenarios a simple variable will suffice. This syntax - which inserts a single row - is valid from 9iR2 onwards:
Note that parentheses are not required in this formulation.
我不明白使用临时表如何可能更快。
即使不考虑临时表创建(和销毁)的开销(两项操作),您也将执行三个单独的数据库操作,而不是一个组合操作。此外,您正在执行两个涉及对非临时表进行单独锁定的操作,这会导致比单个组合操作更少的潜在并发性。诚然,这在某种程度上是特定于获取实际锁的数据库的。
最终,我认为方法 1 的开销比方法 2 多得多,甚至没有考虑临时表创建/销毁的不可忽略的成本。
I can't see how this could possibly be faster with a temporary table.
Even with the overhead of temporary table creation (and destruction) aside (two operations), you are performing three separate DB operations instead of one combined one. Also, you are performing two operations that involve individual locking on your non-temporary table, leading to less potential concurrency than a single combined op. Granted, this is somewhat database specific with regard to which actual locks are acquired.
Ultimately, I think method 1 has much more overhead than method 2, without even considering the certainly non-negligible cost of temporary table creation / destruction.
我担心这可能是过早优化的情况。看来这种方法“解决”了一个可能存在也可能不存在的问题,但肯定会增加系统相当程度的复杂性。我的建议是
分享并享受。
I fear this may be a case of premature optimization. It appears that this approach "solves" a problem which may or may not exist, but certainly adds a fair degree of complexity to the system. My suggestion is
Share and enjoy.
我假设您的方法的主要目标是避免必须将所有 50 列写出两次,我认为这有一个缺点,如果将另一列添加到表中,您必须进行更改。其他人对源表结构、临时表开销和过早优化等发表了评论。这实际上不是你的问题,但基于这个假设,我可以看到第三种可能有效的方法:
I'm assuming the main goal of your approach is to avoid having to write out all 50 columns twice, which I suppose has the disadvantage of you having to make change if another column is added to the table. Others have commented on the source table structure, temporary table overhead and premature optimisation etc. This wasn't really your question, but on that assumption, I can see a third method that might work:
你可以这样做。
插入 myschema.mytbl
(SELECT * FROM myschema.mytbl WHERE mytbl.col5 = '某些值');
更新 myschema.mytbl
SET col45 = '其他一些值'
WHERE col5 = '某个值' AND rownum<=1;
这甚至比使用临时表更短,并且没有临时表的缺点。当然,如果您的实际问题比您所展示的更复杂,则这可能不起作用。
You could do it this way.
INSERT INTO myschema.mytbl
(SELECT * FROM myschema.mytbl WHERE mytbl.col5 = 'Some Value');
UPDATE myschema.mytbl
SET col45 = 'Some Other Value'
WHERE col5 = 'Some Value' AND rownum<=1;
This is even shorter than using a temporary table and has none of the drawbacks of temporary tables. Of course if your actual problem is more complex than what you have shown this may not work.
如果您的过程将处理少于 250 行,您可能需要考虑使用表变量而不是临时表,因为表变量使用内存而不是物理写入到 tempdb。
If your procedure will be working with fewer than 250 rows, you may want to consider using a table variable instead of a temp table, as a table variable uses memory instead of physically writing out to tempdb.