How to use Oracle without transactions?
MySQL has a special table type, MyISAM, that does not support transactions. Does Oracle have something like this? I'd like to create a write-only database (for logging) that needs to be very fast (it will store a lot of data) and doesn't need transactions.
8 Answers
Transactions are key to SQL database operations. They are certainly fundamental in Oracle. There is no way to write permanently to Oracle tables without issuing a commit, and lo! there is the transaction.
Oracle allows us to specify tables to be NOLOGGING, which means they don't generate redo log. This is only meant to be used for bulk loading (using the INSERT /*+ APPEND */ hint), with the advice to switch back to LOGGING and take a backup as soon as possible, because data which is not logged is not recoverable. And if you don't want to recover it, why bother writing it in the first place?

An alternative approach is to batch up the writes in memory, and then use bulk inserts to write them. This is pretty fast.
Here is a simple log table and a proof of concept package:
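A minimal sketch of what that might look like; the exact column layout, the package name, and the buffer size of 100 entries are assumptions:

    CREATE TABLE log_table (
        ts    TIMESTAMP,
        text  VARCHAR2(4000)
    );

    CREATE OR REPLACE PACKAGE log_pkg AS
        PROCEDURE init;
        PROCEDURE write (p_text IN VARCHAR2);
        PROCEDURE flush;
    END log_pkg;
    /

    CREATE OR REPLACE PACKAGE BODY log_pkg AS

        TYPE log_tab IS TABLE OF log_table%ROWTYPE;
        g_buffer     log_tab := log_tab();
        c_buffer_max CONSTANT PLS_INTEGER := 100;  -- assumed batch size

        PROCEDURE init IS
        BEGIN
            g_buffer := log_tab();
        END init;

        PROCEDURE flush IS
            PRAGMA AUTONOMOUS_TRANSACTION;
        BEGIN
            DBMS_OUTPUT.PUT_LINE('flushing ' || g_buffer.COUNT || ' records');
            FORALL i IN 1 .. g_buffer.COUNT
                INSERT INTO log_table VALUES g_buffer(i);
            COMMIT;  -- commits only this autonomous transaction
            init;    -- reset the buffer
        END flush;

        PROCEDURE write (p_text IN VARCHAR2) IS
        BEGIN
            g_buffer.EXTEND;
            g_buffer(g_buffer.COUNT).ts   := SYSTIMESTAMP;
            g_buffer(g_buffer.COUNT).text := p_text;
            IF g_buffer.COUNT >= c_buffer_max THEN
                flush;
            END IF;
        END write;

    END log_pkg;
    /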
The log records are kept in a PL/SQL collection, which is an in-memory structure with session scope. The INIT() procedure initialises the buffer. The FLUSH() procedure writes the contents of the buffer to LOG_TABLE. The WRITE() procedure inserts an entry into the buffer and, if the buffer has the requisite number of entries, calls FLUSH().

The write to the log table uses the AUTONOMOUS_TRANSACTION pragma, so the COMMIT occurs without affecting the surrounding transaction which triggered the flush.
The call to DBMS_OUTPUT.PUT_LINE() is there to make it easy to monitor progress. So, let's see how fast it goes....
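A driver along these lines produces the result below (the 3456-row count is from the original test; the harness itself is an assumption):

    DECLARE
        l_start PLS_INTEGER;
    BEGIN
        log_pkg.init;
        l_start := DBMS_UTILITY.GET_TIME;
        FOR i IN 1 .. 3456 LOOP
            log_pkg.write('test message ' || i);
        END LOOP;
        log_pkg.flush;  -- round up the loose records
        DBMS_OUTPUT.PUT_LINE('elapsed: '
            || (DBMS_UTILITY.GET_TIME - l_start) / 100 || ' secs');
    END;
    /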
Hmmm, 3456 records in 0.12 seconds, that's not too shabby. The main problem with this approach is the need to flush the buffer to round up loose records; this is a pain e.g. at the end of a session. If something causes the server to crash, unflushed records are lost. The other problem with doing stuff in-memory is that it consumes memory (durrrr), so we cannot make the cache too big.
For the sake of comparison I added a procedure to the package which inserts a single record directly into LOG_TABLE each time it is called, again using an autonomous transaction:
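A sketch of what that procedure might look like in the package body (the name WRITE_SINGLE and the column names are assumptions):

    PROCEDURE write_single (p_text IN VARCHAR2) IS
        PRAGMA AUTONOMOUS_TRANSACTION;
    BEGIN
        INSERT INTO log_table (ts, text)
        VALUES (SYSTIMESTAMP, p_text);
        COMMIT;  -- commits just this one log record
    END write_single;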
Here are its timings:
Wall clock timings are notoriously unreliable, but the batched approach is 2-3 times faster than the single record approach. Even so, I could execute well over three thousand discrete transactions in less than half a second, on a (far from top-of-the-range) laptop. So, the question is: how much of a bottleneck is logging?
To avoid any misunderstanding:
@JulesLt posted his answer while I was working on my PoC. Although there are similarities in our views, I think the differences in the suggested workarounds merit a separate post.
My timings suggest something slightly different. Replacing a COMMIT per write with a single COMMIT at the end roughly halves the elapsed time. Still slower than the bulked approach, but not by nearly as much.
The key thing here is benchmarking. My proof of concept is running about six times faster than Jules's test (my table has one index). There are all sorts of reasons why this might be - machine spec, database version (I'm using Oracle 11gR1), table structure, etc. In other words, YMMV.
So the lesson is: first decide what the right thing to do is for your application, then benchmark it in your environment. Only consider a different approach if your benchmark suggests a serious performance problem. Knuth's warning about premature optimization applies.
The closest may be creating a NOLOGGING tablespace, and using the NOLOGGING option when creating the table within it - although this may apply only to bulk operations (i.e. the INSERT /*+ APPEND */ hint is required).
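In outline, something like this (all the names here are placeholders, and conventional-path inserts into the table would still generate redo):

    CREATE TABLESPACE logging_ts
        DATAFILE 'logging_ts01.dbf' SIZE 1G
        NOLOGGING;

    CREATE TABLE log_table (
        ts    TIMESTAMP,
        text  VARCHAR2(4000)
    ) TABLESPACE logging_ts NOLOGGING;

    -- direct-path (bulk) insert, which is what actually skips the redo
    INSERT /*+ APPEND */ INTO log_table
    SELECT SYSTIMESTAMP, message
    FROM   staging_messages;  -- hypothetical source of batched log lines
    COMMIT;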
That removes the REDO, at the cost of a loss of integrity and data if the DB goes down.
I don't know that it would actually be 'faster', and you should also consider concurrency (if you have many processes trying to write to the same table, you may be better off using transactions that write pending updates into the redo logs than having them all try to update the 'real' table).
I've not really investigated NOLOGGING though - I've rarely hit a point where the application bottleneck has been INSERT speed - when I have, it has been the cost of updating indexes rather than the table that has been the issue.
I've just done a quick test on my quite underpowered development DB (with REDO enabled). Using an autonomous transaction for each row - so each row starts a new transaction and ends with a commit - I can write/commit over 1000 rows to an indexed log table in 1 second, vs about 0.875 seconds doing 1000 inserts without commits.
Doing an insert of 1000 rows in a single hit using a bulk operation is a small fraction of a second - so if you can possibly bulk the logs up, do it.
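For illustration, the two cases might look like this (log_autonomous is a stand-in for a procedure using PRAGMA AUTONOMOUS_TRANSACTION, and the table columns are assumptions):

    -- ~1 second: one autonomous transaction (insert + commit) per row
    BEGIN
        FOR i IN 1 .. 1000 LOOP
            log_autonomous('row ' || i);  -- hypothetical autonomous logger
        END LOOP;
    END;
    /

    -- a small fraction of a second: buffer the rows, then one bulk insert
    DECLARE
        TYPE msg_tab IS TABLE OF VARCHAR2(100);
        l_msgs msg_tab := msg_tab();
    BEGIN
        FOR i IN 1 .. 1000 LOOP
            l_msgs.EXTEND;
            l_msgs(l_msgs.COUNT) := 'row ' || i;
        END LOOP;
        FORALL i IN 1 .. l_msgs.COUNT
            INSERT INTO log_table (ts, text)
            VALUES (SYSTIMESTAMP, l_msgs(i));
        COMMIT;
    END;
    /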
Some other thoughts:
Would an external table do the job - i.e. write to a log file which you then mount as an external table in Oracle when/if you need to read from it?
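For example, roughly (the directory path, delimiter, and file name are placeholders):

    CREATE DIRECTORY log_dir AS '/var/log/myapp';

    CREATE TABLE app_log_ext (
        ts    VARCHAR2(30),
        text  VARCHAR2(4000)
    )
    ORGANIZATION EXTERNAL (
        TYPE ORACLE_LOADER
        DEFAULT DIRECTORY log_dir
        ACCESS PARAMETERS (
            RECORDS DELIMITED BY NEWLINE
            FIELDS TERMINATED BY '|'
        )
        LOCATION ('app.log')
    );

    -- the file is then readable with plain SQL, but only when you need it
    SELECT * FROM app_log_ext;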
My experience is that logging is best done to a flat file. My view is that logs are generally not particularly important - UNTIL something goes wrong, at which time they become critical. Because of this I don't want transactional control of my logging. If I need to roll back a transaction because there's a problem I really don't want the logging data rolled back because that's what I'm going to use to help identify what the problem was. In addition, how do you log that there's a problem connecting to the database if the log is stored in the database that you can't connect to?
Share and enjoy.
"that needs to be very fast"
There is a trade-off (sometimes) between fast and recoverable.
In Oracle recoverability is achieved by the redo log file. Every time you commit, the database 'log writer' does a synchronous call to write outstanding changes to the file. By synchronous, I mean it waits for the file system to confirm that the write has been successful before saying that the commit has been successful.

If you are doing lots of logging (especially from lots of sessions at once) with each line in the log file being committed independently (e.g. an autonomous transaction), then this could well be a bottleneck.

If you don't need that level of recoverability (i.e. you can afford to lose the last few rows of your log data in the event of a major failure), look at the NOWAIT option of COMMIT.
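For example (asynchronous commit arrived in 10g Release 2, if I remember rightly):

    COMMIT WRITE BATCH NOWAIT;

    -- or make it the default for the session:
    ALTER SESSION SET COMMIT_WRITE = 'BATCH,NOWAIT';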
If you can't afford to lose anything, then your best bet is REALLY fast storage (which may be a battery backed cache).
What I would do for a similar case, would be to write the logs to a file (appending to a file is probably the fastest way to store your logs) and then have a process batch insert those logs into the DB at regular intervals. Unless of course inserting directly in the DB is fast enough ... but you'll have to test ...
This seems like a solution in search of a problem.
Have you benchmarked the performance? Is Oracle fast enough for you as it is? Transaction management is built into the way that Oracle works, and trying to work around it seems like you are creating work for yourself.

You seem to have identified transaction management as a problem without really knowing whether there is a problem. What happens later on when you have multiple writers on the table? Or readers blocking writers?
PRAGMA AUTONOMOUS_TRANSACTION
This will allow you to log and commit your log without impacting the surrounding transaction(s). Logging is one of the very few acceptable use cases for autonomous transactions. It does what it says: it allows you to write a PL/SQL function/procedure that can commit its work without impacting the transaction it may or may not already be participating in. It is 'autonomous'.
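In outline (table and procedure names are placeholders):

    CREATE OR REPLACE PROCEDURE write_log (p_text IN VARCHAR2) IS
        PRAGMA AUTONOMOUS_TRANSACTION;
    BEGIN
        INSERT INTO log_table (ts, text)
        VALUES (SYSTIMESTAMP, p_text);
        COMMIT;  -- commits the log entry only, not the caller's transaction
    END write_log;
    /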
Oracle docs:
Another option if you need extremely high performance is to consider Oracle's TimesTen In-Memory database: http://www.oracle.com/technology/products/timesten/index.html