Inserting thousands of contact entries using applyBatch is slow
I'm developing an application where I need to insert lots of Contact entries. At the current time it's approx. 600 contacts with a total of 6000 phone numbers. The biggest contact has 1800 phone numbers.
Status as of today is that I have created a custom Account to hold the contacts, so the user can choose to see the contacts in the Contacts view.
But the insertion of the contacts is painfully slow. I insert the contacts using ContentResolver.applyBatch. I've tried different sizes for the ContentProviderOperation list (100, 200, 400), but the total running time is approx. the same. Inserting all the contacts and numbers takes about 30 minutes!
Most issues I've found regarding slow insertion in SQLite bring up transactions. But since I use the ContentResolver.applyBatch method I don't control this, and I would assume that the ContentResolver takes care of transaction management for me.
So, to my question: Am I doing something wrong, or is there anything I can do to speed this up?
Anders
Edit:
@jcwenger:
Oh, I see. Good explanation!
So then I will have to first insert into the raw_contacts table, and then the data table with the names and numbers. What I'll lose is the back reference to the raw_id that I use in applyBatch.
So I'll have to get all the IDs of the newly inserted raw_contacts rows to use as foreign keys in the data table?
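For later readers, here is a sketch of how that two-phase approach can look. The SOURCE_ID-based matching and the helper names below are my own illustration, not something prescribed in the thread: bulk-insert the raw contacts tagged with your own SOURCE_ID, query the generated _IDs back, then bulk-insert the Data rows using those IDs as foreign keys.

    import android.content.ContentResolver;
    import android.content.ContentValues;
    import android.database.Cursor;
    import android.provider.ContactsContract.CommonDataKinds.Phone;
    import android.provider.ContactsContract.Data;
    import android.provider.ContactsContract.RawContacts;

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Sketch: phase 1 bulk-inserts raw contacts tagged with our own SOURCE_ID,
    // phase 2 queries the generated _IDs back and uses them as raw_contact_id
    // for the phone number rows.
    static void insertAll(ContentResolver resolver, String accountName, String accountType,
                          Map<String, List<String>> numbersBySourceId) {
        // Phase 1: one ContentValues per raw contact.
        ContentValues[] rawRows = new ContentValues[numbersBySourceId.size()];
        int i = 0;
        for (String sourceId : numbersBySourceId.keySet()) {
            ContentValues v = new ContentValues();
            v.put(RawContacts.ACCOUNT_NAME, accountName);
            v.put(RawContacts.ACCOUNT_TYPE, accountType);
            v.put(RawContacts.SOURCE_ID, sourceId); // our own key, queried back below
            rawRows[i++] = v;
        }
        resolver.bulkInsert(RawContacts.CONTENT_URI, rawRows);

        // Map SOURCE_ID -> generated raw contact _ID.
        Map<String, Long> idBySourceId = new HashMap<String, Long>();
        Cursor c = resolver.query(RawContacts.CONTENT_URI,
                new String[] { RawContacts._ID, RawContacts.SOURCE_ID },
                RawContacts.ACCOUNT_TYPE + " = ?", new String[] { accountType }, null);
        try {
            while (c.moveToNext()) {
                idBySourceId.put(c.getString(1), c.getLong(0));
            }
        } finally {
            c.close();
        }

        // Phase 2: data rows, with the queried IDs as foreign keys.
        for (Map.Entry<String, List<String>> e : numbersBySourceId.entrySet()) {
            List<String> numbers = e.getValue();
            ContentValues[] dataRows = new ContentValues[numbers.size()];
            for (int j = 0; j < dataRows.length; j++) {
                ContentValues v = new ContentValues();
                v.put(Data.RAW_CONTACT_ID, idBySourceId.get(e.getKey()));
                v.put(Data.MIMETYPE, Phone.CONTENT_ITEM_TYPE);
                v.put(Phone.NUMBER, numbers.get(j));
                dataRows[j] = v;
            }
            resolver.bulkInsert(Data.CONTENT_URI, dataRows);
        }
    }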
Just for the information of the readers of this thread:
I was facing a performance issue even when using applyBatch().
In my case there were database triggers written on one of the tables.
I deleted the triggers of the table and, boom, now my app inserts rows blazingly fast.
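For anyone who wants to check for the same problem: SQLite keeps its triggers in sqlite_master, so a sketch like the following (the method name is mine, and db is whatever SQLiteDatabase your provider uses) will list and drop the triggers on a table.

    import android.database.Cursor;
    import android.database.sqlite.SQLiteDatabase;

    // Sketch: find every trigger defined on a table and drop it.
    static void dropTriggers(SQLiteDatabase db, String tableName) {
        Cursor c = db.rawQuery(
                "SELECT name FROM sqlite_master WHERE type = 'trigger' AND tbl_name = ?",
                new String[] { tableName });
        try {
            while (c.moveToNext()) {
                db.execSQL("DROP TRIGGER IF EXISTS " + c.getString(0));
            }
        } finally {
            c.close();
        }
    }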
I have a basic solution for you: use "yield points" in the batch operation.
The flip side of using batched operations is that a large batch may lock up the database for a long time preventing other applications from accessing data and potentially causing ANRs ("Application Not Responding" dialogs.)
To avoid such lockups of the database, make sure to insert "yield points" in the batch. A yield point indicates to the content provider that before executing the next operation it can commit the changes that have already been made, yield to other requests, open another transaction and continue processing operations.
A yield point will not automatically commit the transaction, but only if there is another request waiting on the database. Normally a sync adapter should insert a yield point at the beginning of each raw contact operation sequence in the batch. See withYieldAllowed(boolean).
I hope it may be useful for you.
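As a concrete sketch of what that looks like (the account values and the allContacts structure below are placeholders): each raw contact's operation sequence starts with a yield-allowed insert, followed by its data rows back-referencing it.

    import android.content.ContentProviderOperation;
    import android.content.ContentResolver;
    import android.content.OperationApplicationException;
    import android.os.RemoteException;
    import android.provider.ContactsContract;
    import android.provider.ContactsContract.CommonDataKinds.Phone;
    import android.provider.ContactsContract.Data;
    import android.provider.ContactsContract.RawContacts;

    import java.util.ArrayList;
    import java.util.List;

    // Sketch: one yield point at the start of each raw contact's sequence.
    static void insertWithYieldPoints(ContentResolver resolver, List<List<String>> allContacts)
            throws RemoteException, OperationApplicationException {
        ArrayList<ContentProviderOperation> ops = new ArrayList<ContentProviderOperation>();
        for (List<String> phoneNumbers : allContacts) {
            int rawContactInsertIndex = ops.size();
            ops.add(ContentProviderOperation.newInsert(RawContacts.CONTENT_URI)
                    .withValue(RawContacts.ACCOUNT_TYPE, "com.example.account") // placeholder
                    .withValue(RawContacts.ACCOUNT_NAME, "example")             // placeholder
                    .withYieldAllowed(true) // the provider may commit and yield here
                    .build());
            for (String number : phoneNumbers) {
                ops.add(ContentProviderOperation.newInsert(Data.CONTENT_URI)
                        .withValueBackReference(Data.RAW_CONTACT_ID, rawContactInsertIndex)
                        .withValue(Data.MIMETYPE, Phone.CONTENT_ITEM_TYPE)
                        .withValue(Phone.NUMBER, number)
                        .build());
            }
        }
        resolver.applyBatch(ContactsContract.AUTHORITY, ops);
    }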
Here is an example of inserting the same amount of data within 30 seconds.
The log:
02-17 12:48:45.496 2073-2090/com.vayosoft.mlab D/BatchInsertionTest﹕ Starting batch insertion on: Wed Feb 17 12:48:45 GMT+02:00 2016
02-17 12:49:16.446 2073-2090/com.vayosoft.mlab D/BatchInsertionTest﹕ End of batch insertion, elapsed: 00:30.951
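As an illustration of the kind of timing harness that produces logs in this shape (the helper below is a sketch, not the poster's actual code):

    import android.content.ContentResolver;
    import android.content.ContentValues;
    import android.net.Uri;
    import android.os.SystemClock;
    import android.util.Log;

    import java.util.Date;
    import java.util.concurrent.TimeUnit;

    // Sketch: log wall-clock duration around a single bulkInsert call.
    static void timedBulkInsert(ContentResolver resolver, Uri uri, ContentValues[] rows) {
        final String tag = "BatchInsertionTest";
        Log.d(tag, "Starting batch insertion on: " + new Date());
        long start = SystemClock.elapsedRealtime();

        resolver.bulkInsert(uri, rows);

        long elapsed = SystemClock.elapsedRealtime() - start;
        Log.d(tag, String.format("End of batch insertion, elapsed: %02d:%06.3f",
                TimeUnit.MILLISECONDS.toMinutes(elapsed),
                (elapsed % TimeUnit.MINUTES.toMillis(1)) / 1000.0));
    }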
bulkInsert: For those interested, here is the code that I was able to experiment with. Pay attention to how we can avoid some allocations for int/long/floats :) This could save more time.
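The experiment code itself is not shown above, but the allocation remark maps onto a well-known trick, sketched here with an illustrative table and columns: inside an overridden bulkInsert, compile one SQLiteStatement and bind primitives per row. bindLong/bindDouble take long/double directly, so each row avoids the autoboxing that ContentValues imposes.

    import android.database.sqlite.SQLiteDatabase;
    import android.database.sqlite.SQLiteStatement;

    // Sketch: one compiled statement, primitive binds, a single transaction.
    static int bulkInsertPhones(SQLiteDatabase db, long[] rawContactIds, String[] numbers) {
        SQLiteStatement stmt = db.compileStatement(
                "INSERT INTO data (raw_contact_id, number) VALUES (?, ?)"); // illustrative schema
        db.beginTransaction();
        try {
            for (int i = 0; i < numbers.length; i++) {
                stmt.clearBindings();
                stmt.bindLong(1, rawContactIds[i]); // primitive long: no boxing
                stmt.bindString(2, numbers[i]);
                stmt.executeInsert();
            }
            db.setTransactionSuccessful();
        } finally {
            db.endTransaction();
            stmt.close();
        }
        return numbers.length;
    }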
An example of how to override bulkInsert(), in order to speed up multiple inserts, can be found here.
@jcwenger At first, after reading your post, I thought that was the reason bulkInsert is quicker than applyBatch, but after reading the code of the Contacts Provider, I don't think so.
1. You said applyBatch uses transactions, yes, but bulkInsert also uses transactions. Here is the code of it:
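    // Paraphrase of the Froyo-era SQLiteContentProvider.bulkInsert (the
    // contacts provider's base class), not a verbatim quote: the whole
    // array is wrapped in one transaction, yielding between rows.
    @Override
    public int bulkInsert(Uri uri, ContentValues[] values) {
        mDb = mOpenHelper.getWritableDatabase();
        mDb.beginTransactionWithListener(this);
        try {
            for (ContentValues value : values) {
                insertInTransaction(uri, value);
                mDb.yieldIfContendedSafely(); // note: the no-delay overload
            }
            mDb.setTransactionSuccessful();
        } finally {
            mDb.endTransaction();
        }
        onEndTransaction();
        return values.length;
    }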
That is to say, bulkInsert also uses transactions, so I don't think that's the reason.
2. You said bulkInsert applies a whole pile of values to the same table. I'm sorry, but I can't find the related code in the Froyo source code, and I want to know how you found that. Could you tell me?
The reason, I think, is this: bulkInsert uses mDb.yieldIfContendedSafely() while applyBatch uses mDb.yieldIfContendedSafely(SLEEP_AFTER_YIELD_DELAY) /* SLEEP_AFTER_YIELD_DELAY = 4000 */.
After reading the code of SQLiteDatabase.java, I found that if you set a time in yieldIfContendedSafely it will sleep, but if you don't set a time it will not sleep. You can refer to the code below, which is a piece of SQLiteDatabase.java:
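    // Condensed paraphrase of Froyo's SQLiteDatabase.yieldIfContendedHelper(),
    // not a verbatim quote. yieldIfContendedSafely() passes
    // sleepAfterYieldDelay = -1, so it never reaches the sleep branch;
    // the (long) overload passes 4000 ms.
    private boolean yieldIfContendedHelper(boolean checkFullyYielded, long sleepAfterYieldDelay) {
        if (mLock.getQueueLength() == 0) {
            return false; // nobody is waiting: keep the lock, don't yield
        }
        setTransactionSuccessful();
        endTransaction(); // commit and release the lock so waiters can run
        if (sleepAfterYieldDelay > 0) {
            // Sleep in small quanta, up to sleepAfterYieldDelay ms,
            // waking up periodically to check for contention.
            long remainingDelay = sleepAfterYieldDelay;
            while (remainingDelay > 0 && mLock.getQueueLength() > 0) {
                try {
                    Thread.sleep(Math.min(remainingDelay, SLEEP_AFTER_YIELD_QUANTUM));
                } catch (InterruptedException e) {
                    Thread.interrupted();
                }
                remainingDelay -= SLEEP_AFTER_YIELD_QUANTUM;
            }
        }
        beginTransaction(); // re-open a transaction and continue
        return true;
    }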
I think that's the reason bulkInsert is quicker than applyBatch.
Any questions, please contact me.
Use ContentResolver.bulkInsert(Uri url, ContentValues[] values) instead of applyBatch().
applyBatch (1) uses transactions and (2) locks the ContentProvider once for the whole batch instead of locking/unlocking once per operation. Because of this, it is slightly faster than doing the operations one at a time (non-batched).
However, since each Operation in the Batch can have a different URI and so on, there's a huge amount of overhead. "Oh, a new operation! I wonder what table it goes in... Here, I'll insert a single row... Oh, a new operation! I wonder what table it goes in..." ad infinitum. Since most of the work of turning URIs into tables involves lots of string comparisons, it's obviously very slow.
By contrast, bulkInsert applies a whole pile of values to the same table. It goes, "Bulk insert... find the table, okay, insert! insert! insert! insert! insert!" Much faster.
It will, of course, require your ContentProvider to implement bulkInsert efficiently. Most do, unless you wrote it yourself, in which case it will take a bit of coding.
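As a caller-side sketch (the account values are placeholders), the switch amounts to grouping your rows per table into a ContentValues[] and handing each array to the resolver in one call:

    import android.content.ContentResolver;
    import android.content.ContentValues;
    import android.provider.ContactsContract.RawContacts;

    // Sketch: one bulkInsert call per table instead of one operation per row.
    static void insertRawContacts(ContentResolver resolver, int count) {
        ContentValues[] rows = new ContentValues[count];
        for (int i = 0; i < count; i++) {
            ContentValues v = new ContentValues();
            v.put(RawContacts.ACCOUNT_TYPE, "com.example.account"); // placeholder
            v.put(RawContacts.ACCOUNT_NAME, "example");             // placeholder
            rows[i] = v;
        }
        resolver.bulkInsert(RawContacts.CONTENT_URI, rows); // one call, one table
    }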