插入忽略或插入不在其中的位置
我有一个 900 万行的表,由于其庞大的规模,我正在努力处理所有这些数据。
我想要做的是将导入 CSV 添加到表中而不覆盖数据。
在我做这样的事情之前; INSERT if not in(select email from tblName where source = "number" and email != "email") INTO (email...) VALUES ("email"...)
但我担心我会崩溃再次服务器。我希望能够将 10,000 行插入到表中,但前提是它不在源=“number”的表中。
否则我会在电子邮件列上使用唯一的。
简而言之,我想通过检查两件事尽快插入而不向表中引入重复项。如果 email != "email" AND source != "number" 则插入表中,否则不执行任何操作。我也不想要错误报告。
我很抱歉我的措辞不好,而且这个问题听起来有点愚蠢。
我只是很难适应无法通过下载备份并上传(如果出现问题)来测试数据。我讨厌大数据集:)
谢谢大家的宝贵时间 -大事
I have a 9 million rows table and I'm struggling to handle all this data because of its sheer size.
What I want to do is add IMPORT a CSV to the table without overwriting data.
Before I would of done something like this; INSERT if not in(select email from tblName where source = "number" and email != "email") INTO (email...) VALUES ("email"...)
But I'm worried that I'll crash the server again. I want to be able to insert 10,000s of rows into a table but only if its not in the table with source = "number".
Otherwise I would of used unique on the email column.
In short, I want to INSERT as quickly as possible without introducing duplicates to the table by checking two things. If email != "email" AND source != "number" then insert into table otherwise do nothing. And I dont want errors reports either.
I'm sorry for my bad wording and the question sounding a little silly.
I'm just having a hard time adabting to not been able to test it out on the data by downloading backups and uploading if it goes wrong. I hate large datasets :)
Thank-you all for your time
-BigThings
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果这些字段上有唯一键,则可以使用 LOAD DATA INFILE 和 IGNORE 选项。它比逐行插入更快,也比多重插入更快。
看看 http://dev.mysql.com/doc/refman /5.1/en/load-data.html
If you have unique keys on these fields you can use LOAD DATA INFILE with IGNORE option. It's faster then inserting row by row, and is faster then multi-insert as well.
Look at http://dev.mysql.com/doc/refman/5.1/en/load-data.html
对
email
和source
列设置UNIQUE
约束。然后执行:
INSERT IGNORE
不会通知您任何类型的错误。我不会推荐它。我也不推荐INSERT ... WHERE NOT IN
。 MySQL 已经为此提供了很好的优化功能。这就是为什么INSERT ... ON DUPLICATE KEY UPDATE
存在的原因。Set a
UNIQUE
constraint onemail
andsource
columns.Then do:
INSERT IGNORE
will not notify you of any kind of error. I would not recommend it. Neither would I recommendINSERT ... WHERE NOT IN
. MySQL has an already well optimized functionality for that. That's whyINSERT ... ON DUPLICATE KEY UPDATE
is there.