Efficient way to find which values in a CSV are not in the database?
A vendor is feeding us a CSV file of their products. A particular column in the file (e.g. column 3) is the style number. The file has thousands of entries.
We have a database table of products with a column called manufacturer_num, which holds the vendor's style number.
I need to find which of the vendor's products we do not currently have.
I know I can loop through each line in the CSV file, extract the style_number, and check whether it is in our database. But then I am making a call to the database for each line. That would be thousands of calls, which I think is inefficient.
I could also build a list of the style numbers (either as a string or array) to make one DB call.
Something like: WHERE manufacturer_num IN (...)
But won't PHP run out of memory if the list is too big? And actually this would give me the ones we do have, not the ones we don't have.
What's an efficient way to do this?
Comments (1)
Bulk load the CSV into a temporary table, do a LEFT JOIN from that table to the products table, then get the records where the RHS of the join is NULL. Those are the vendor style numbers that are not in the database yet.
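The temporary-table approach above can be sketched roughly as follows. This is only an illustration under several assumptions: the question does not name the database engine, so an in-memory SQLite database stands in for it; the table name products and the column manufacturer_num come from the question, while vendor_styles, style_number, find_missing_styles, and the sample rows are hypothetical names and data made up for the example. It also assumes the CSV has no header row and that the style number is the third column. Python is used here only because it ships with SQLite; the same SQL shape should carry over to a PHP database layer.

```python
import csv
import io
import sqlite3


def find_missing_styles(conn, csv_file, style_col=2):
    """Return style numbers found in the CSV but not in products.manufacturer_num.

    style_col is the zero-based index of the style-number column
    (2 means "column 3", as in the question).
    """
    cur = conn.cursor()

    # Hypothetical temporary table that holds only the vendor's style numbers.
    cur.execute("CREATE TEMP TABLE vendor_styles (style_number TEXT PRIMARY KEY)")

    # Stream rows from the CSV with a generator, so the whole file is never
    # held in memory at once. Rows too short to have the column are skipped.
    reader = csv.reader(csv_file)
    rows = ((row[style_col].strip(),) for row in reader if len(row) > style_col)
    cur.executemany(
        "INSERT OR IGNORE INTO vendor_styles (style_number) VALUES (?)", rows
    )

    # LEFT JOIN keeps every vendor row; where no product matched, the
    # right-hand side columns are NULL, and those are the missing styles.
    cur.execute(
        """
        SELECT v.style_number
        FROM vendor_styles AS v
        LEFT JOIN products AS p ON p.manufacturer_num = v.style_number
        WHERE p.manufacturer_num IS NULL
        ORDER BY v.style_number
        """
    )
    missing = [r[0] for r in cur.fetchall()]

    cur.execute("DROP TABLE vendor_styles")
    return missing


# Demo with made-up data: two styles are already in the database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, manufacturer_num TEXT)"
)
conn.executemany(
    "INSERT INTO products (manufacturer_num) VALUES (?)",
    [("A100",), ("B200",)],
)

sample_csv = io.StringIO(
    "Widget,Blue,A100\n"
    "Gadget,Red,C300\n"
    "Gizmo,Green,B200\n"
    "Doohickey,Black,D400\n"
)
missing = find_missing_styles(conn, sample_csv)
print(missing)  # ['C300', 'D400']
```

This way the comparison happens inside the database in a single query, which avoids both the thousands of per-row lookups and the very long IN (...) list, and it returns the styles you do not have rather than the ones you do. On a real server, the engine's own bulk loader (if it has one) would likely be faster than row-by-row inserts, and an index on manufacturer_num should help the join.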