Handling a large-array CSV import as a three-step process
I need some help with a project of mine. It is about a DVD database. At the moment I am planning to implement a CSV import function to bring in DVDs, with all of their information, from a file.
I will do this in three steps.
Step 1
- show the data I want to import, building the array
- import the data, building the session arrays
Step 2
- edit the information
Step 3
- show the result before the update
- update the data
So far it works, but I have a problem with large files. The CSV data has 20 columns (title, genre, plot, etc.), and for each line in the CSV I create some arrays to use in the next steps.
When I have more than about 500 lines, the browser often freezes during the import. I get no response.
So now I am trying to do this as an AJAX call process. The advantage is that I can define how many rows the system handles per call, and the user can see that the system is still working, like a status bar when downloading/uploading a file.
At the moment I am trying to find a useful example illustrating how I can do this, but so far I have not found anything helpful.
Maybe you have some tips or an example of how this could work, say processing 20 lines per call and building the array.
After that I would like to use the same function to build the session arrays used in the next step, and so on.
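A per-call chunked import could be sketched along these lines. This is a minimal sketch, not the poster's actual code: the function name `processChunk` and the idea of tracking the offset across calls (e.g. in `$_SESSION`) are assumptions.

```php
<?php
// processChunk: read at most $chunkSize rows of a CSV file, starting at
// $offset, and return them together with the new offset. One AJAX request
// would call this once, store the returned offset, and request the next
// chunk until 'done' is true.
function processChunk(string $file, int $offset, int $chunkSize = 20): array
{
    $handle = fopen($file, 'r');

    // Skip the rows already handled by earlier calls.
    for ($i = 0; $i < $offset && fgetcsv($handle) !== false; $i++);

    $rows = [];
    while (count($rows) < $chunkSize && ($row = fgetcsv($handle)) !== false) {
        $rows[] = $row;
    }
    fclose($handle);

    return [
        'rows'   => $rows,
        'offset' => $offset + count($rows),
        'done'   => count($rows) < $chunkSize,  // short read => end of file
    ];
}
```

The AJAX endpoint would then `json_encode()` the offset and `done` flag back to the browser, which updates a progress bar and fires the next request; the session arrays are appended to on each call.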
Some information:
I use fgetcsv() to read the rows from the file. I go through the rows, and for each column I have different checks, e.g. whether the item id is unique, whether the title exists, whether the description exists, etc.
So if one of these values is missing, I get an error telling me in which row and column it occurred.
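The per-row checks described above could look roughly like this. The column positions (0 = item id, 1 = title, 2 = description) and the function name are assumptions for illustration only:

```php
<?php
// validateRow: check one parsed CSV row and collect error messages that
// name the row and column, mirroring the "which row and column" reporting
// described above. $seenIds holds the item ids of rows already accepted.
function validateRow(array $row, int $lineNo, array $seenIds): array
{
    $errors = [];

    if (($row[0] ?? '') === '') {
        $errors[] = "line $lineNo, column 1: item id missing";
    } elseif (in_array($row[0], $seenIds, true)) {
        $errors[] = "line $lineNo, column 1: item id not unique";
    }
    if (($row[1] ?? '') === '') {
        $errors[] = "line $lineNo, column 2: title missing";
    }
    if (($row[2] ?? '') === '') {
        $errors[] = "line $lineNo, column 3: description missing";
    }

    return $errors;
}
```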
I'd appreciate any help.
Comments (2)
Use the 'LOAD DATA INFILE' syntax. I've used it on files upwards of 500 MB with 3 million rows, and it takes seconds, not minutes.
http://dev.mysql.com/doc/refman/5.0/en/load-data.html
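For reference, a statement in that style might look as follows. The table and column names (`dvd`, `item_id`, etc.), the file path, and the delimiter settings are assumptions; they have to match your actual schema and CSV format:

```sql
-- Hypothetical table/column names; only 4 of the 20 columns are shown.
LOAD DATA LOCAL INFILE '/path/to/dvds.csv'
INTO TABLE dvd
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES                      -- skip the header row
(item_id, title, genre, plot);
```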
虽然这不是您正在寻找的直接答案,但
500 行不应该花费太长时间来处理,所以..这是您的另一个想法。
创建具有正确字段结构的临时表
然后,您可以使用 select 语句从中提取情节、流派等的各种独特条目,而不是一路制作一堆数组
mysql 导入数据的速度会非常快,
然后您可以根据需要对其进行编辑,最后将来自临时但现已验证的表的数据插入最终表中。
就使用ajax执行此操作而言,您必须执行重复的定时事件来刷新状态,问题是而不是20行,它需要是一个特定的时间段,因为您的浏览器无法知道,假设csv 已上传,您可以按 20 行块对其进行处理。
如果你在一个很大的文本框中输入 csv,你可以通过获取前 20 行,将剩余的传递到下一页等来处理,这会让我觉得潜在的混乱。
因此,虽然我知道我没有直接回答你的问题,但我希望我能够为你提供思考的替代方案和可能更实用的替代方案
While this is not the direct answer you were looking for:
500 lines shouldn't take too long to process, so here's another thought for you.
Create a temporary table with the right structure of fields.
The MySQL import of your data will be very fast,
and you can then use SELECT statements to extract the various unique entries for the plot, genre, etc. from it, rather than building a bunch of arrays along the way.
You can then edit it as required, and finally insert the data from your temporary, but now validated, table into your final table.
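The temporary-table flow above could be sketched like this. All table and column names are assumptions, and the validation conditions stand in for whatever checks the import actually needs:

```sql
-- Staging table with the same structure as the real one.
CREATE TEMPORARY TABLE dvd_import LIKE dvd;

-- Fast bulk import into the staging table.
LOAD DATA LOCAL INFILE '/path/to/dvds.csv'
INTO TABLE dvd_import
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
IGNORE 1 LINES;

-- Pull the distinct genres with a query instead of building PHP arrays.
SELECT DISTINCT genre FROM dvd_import;

-- Finally move only the validated rows into the real table.
INSERT INTO dvd (item_id, title, genre, plot)
SELECT item_id, title, genre, plot
FROM dvd_import
WHERE item_id <> '' AND title <> '' AND plot <> '';
```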
In terms of doing it with AJAX, you would have to use a repeating timed event to refresh the status. The problem is that, rather than 20 lines, it would need to be a specific time period, because your browser has no way of knowing the server's progress, assuming the CSV has been uploaded and you are processing it in 20-line chunks.
If you entered the CSV into a big textbox, you could work through it by taking the first 20 lines and passing the remainder on to the next page, and so on, but that strikes me as a potential mess.
So, while I know I haven't answered your question directly, I hope I have given you food for thought about alternative and possibly more practical approaches.