从 Excel 插入/更新 Doctrine 对象
在我目前正在工作的项目中,我必须读取一个 Excel 文件(超过 1000 行),提取所有文件并插入/更新到数据库表。
就性能而言,最好将所有记录添加到
Doctrine_Collection
中,并在使用fromArray()
方法后插入/更新它们,对吧?另一种可能的方法是为每一行创建一个新对象(Excel 行将是一个对象)并保存它,但我认为它在性能方面最差。每次上传 Excel 时,都需要将其行与数据库中的现有对象进行比较。如果该行不作为对象存在,则应插入,否则更新。我的第一个方法是将对象和行都转换为数组(或 Doctrine_Collections);然后在执行所需操作之前比较两个数组。
谁能建议我任何其他可能的方法?
On the project which I am currently working, I have to read an Excel file (with over a 1000 rows), extract all them and insert/update to a database table.
in terms of performance, is better to add all the records to a
Doctrine_Collection
and insert/update them after using thefromArray()
method, right? One other possible approach is to create a new object for each row (a Excel row will be a object) and them save it but I think its worst in terms of performance.Every time the Excel is uploaded, it is necessary to compare its rows to the existing objects on the database. If the row does not exist as object, should be inserted, otherwise updated. My first approach was turn both object and rows into arrays (or
Doctrine_Collections
); then compare both arrays before implementing the needed operations.
Can anyone suggest me any other possible approach?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我们最近在一个项目中使用 CSV 数据做了一些这样的事情。这相当无痛。有一个 symfony 插件 tmCsvPlugin,但我们对其进行了相当多的扩展,因此插件存储库中的版本已经过时了。必须将其添加到 @TODO 列表中:)
问题 1:
我没有明确了解性能,但我猜测将记录添加到 Doctrine_Collection 然后调用 Doctrine_Collection::save() 会是最巧妙的方法。我确信如果某个地方抛出异常并且您必须回滚上次保存,这会很方便。
问题 2:
如果您可以使用行字段作为唯一标识符,(让我们假设一个用户名),那么您可以搜索现有记录。如果找到一条记录,并假设导入的行是一个数组,请使用 Doctrine_Record::synchronizeWithArray() 更新该记录;然后将其添加到 Doctrine_Collection 中。完成后,只需调用 Doctrine_Collection::save()
一个相当粗糙的“n”就绪实现:
相当粗糙,但这样的东西对我来说效果很好。这是假设您可以以某种方式使用导入的行数据作为唯一标识符。
注意:如果您使用的是 sf1.2/doctrine 1.0,请警惕 SynchronizeWithArray() - 如果我没记错的话,它没有正确实现。不过它在原则 1.2 中工作得很好。
We did a bit of this in a project recently, with CSV data. it was fairly painless. There's a symfony plugin tmCsvPlugin, but we extended this quite a bit since so the version in the plugin repo is pretty out of date. Must add that to the @TODO list :)
Question 1:
I don't explicitly know about performance, but I would guess that adding the records to a Doctrine_Collection and then calling Doctrine_Collection::save() would be the neatest approach. I'm sure it would be handy if an exception was thrown somewhere and you had to roll back on your last save..
Question 2:
If you could use a row field as a unique indentifier, (let's assume a username), then you could search for an existing record. If you find a record, and assuming that your imported row is an array, use Doctrine_Record::synchronizeWithArray() to update this record; then add it to a Doctrine_Collection. When complete, just call Doctrine_Collection::save()
A fairly rough 'n' ready implementation:
Pretty rough but something like this worked well for me. This is assuming that you can use your imported row data in some way to serve as a unique identifier.
NOTE: be wary of synchronizeWithArray() if you're using sf1.2/doctrine 1.0 - if I remember correctly it was not implemented correctly. it works fine in doctrine 1.2 though.
我从未从事过Doctrine_Collections的工作,但我可以从更广泛意义上的数据库查询和代码逻辑方面进行回答。我将应用以下逻辑:-
在单个查询中从数据库中获取 Excel 工作表的所有行并将它们存储在数组
$uploadedSheet
中。创建上传的 Excel 工作表的所有行的单个数组,将其命名为
$storedSheet
。我猜想 Doctrine_Collections$uploadedSheet
和$storedSheet
的结构会类似(都是二维的 - 行、单元格都可以识别和比较)。3.在
$uploadedSheet
上运行 foreach 循环,如下所示,仅识别需要插入的行和要更新的行(稍后进行实际查询)-4. 这样您就有了两个数组。现在执行 2 个数据库查询 -
批量插入
$uploadedSheet
的所有行,其编号存储在$rowsToBeInserted
数组中。批量更新
$uploadedSheet
的所有行,其编号存储在$rowsToBeUpdated
数组中。这些批量查询是提高性能的关键。
让我知道这是否有帮助,或者您想了解其他信息。
I have never worked on Doctrine_Collections, but I can answer in terms of database queries and code logic in a broader sense. I would apply the following logic:-
Fetch all the rows of the excel sheet from database in a single query and store them in an array
$uploadedSheet
.Create a single array of all the rows of the uploaded excel sheet, call it
$storedSheet
. I guess the structures of the Doctrine_Collections$uploadedSheet
and$storedSheet
will be similar (both two-dimensional - rows, cells can be identified and compared).3.Run foreach loops on the
$uploadedSheet
as follows and only identify the rows which need to be inserted and which to be updated (do actual queries later)-4. This way you have two arrays. Now perform 2 database queries -
bulk insert all those rows of
$uploadedSheet
whose numbers are stored in$rowsToBeInserted
array.bulk update all the rows of
$uploadedSheet
whose numbers are stored in$rowsToBeUpdated
array.These bulk queries are the key to faster performance.
Let me know if this helped, or you wanted to know something else.