Optimizing the reordering and deduplication of a multidimensional array
I'm wondering whether anyone has any good ideas for optimizing the following code. I have a multidimensional array ($List) as follows:
Array
(
    [0] => Array
        (
            [id] => 1
            [title] => A good read
            [priority] => 10
        )
    [1] => Array
        (
            [id] => 2
            [title] => A bad read
            [priority] => 20
        )
    [2] => Array
        (
            [id] => 3
            [title] => A good read
            [priority] => 10
        )
)
First I'm removing any entries that share the same title (no matter what the other values are) as follows:
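$List_new = array();
foreach ($List as $val) {
    // Key by title: a later entry with the same title overwrites the earlier one.
    $List_new[$val['title']] = $val;
}
// Drop the title keys and reindex from 0.
$List = array_values($List_new);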
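For comparison, here is a minimal sketch of the same "last entry per title wins" step without an explicit loop. It assumes PHP 5.5 or later, where array_column() is available; passing null as the column key keeps each whole row, and 'title' becomes the index. Whether this is actually faster than the foreach on 500,000 records is not something the sketch claims; it would need measuring.

```php
<?php
// Sample data copied from the question.
$List = array(
    array('id' => 1, 'title' => 'A good read', 'priority' => 10),
    array('id' => 2, 'title' => 'A bad read',  'priority' => 20),
    array('id' => 3, 'title' => 'A good read', 'priority' => 10),
);

// Index whole rows by title; rows with a repeated title overwrite earlier ones.
// array_values() then discards the title keys and reindexes from 0.
$List = array_values(array_column($List, null, 'title'));
```

With the sample data, the row with id 3 replaces the row with id 1, leaving two entries.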
Once the duplicates are gone, I reorder the array, first by the priority field and then by id (both descending):
// Reset the sort columns on every pass of the outer loop.
$sort_id = array();
$sort_priority = array();
foreach ($List as $key => $row) {
    $sort_id[$key] = $row['id'];
    $sort_priority[$key] = $row['priority'];
}
// Sort $List by priority DESC, then by id DESC.
array_multisort($sort_priority, SORT_DESC, $sort_id, SORT_DESC, $List);
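The two sort columns can also be extracted with array_column() rather than a foreach, which removes the need to clear them by hand on each pass. This is only a sketch, assuming PHP 5.5 or later; the sample rows below are made up for illustration and are not from the question.

```php
<?php
// Illustrative, already-deduplicated sample data (hypothetical rows).
$List = array(
    array('id' => 3, 'title' => 'A good read',  'priority' => 10),
    array('id' => 2, 'title' => 'A bad read',   'priority' => 20),
    array('id' => 5, 'title' => 'Another read', 'priority' => 10),
);

// Pull each sort column out in one call. The variables are rebuilt on every
// pass, so nothing has to be reset beforehand.
$sort_priority = array_column($List, 'priority');
$sort_id       = array_column($List, 'id');

// Same call as in the question: priority DESC, then id DESC.
array_multisort($sort_priority, SORT_DESC, $sort_id, SORT_DESC, $List);
```

The columns are assigned to variables first because array_multisort() takes its array arguments by reference.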
Both code blocks appear in a loop, hence the clearing of $sort_id and $sort_priority before reordering.
Is there a better way to do this - i.e. use the sorting process to remove duplicate title entries? This code block is being executed in a loop of up to 500,000 records and so any improvement would be welcome!
Comments (1)
One loop, but a few extra function calls, so I can't tell you how the Big O changes. One thing to note: the padding around the numbers must be wide enough to prevent overflow, i.e. a width of 2 allows at most 99 priorities and a width of 6 allows at most 999,999 items.
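The code this answer refers to is not shown here, so the following is only a guess at what a single-loop, padded-key approach could look like, not the answer's original code. The function name dedupe_and_sort, its parameters, and the 'k' key prefix are all hypothetical. It assumes ids are unique, non-negative integers and that priorities fit within the padding width.

```php
<?php
// Hypothetical helper: dedupe by title and sort by priority DESC, id DESC
// using one loop plus a single key sort.
function dedupe_and_sort(array $list, int $priorityWidth = 2, int $idWidth = 6): array
{
    $byKey = array();       // sortable key => row
    $keyForTitle = array(); // title => sortable key currently stored for it

    foreach ($list as $row) {
        $title = $row['title'];
        // A later row with the same title replaces the earlier one.
        if (isset($keyForTitle[$title])) {
            unset($byKey[$keyForTitle[$title]]);
        }
        // Zero-pad both numbers so string order matches numeric order.
        // The 'k' prefix stops PHP from casting all-digit keys to integers.
        $format = 'k%0' . $priorityWidth . 'd%0' . $idWidth . 'd';
        $key = sprintf($format, $row['priority'], $row['id']);
        $byKey[$key] = $row;
        $keyForTitle[$title] = $key;
    }

    // Descending key order gives priority DESC, then id DESC.
    krsort($byKey, SORT_STRING);
    return array_values($byKey);
}

// Sample data copied from the question.
$List = array(
    array('id' => 1, 'title' => 'A good read', 'priority' => 10),
    array('id' => 2, 'title' => 'A bad read',  'priority' => 20),
    array('id' => 3, 'title' => 'A good read', 'priority' => 10),
);
$result = dedupe_and_sort($List);
```

If a priority needs more digits than $priorityWidth, or an id more than $idWidth, the keys stop sorting correctly, which is the overflow the answer warns about.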
Edit: made some minor modifications.