Large static arrays in PHP
I have two arrays: one is a node-node-cost array [a_node,b_node,cost] which has 8000 entries, and the other is an association of nodes with coordinates [node,x,y] which also has around 8000 entries. Performance-wise, is it better to keep these two as static arrays, or to store them in a database and build the arrays from it?
These two arrays will be used to run a shortest path algorithm.
It sounds like you'll probably want to use a dynamic programming solution, where you calculate a small portion of the problem on each iteration and store the intermediate results; this lets you compute the shortest path once all of the intermediate calculations are done.
I would suggest storing all of your information in a database and selecting out a subset of records (maybe 100 at a time?). Calculate the intermediate information for each node and store it back to the database. If you are going to reuse this path information thousands or millions of times, you don't want to be recalculating it constantly. You only want to recalculate when the graph changes.
My preferred shortest path algorithm is Dijkstra's algorithm (http://en.wikipedia.org/wiki/Dijkstra's_algorithm), and it lends itself to an efficient dynamic programming solution.
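The approach above can be sketched in PHP directly over an edge list shaped like the question's [a_node,b_node,cost] triples. This is a minimal, illustrative sketch, not the answerer's actual code; the function name and the sample edges are invented, and in practice the array would hold the question's ~8000 entries.

```php
<?php
// Minimal Dijkstra over [a_node, b_node, cost] triples.
// Assumes non-negative edge costs; sample data is made up.

/**
 * Returns the cost of the cheapest path from $source to $target,
 * or INF if no path exists.
 */
function shortestPathCost(array $edges, $source, $target): float
{
    // Build an adjacency list: node => [neighbour => cost, ...]
    $graph = [];
    foreach ($edges as [$a, $b, $cost]) {
        $graph[$a][$b] = $cost;
        $graph[$b][$a] = $cost; // drop this line for a directed graph
    }

    $dist  = [$source => 0.0];
    $queue = new SplPriorityQueue();
    // SplPriorityQueue pops the highest priority first, so negate costs.
    $queue->insert($source, 0);

    while (!$queue->isEmpty()) {
        $node = $queue->extract();
        if ($node === $target) {
            return $dist[$node];
        }
        foreach ($graph[$node] ?? [] as $next => $cost) {
            $candidate = $dist[$node] + $cost;
            if (!isset($dist[$next]) || $candidate < $dist[$next]) {
                $dist[$next] = $candidate;
                $queue->insert($next, -$candidate);
            }
        }
    }
    return INF;
}

// Tiny worked example: A-B-D (cost 3) beats the direct A-D edge (cost 10).
$edges = [
    ['A', 'B', 1],
    ['B', 'D', 2],
    ['A', 'D', 10],
];
echo shortestPathCost($edges, 'A', 'D'), "\n"; // prints 3
```

To follow the answer's precompute-and-store suggestion, you would run this per node (or per batch of nodes) and write the resulting distances back to the database, recomputing only when the graph changes.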
If you had tested it you would have found the answer for yourself: a lot depends on how you are populating the arrays and how often you are dereferencing elements from an existing array. The former is massively faster when the data is stored and manipulated with a database. The latter is more of a grey area, but a database can still prove faster in most cases.
(Unless this is your homework; given the number of nodes, I'd recommend considering a non-deterministic approach. A genetic algorithm is an obvious candidate.)
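For the database route both answers lean towards, the edge list can be pulled into a PHP array with PDO. The sketch below is an assumption-laden illustration, not code from either answer: it uses an in-memory SQLite database so it is self-contained, and the `edges` table name and its columns are invented to match the question's triple shape.

```php
<?php
// Sketch: keep the edges in a database table and fetch them into a PHP
// array on demand. Uses in-memory SQLite for self-containment; a real
// setup would point PDO at the application's database instead.

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE edges (a_node TEXT, b_node TEXT, cost REAL)');

// Seed a few sample rows; in practice this table would hold ~8000 entries.
$insert = $db->prepare('INSERT INTO edges (a_node, b_node, cost) VALUES (?, ?, ?)');
foreach ([['A', 'B', 1], ['B', 'D', 2], ['A', 'D', 10]] as $row) {
    $insert->execute($row);
}

// Fetch the whole edge list as [a_node, b_node, cost] triples — the same
// shape as the static array the question describes.
$edges = $db->query('SELECT a_node, b_node, cost FROM edges')
            ->fetchAll(PDO::FETCH_NUM);

echo count($edges), "\n"; // prints 3
```

Timing this fetch against a hard-coded static array (e.g. with `microtime(true)` around each) is the test the second answer suggests running before deciding.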