Storing information on points in a 3d space
I'm writing some code (just for fun so far) in Python that will store some data on every point in a 3d space. I'm basically after a 3d matrix object that stores arbitrary objects and allows me to do some advanced selections, like:
- Get the point where x=1,y=2,z=3.
- Getting all points where y=2.
- Getting all points within 3 units of position x=1,y=2,z=3.
- Getting all points where point.getType() == "Foo"
In all of the above, I'd need to end up with some sort of output that would give me the original position in the space, and the data stored at that point.
Apparently numpy can do what I want, but it seems highly optimised for scientific computing and working out how to get the data like I want above has so far eluded me.
Is there a better alternative or should I return to banging my head on the numpy wall? :)
EDIT: some more info that the first three answers made me realise I should include: I'm not worried about performance; this is purely a proof of concept where I'd prefer clean code to good performance. I will also have data for every point in the given 3d space, so I guess a sparse matrix is a bad fit?
Comments (8)
Here's another common approach
Let's look at your use cases.
Get the point where x=1,y=2,z=3.
Getting all points where y=2.
Getting all points within 3 units of position x=1,y=2,z=3.
Getting all points where point.getType() == "Foo"
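A minimal sketch of one such approach, assuming each point is a small object kept in a plain list and queried with list comprehensions. The `Point` class, its `kind` field, the `database` list and the sample data are illustrative names, not something the answer confirms:

```python
from dataclasses import dataclass
from math import dist  # available from Python 3.8


@dataclass
class Point:
    """Hypothetical point type: a position plus the data stored there."""
    x: int
    y: int
    z: int
    kind: str

    def getType(self):
        return self.kind


# Assumed sample data: every point of a 5x5x5 grid.
database = [
    Point(x, y, z, "Foo" if (x + y + z) % 2 == 0 else "Bar")
    for x in range(5) for y in range(5) for z in range(5)
]

# Get the point where x=1,y=2,z=3.
exact = [p for p in database if (p.x, p.y, p.z) == (1, 2, 3)]

# Getting all points where y=2.
plane = [p for p in database if p.y == 2]

# Getting all points within 3 units of position x=1,y=2,z=3.
near = [p for p in database if dist((p.x, p.y, p.z), (1, 2, 3)) <= 3]

# Getting all points where point.getType() == "Foo".
foos = [p for p in database if p.getType() == "Foo"]
```

Each result keeps the position and the stored data together, because both live on the same object.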
Well ... If you expect to really fill that space, then you're probably best off with a densely packed matrix-like structure, basically voxels.
If you don't expect to fill it, look into something a bit more optimized. I would start by looking at octrees, which are often used for things like this.
One advantage of numpy is that it is blazingly fast, e.g. calculating the pagerank of an 8000x8000 adjacency matrix takes milliseconds. Even though numpy.ndarray is built around numeric data, you can store number/id-to-object mappings in an external hash table, i.e. a dictionary (which again is a highly optimized data structure).
The slicing would be as easy as list slicing in Python:
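A minimal sketch of this idea, assuming a small grid of integer ids and a plain dict that maps each id to an object. The names `grid`, `objects` and the `Cell` class are illustrative, not part of the answer:

```python
import numpy as np


class Cell:
    """Hypothetical stand-in for the arbitrary object stored at a point."""

    def __init__(self, kind):
        self.kind = kind

    def getType(self):
        return self.kind


shape = (4, 4, 4)

# Each grid entry is an integer id, indexed as grid[x, y, z].
grid = np.arange(np.prod(shape)).reshape(shape)

# External id -> object mapping kept in an ordinary dictionary.
objects = {int(i): Cell("Foo" if i % 2 == 0 else "Bar") for i in grid.flat}

# The point at x=1,y=2,z=3: look up its id, then the object.
point_id = int(grid[1, 2, 3])
obj = objects[point_id]

# All points where y=2: slice out the ids, then map them to objects.
ids_y2 = grid[:, 2, :]
objs_y2 = [objects[int(i)] for i in ids_y2.flat]

# Recover the original position of an id.
pos = tuple(int(v) for v in np.unravel_index(point_id, shape))
```

Here the ids are simply the flat index of each cell, which is what lets `np.unravel_index` turn an id back into its `(x, y, z)` position; with arbitrary ids you would need to store positions separately.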
If you wrap some of your desired functions (distances) around a core matrix and an id-to-object mapping hash, you could have your application running within a short period of time.
Good luck!
Here's an approach that may work.
Each point is a 4-tuple (x,y,z,data), and your database looks like this:
Let's look at your use cases.
Get the point where x=1,y=2,z=3.
Getting all points where y=2.
Getting all points within 3 units of position x=1,y=2,z=3.
Getting all points where point.getType() == "Foo"
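The 4-tuple layout and the four use cases above can be sketched roughly as follows. The `Data` class and the sample contents of `database` are assumptions made for illustration; only the `(x, y, z, data)` shape comes from the answer:

```python
from math import dist  # available from Python 3.8


class Data:
    """Hypothetical payload stored at each point."""

    def __init__(self, kind):
        self.kind = kind

    def getType(self):
        return self.kind


# Assumed sample data: a list of (x, y, z, data) tuples for a 5x5x5 grid.
database = [
    (x, y, z, Data("Foo" if z == 0 else "Bar"))
    for x in range(5) for y in range(5) for z in range(5)
]

# Get the point where x=1,y=2,z=3.
exact = [(x, y, z, d) for x, y, z, d in database if (x, y, z) == (1, 2, 3)]

# Getting all points where y=2.
plane = [(x, y, z, d) for x, y, z, d in database if y == 2]

# Getting all points within 3 units of position x=1,y=2,z=3.
near = [(x, y, z, d) for x, y, z, d in database
        if dist((x, y, z), (1, 2, 3)) <= 3]

# Getting all points where point.getType() == "Foo".
foos = [(x, y, z, d) for x, y, z, d in database if d.getType() == "Foo"]
```

Every query scans the whole list, which is slow for large spaces but keeps the code short, in line with the stated preference for clean code over performance.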
You can do the first 2 queries with slicing in numpy:
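A minimal sketch of those two queries, assuming a 5x5x5 array indexed as `a[x, y, z]` and filled with placeholder values (the array name and contents are illustrative):

```python
import numpy as np

# Hypothetical 5x5x5 grid, indexed as a[x, y, z]; the values are placeholders.
a = np.arange(125).reshape(5, 5, 5)

# Get the point where x=1,y=2,z=3.
value = a[1, 2, 3]

# Getting all points where y=2: a 2-D array indexed as plane[x, z].
plane = a[:, 2, :]

# Pair each value with its original position in the 3d space.
points = [((x, 2, z), v) for (x, z), v in np.ndenumerate(plane)]
```

The slice itself drops the `y` axis, so the last line puts `y=2` back to recover full `(x, y, z)` positions.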
For the third one, if you mean "getting all points within a sphere of radius 3 centered at x=1,y=2,z=3", you will have to write a custom function to do that; if you want a cube, you can proceed with slicing, e.g.:
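Both variants can be sketched as follows, assuming a 10x10x10 array indexed as `a[x, y, z]`. The boolean-mask version of the sphere is one possible way to write the "custom function", not necessarily what the answer had in mind:

```python
import numpy as np

# Hypothetical 10x10x10 grid, indexed as a[x, y, z]; the values are placeholders.
a = np.arange(1000).reshape(10, 10, 10)
cx, cy, cz, r = 1, 2, 3, 3

# Cube: plain slicing. Lower bounds are clamped at 0 because a negative
# slice start would count from the end of the axis instead.
x0, y0, z0 = max(cx - r, 0), max(cy - r, 0), max(cz - r, 0)
cube = a[x0:cx + r + 1, y0:cy + r + 1, z0:cz + r + 1]

# Sphere: build coordinate grids and compare squared distances.
x, y, z = np.indices(a.shape)
mask = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= r ** 2

positions = np.argwhere(mask)  # (n, 3) array of x, y, z positions
values = a[mask]               # matching values, in the same order
```

`positions` and `values` line up row by row, which gives the original position together with the data stored there.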
For the fourth query, if the only data stored in your array is the cell type, you could encode it as integers:
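A minimal sketch of such an encoding, assuming two cell types and a hypothetical `TYPES` lookup table (both the table and the sample layout are made up for illustration):

```python
import numpy as np

# Hypothetical encoding of cell types as small integers.
TYPES = {"Foo": 0, "Bar": 1}

# Assumed sample data: everything is "Bar" except the y=2 plane.
types = np.full((5, 5, 5), TYPES["Bar"], dtype=np.uint8)
types[:, 2, :] = TYPES["Foo"]

# Getting all points whose type is "Foo": one (x, y, z) row per match.
foo_positions = np.argwhere(types == TYPES["Foo"])
```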
numpy looks like a good tool for doing what you want, as the arrays will be smaller in memory and easily accessible from C (or even better, Cython!), and the extended slicing syntax will save you from writing much code.
Using a dictionary with x,y,z tuples as keys is another solution, if you want a relatively simple solution with the standard library.
And because Python passes references around, you can alter mutable "points" in the returned dictionaries and have the original points change as well (I think).
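A minimal sketch of the dictionary approach, assuming the stored values are small mutable objects. The `Data` class, the `space` dictionary and the sample contents are illustrative names only:

```python
from itertools import product
from math import dist  # available from Python 3.8


class Data:
    """Hypothetical mutable payload stored at each point."""

    def __init__(self, kind):
        self.kind = kind

    def getType(self):
        return self.kind


# Assumed sample data: one entry per point of a 5x5x5 grid, keyed by (x, y, z).
space = {
    (x, y, z): Data("Foo" if x == 0 else "Bar")
    for x, y, z in product(range(5), repeat=3)
}

# Get the point where x=1,y=2,z=3.
point = space[1, 2, 3]

# The other queries return smaller dictionaries with the same keys and objects.
plane = {pos: d for pos, d in space.items() if pos[1] == 2}
near = {pos: d for pos, d in space.items() if dist(pos, (1, 2, 3)) <= 3}
foos = {pos: d for pos, d in space.items() if d.getType() == "Foo"}

# The filtered dictionaries hold references to the same objects, so a change
# made through one of them is visible in the original dictionary too.
plane[1, 2, 3].kind = "Baz"
changed = space[1, 2, 3].getType()
```

Note that this sharing only applies to mutable values; rebinding a key in a filtered dictionary (for example `plane[1, 2, 3] = Data("Qux")`) does not touch `space`.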
When to use Binary Space Partitioning, Quadtree, Octree?
A 3d array is, imo, worthless, especially if your world is dynamic. You should decide between a BSP, a quadtree or an octree. A BSP would do just fine. Since your world is in 3d, you need planes when splitting the BSP, not lines.
Cheers!
Edit
I guess a 3d array is alright if you always know how large your data set is and it never changes, i.e. no points are ever added that fall out of bounds. In that case you would have to resize the 3d array.
It depends upon the precise configuration of your system, but from the example you give you are using integers and discrete points, so it would probably be appropriate to consider Sparse Matrix data structures.