Does SPATIAL geometry index performance depend on the size and density of the geometries?

Posted 2024-10-28 05:36:20

Spatial Indexes

Given a spatial index, is the utility of the index, that is to say its overall performance, only as good as the geometries it contains?

For example, if I were to take a million geometries and insert them into a table so that they are densely located relative to one another, would the index perform better than it would for identical geometry shapes whose relative locations are significantly more sparse?

Question 1

For example, take these two geometry shapes.

Situation 1

LINESTRING(0 0,1 1,2 2)
LINESTRING(1 1,2 2,3 3)

Geometrically they are identical, but their coordinates are offset by one unit. Imagine this was repeated one million times.

Now take this situation,

Situation 2

LINESTRING(0 0,1 1,2 2)
LINESTRING(1000000 1000000,1000001 1000001,1000002 1000002)
LINESTRING(2000000 2000000,2000001 2000001,2000002 2000002)
LINESTRING(3000000 3000000,3000001 3000001,3000002 3000002)

In the above example:

the lines' dimensions are identical to those in Situation 1,
the lines have the same number of points,
the lines have identical sizes.

However,

the difference is that the lines are massively further apart.
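
To make the comparison concrete, here is a minimal Python sketch that computes the minimum bounding rectangle (MBR) of each line and checks whether two MBRs intersect. It assumes the index works on bounding rectangles, as R-tree style spatial indexes do, and it assumes the far-apart line in Situation 2 is meant to have the same shape as the lines in Situation 1. The helper names `parse_linestring`, `mbr` and `mbrs_intersect` are hypothetical, written only for this illustration.

```python
def parse_linestring(wkt):
    """Parse a simple 2D 'LINESTRING(x y,x y,...)' string into (x, y) tuples."""
    body = wkt[wkt.index("(") + 1 : wkt.rindex(")")]
    return [tuple(float(v) for v in pair.split()) for pair in body.split(",")]


def mbr(points):
    """Return the bounding rectangle as (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbrs_intersect(a, b):
    """True when two bounding rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Situation 1: lines offset by one unit.
near_a = mbr(parse_linestring("LINESTRING(0 0,1 1,2 2)"))
near_b = mbr(parse_linestring("LINESTRING(1 1,2 2,3 3)"))

# Situation 2: same shape, a million units away (assumed coordinates).
far_b = mbr(parse_linestring(
    "LINESTRING(1000000 1000000,1000001 1000001,1000002 1000002)"))

print(near_a, near_b, mbrs_intersect(near_a, near_b))
print(near_a, far_b, mbrs_intersect(near_a, far_b))
```

Every rectangle here is 2 by 2 units; only the overlap differs. Whether that difference translates into better or worse index performance is exactly what the question asks, and the sketch does not settle it.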

Why is this important to me?

The reason I ask is that I want to know whether I should remove as much precision from my input geometries as I possibly can, and reduce their density and closeness to each other as far as my application allows without losing accuracy.

Question 2

This question is similar to the first, but instead of concerning spatial closeness to other geometries, it asks: should the shapes themselves be reduced to the smallest possible shape that describes what the application requires?

For example, suppose I were to use a SPATIAL index on a geometry data type to index dates. If I wanted to store a date range between two dates, I could use a DATETIME data type in MySQL. However, what if I wanted to use a geometry type instead, so that I convey the date range by taking each individual date and converting it with unix_timestamp()?

For example:

 Date("1st January 2011") to Timestamp =  1293861600
 Date("31st January 2011") to Timestamp =  1296453600

Now, I could create a LINESTRING based on these two integers.

 LINESTRING(1293861600 0,1296453600 1)
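
As an illustration of the conversion described above, the following Python sketch builds the same LINESTRING from two dates. The function name `date_range_wkt` is hypothetical, and the UTC-6 offset is an assumption chosen only because it reproduces the timestamps quoted in the question; the original post does not state a time zone.

```python
from datetime import datetime, timedelta, timezone

# Assumption: UTC-6 is used only because it reproduces the quoted timestamps.
ASSUMED_TZ = timezone(timedelta(hours=-6))


def date_range_wkt(start, end):
    """Format two datetimes as a two-point LINESTRING of Unix timestamps."""
    start_ts = int(start.timestamp())
    end_ts = int(end.timestamp())
    return f"LINESTRING({start_ts} 0,{end_ts} 1)"


wkt = date_range_wkt(
    datetime(2011, 1, 1, tzinfo=ASSUMED_TZ),
    datetime(2011, 1, 31, tzinfo=ASSUMED_TZ),
)
print(wkt)  # LINESTRING(1293861600 0,1296453600 1)
```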

If my application is actually only concerned with days, and the number of seconds isn't important for date ranges at all, should I refactor my geometries so that they are reduced to the smallest possible size that still fulfils what the application needs?

So instead of "1293861600", I would use "1293861600" / (3600 * 24), which happens to be "14975.25".
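
The arithmetic can be checked with a short Python sketch. The names `to_days` and `day_range_wkt` are hypothetical; flooring to whole days is one possible reading of "only concerned with days", not something the question specifies. Note that the fractional .25 comes from the six-hour offset already baked into the timestamp.

```python
SECONDS_PER_DAY = 3600 * 24


def to_days(timestamp):
    """Convert a Unix timestamp in seconds to fractional days since the epoch."""
    return timestamp / SECONDS_PER_DAY


def day_range_wkt(start_ts, end_ts):
    """Build a LINESTRING using whole days (floored) instead of seconds."""
    start_day = start_ts // SECONDS_PER_DAY
    end_day = end_ts // SECONDS_PER_DAY
    return f"LINESTRING({start_day} 0,{end_day} 1)"


print(to_days(1293861600))                        # 14975.25
print(to_days(1296453600))                        # 15005.25
print(day_range_wkt(1293861600, 1296453600))      # LINESTRING(14975 0,15005 1)
```

The coordinate values shrink from ten digits to five, while the range still spans 30 days. Whether smaller coordinate values make any difference to the index is the open question here.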

Can someone help fill in these gaps?
