Latitude/longitude storage and compression in C
Does anyone know the most efficient representation for lat/long coordinates? The accuracy level should be enough for consumer GPS devices.
Most implementations seem to use double for each unit, but I suspect that a float or fixed-point format should be sufficient. I'd be curious to hear from anyone who has tried to compress and/or store large arrays of these values.
EDIT:
In other words, what's the minimum accuracy required to represent lat/long for a consumer-level device?
If you are storing large arrays of these values, a few simple delta-compression tricks can greatly reduce the size of the data stream: store deltas instead of full values.
You can do deltas from a "key point":
K D D D D D D D D D D K D D D D ...
k + d gets you to any D point.
The deltas all reference the previous K, so to reconstruct any point you need one K and one D.
Or you can do incremental deltas:
K I I I I I I I I I I K
This may take multiple sums to get to the desired position, but the data is smaller overall. So, to reconstruct:
k+i+i+i gets you to the 4th point.
Finally, you can combine both:
K D I I I D I I I D I I I K
This is like MPEG-2 with IPB frames, but this way you are never more than 4 sums away from any position, and you get some of the benefit of both delta and incremental compression.
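As a rough illustration of the incremental-delta layout (K I I I K I I I ...), here is a minimal C sketch. It assumes the coordinates have already been converted to 32-bit fixed-point integers and that consecutive samples are close enough for each delta to fit in 16 bits. The names KEY_INTERVAL, delta_encode and delta_decode_at are made up for this example, and the key interval of 4 is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_INTERVAL 4  /* hypothetical: one full key point every 4 samples */

/* Split n fixed-point samples into full-size keys and 16-bit incremental
   deltas. Returns 0 on success, -1 if a delta does not fit in 16 bits. */
static int delta_encode(const int32_t *in, size_t n,
                        int32_t *keys, int16_t *deltas)
{
    size_t k = 0, d = 0;
    assert(in != NULL && keys != NULL && deltas != NULL);
    for (size_t i = 0; i < n; i++) {
        if (i % KEY_INTERVAL == 0) {
            keys[k++] = in[i];                     /* K: stored in full */
        } else {
            int64_t diff = (int64_t)in[i] - in[i - 1];
            if (diff < INT16_MIN || diff > INT16_MAX)
                return -1;                         /* jump too large for 16 bits */
            deltas[d++] = (int16_t)diff;           /* I: difference to previous sample */
        }
    }
    return 0;
}

/* Reconstruct sample i: start at its key and add at most
   KEY_INTERVAL - 1 incremental deltas. */
static int32_t delta_decode_at(const int32_t *keys, const int16_t *deltas,
                               size_t i)
{
    size_t group = i / KEY_INTERVAL;
    size_t offset = i % KEY_INTERVAL;
    const int16_t *d = deltas + group * (KEY_INTERVAL - 1);
    int32_t value = keys[group];
    for (size_t j = 0; j < offset; j++)
        value += d[j];
    return value;
}
```

With this layout a group of four samples takes 4 + 3 * 2 = 10 bytes per coordinate instead of 16, and random access never needs more than three additions.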
The circumference of the Earth is approx. 40,000 km or 24,900 miles.
You need one-meter accuracy (3 ft) to be able to out-resolve GPS precision by an order of magnitude.
Therefore you need enough precision to store 40,000,000 different values. That's at minimum 26 bits of information. A 32-bit float or int will do well.
EDIT: added some points from comments; 32-bit values should be capable of offering enough precision.
I would use a 32-bit fixed-point representation. If the values are 42.915512, -99.521654, I would store the values * 100000 in int32_t's (they can be negative).
This is a good compromise between simple and accurate (5 decimal places is usually good enough; you could always bump the multiplier up to 1000000 to get 6 if needed).
To display to the user, do what caf suggests:
These will also be comparable/sortable in an efficient way, since the relative ordering will be preserved.
EDIT: an additional benefit is that the values can be sent over a network or stored to disk in a binary format in a portable way.
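The fixed-point idea above could be sketched in C roughly as follows. This is not caf's code (that snippet did not survive on this page); coord_to_fixed, fixed_to_coord and format_coord are hypothetical names, and the scale of 100000 simply follows the 5-decimal-place example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define COORD_SCALE 100000  /* 5 decimal places, as in the example values */

/* Degrees -> scaled integer, rounding to the nearest step. */
static int32_t coord_to_fixed(double degrees)
{
    double scaled = degrees * COORD_SCALE;
    return (int32_t)(scaled + (scaled < 0 ? -0.5 : 0.5));
}

/* Scaled integer -> degrees. */
static double fixed_to_coord(int32_t fixed)
{
    return (double)fixed / COORD_SCALE;
}

/* Print using integer arithmetic only. The sign is handled separately so
   that values between -1 and 0 degrees keep their minus sign. */
static int format_coord(int32_t fixed, char *buf, size_t size)
{
    long mag = labs((long)fixed);
    return snprintf(buf, size, "%s%ld.%05ld",
                    fixed < 0 ? "-" : "",
                    mag / COORD_SCALE, mag % COORD_SCALE);
}
```

For example, coord_to_fixed(42.915512) gives 4291551, and format_coord turns that back into "42.91551". Because the mapping is monotonic, comparing the integers orders the coordinates the same way as comparing the original values.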
Floats would be way more than sufficient for storing GPS coordinates, even if consumer-level GPS devices had anywhere near the accuracy claimed for them. If you don't believe this is true, try these two simple experiments:
I've been writing applications for GPS-enabled PDAs for years, and I've verified this for dubious customers over and over again (I've even won bets this way). There are higher-quality GPS devices out there that do achieve better accuracy than this, but the better accuracy is achieved with more expensive chipsets, and the devices are left in one spot for days or even weeks, with the readings averaged over time.
A four-byte float is far more accurate than the devices themselves. It would of course not hurt you at all to use a double instead, as long as the 2X factor isn't a problem for you.
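To put a number on the float claim, the following sketch measures the gap between adjacent float values at a given coordinate and converts it to metres. It assumes float is an IEEE 754 binary32 (true on mainstream platforms, but not guaranteed by the C standard) and uses a round 40,075 km equatorial circumference; float_spacing and float_resolution_m are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Approximate metres per degree along the equator (assumed circumference). */
#define METERS_PER_DEGREE (40075000.0 / 360.0)

/* Gap between a positive finite float and the next representable float
   above it, found by incrementing the raw bit pattern. */
static float float_spacing(float x)
{
    uint32_t bits;
    float next;
    assert(x > 0.0f);
    memcpy(&bits, &x, sizeof bits);
    bits += 1;
    memcpy(&next, &bits, sizeof next);
    return next - x;
}

/* Worst-case step size, in metres at the equator, of a float holding
   the given number of degrees. */
static double float_resolution_m(float degrees)
{
    return (double)float_spacing(degrees) * METERS_PER_DEGREE;
}
```

Between 128 and 256 degrees the spacing is 2^-16 degree, so float_resolution_m(179.0f) comes out at roughly 1.7 m; closer to zero the steps get smaller.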
Assuming the earth is a perfect sphere (it is not, but close enough) with a radius ‘R’ of 3959 miles (or ×5280 ft/mi = 20903520 ft), the circumference is 131340690 feet (using 2×PI×R).
360 degrees of longitude covers 131340690 feet. 180 degrees of latitude covers 65670345 feet.
If you want to store lat/lng down to an accuracy of 3 feet, you need to be able to store 43780230 (131340690/3) longitude values and 21890115 (65670345/3) latitude values. 43780230 requires 25.38 bits (log(43780230)/log(2)) to store and 21890115 requires 24.38 bits (log(21890115)/log(2)) to store – or just under 50 bits (or 6.25 bytes).
So the obvious question becomes, if you want to store latitude and longitude in just 6 bytes, what will the accuracy be? Well, 6 bytes is 48 bits. That means 23.5 bits for latitude and 24.5 bits for longitude (longitude has twice as many values, which is just one bit and 24.5-23.5=1 bit). So 23.5 bits allows you to represent a number from 0 to 11863282 (11863283 values). And 65670345 feet divided by 11863283 values is 5.53 feet (and the same accuracy value for longitude).
THE BOTTOM LINE: So, if you can live with 5.5 feet of accuracy for both latitude and longitude, you can pack both values into just six bytes.
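One way to realise the fractional 23.5/24.5-bit split is mixed-radix packing: quantise latitude to 11863283 steps and longitude to twice as many, then store lat_index * N_LON + lon_index in 48 bits. The sketch below does that in C. It is an illustration written for this page, not the answer author's code; pack48, unpack48 and the big-endian byte order are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define N_LAT 11863283ULL      /* floor(2^23.5) latitude steps */
#define N_LON (2ULL * N_LAT)   /* twice as many longitude steps */
/* N_LAT * N_LON = 281474967076178, which is below 2^48 = 281474976710656. */

/* Pack latitude [-90, 90] and longitude [-180, 180] into 6 bytes. */
static void pack48(double lat, double lon, uint8_t out[6])
{
    uint64_t ilat, ilon, v;
    assert(lat >= -90.0 && lat <= 90.0);
    assert(lon >= -180.0 && lon <= 180.0);
    ilat = (uint64_t)((lat + 90.0) / 180.0 * (double)(N_LAT - 1) + 0.5);
    ilon = (uint64_t)((lon + 180.0) / 360.0 * (double)(N_LON - 1) + 0.5);
    v = ilat * N_LON + ilon;                  /* mixed-radix combination */
    for (int i = 0; i < 6; i++)
        out[i] = (uint8_t)(v >> (8 * (5 - i)));   /* big-endian bytes */
}

/* Recover the quantised latitude and longitude from 6 bytes. */
static void unpack48(const uint8_t in[6], double *lat, double *lon)
{
    uint64_t v = 0;
    for (int i = 0; i < 6; i++)
        v = (v << 8) | in[i];
    *lat = (double)(v / N_LON) / (double)(N_LAT - 1) * 180.0 - 90.0;
    *lon = (double)(v % N_LON) / (double)(N_LON - 1) * 360.0 - 180.0;
}
```

Each step is about 0.0000152 degree on either axis, so a round trip is off by at most half of that, which matches the roughly 5.5 ft step size worked out above.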
*A SIDE NOTE: Regarding comments that latitude and longitude are horrible for storing the positional information around a sphere (because there is less information to store at the poles) – well, those comments don’t hold up to the math! Let’s figure it out. Let’s say we want to design a new perfect system that can record and place a stake in the ground in the center of every square foot of earth. The surface area of earth (with an R of 3959 miles; formula for surface area of a sphere) is 5490965469267303 SQ FT – that many stakes requires 52.29 bits to represent. Now the existing latitude and longitude system uses a rectangular system. The width of the rectangle is the circumference of the earth and the height of the rectangle is 1/2 the circumference – which is 131340690 * 65670345 (see far above), or 8625188424838050 SQ FT – which requires 52.94 bits to represent (this system places ‘too many’ stakes in the ground around the poles). So, the shocking answer is that both the new perfect system, and the old lat/lng system, would BOTH require 53 actual bits to store a single location on earth, down to 1 foot accuracy!
In Garmin's IMG map format they store coordinates inside bounding boxes, using floats to set the edges of the boxes. Coordinates within the boxes are defined using a variable number of bits that are just linear between the min and max values, depending on the precision needed.
For example: minlat=49.0, maxlat=50.0, minlon=122.0, maxlon=123.0, number of bits=16
So a value of:
32768,32768 would be converted to 49.5, 122.5
16384,0 would be 49.25, 122.0
If you need less precision, the same output could be generated with a number of bits=4
8,8 would be converted to 49.5, 122.5
4,0 would be 49.25, 122.0
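The linear mapping in these examples can be written as a one-line C function. This only reproduces the arithmetic shown above (value / 2^bits of the way from min to max); it is not taken from the IMG format itself, and decode_coord is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Map an n-bit value linearly into the interval starting at lo and
   ending just below hi: lo + (hi - lo) * value / 2^bits. */
static double decode_coord(uint32_t value, unsigned bits, double lo, double hi)
{
    assert(bits >= 1 && bits <= 32);
    return lo + (hi - lo) * (double)value / (double)(1ULL << bits);
}
```

With the numbers from the example, decode_coord(32768, 16, 49.0, 50.0) and decode_coord(8, 4, 49.0, 50.0) both give 49.5.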
Personally I would use a 32 bit decimal fixed point representation, dividing by 1,000,000 as per Evan's answer and my comments.
However, if space is truly at a premium, here are some additional ideas:
You could use a 26 bit fixed point representation on the wire. This will require marshalling and unmarshalling the latitude and longitude into a large array of bytes, but will save you 12 bits for each location over the 32 bit value representation - almost a 19% saving, so it might well be worthwhile.
You could take advantage of the fact that longitude values need less precision as you get closer to the poles - they only need 26 bits worth at the equator. So you could write a scheme where the number of bits used to encode the longitude depends on the value of the latitude.
If your data has other compressible attributes - say, all the points are usually quite close together - you could take specific advantage of those, like using a delta coding scheme (where each point other than the first can be encoded as a delta from the last point).
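The 26-bit wire format from the first idea needs a small bit-packer, since values no longer line up with byte boundaries. Below is a minimal C sketch under stated assumptions: each coordinate is quantised linearly over its full range to an unsigned 26-bit value, bits are written most significant first, and the names quantize26, dequantize26, put_bits and get_bits are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define Q26_MAX ((1u << 26) - 1u)   /* largest 26-bit value */

/* Quantise value in [lo, hi] to an unsigned 26-bit integer. */
static uint32_t quantize26(double value, double lo, double hi)
{
    assert(value >= lo && value <= hi);
    return (uint32_t)((value - lo) / (hi - lo) * Q26_MAX + 0.5);
}

/* Inverse of quantize26, up to half a quantisation step. */
static double dequantize26(uint32_t q, double lo, double hi)
{
    return lo + (hi - lo) * (double)q / Q26_MAX;
}

/* Append the low nbits of value to buf at *bitpos, MSB first. */
static void put_bits(uint8_t *buf, size_t *bitpos, uint32_t value, unsigned nbits)
{
    for (unsigned i = nbits; i-- > 0; ) {
        size_t byte = *bitpos / 8;
        unsigned shift = 7u - (unsigned)(*bitpos % 8);
        if ((value >> i) & 1u)
            buf[byte] |= (uint8_t)(1u << shift);
        else
            buf[byte] &= (uint8_t)~(1u << shift);
        (*bitpos)++;
    }
}

/* Read nbits from buf at *bitpos, MSB first. */
static uint32_t get_bits(const uint8_t *buf, size_t *bitpos, unsigned nbits)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < nbits; i++) {
        size_t byte = *bitpos / 8;
        unsigned shift = 7u - (unsigned)(*bitpos % 8);
        v = (v << 1) | ((uint32_t)(buf[byte] >> shift) & 1u);
        (*bitpos)++;
    }
    return v;
}
```

Two positions (four coordinates) then occupy 104 bits, i.e. 13 bytes instead of 16. The saving only shows up across an array, because a single 52-bit pair still has to be padded to 7 bytes.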
23 bits of precision at 179 degrees of longitude gives under 10-meter accuracy, which is the best that ordinary GPS devices give. At the equator:
So an IEEE 754 single-precision floating-point number, known to your C compiler as float, will be just adequate for representation. Beware of using floats for extended computations! Rounding error may eat your lunch. Consult a numerical analyst.
Because I needed it, here's the Python code for Jerry Jongerius's answer, which represents lat/lon values in 6 bytes with an accuracy of around 1.7 m near the equator, using 23.5 and 24.5 bits:
Multiplying latitude and longitude values by 10,000,000 (10^7) is a good approach to achieve a higher precision within the int32_t range. With this scaling factor, you can represent latitude and longitude coordinates with 7 decimal places of precision. Here's how you can do it:
For latitude: the range from -90.0000000 to +90.0000000 degrees maps to integers from -900,000,000 to +900,000,000, within the int32_t range.
For longitude: the range from -180.0000000 to +180.0000000 degrees maps to integers from -1,800,000,000 to +1,800,000,000, which also fits within the int32_t range.
Should be correct, right?
I agree with Sam Mason. Just use however many bits/bytes you need to get your required angular precision. Latitude and longitude are angles. Take your total number of bits and interpret them as a signed (or unsigned) fraction of 2*pi (or 360 degrees). It automagically takes care of going around the earth as well.
If you use 32 bits (let's assume signed), the least significant bit represents 1.4629 nanoradians or about 83.8 nanodegrees (360 / 2^32), which will give you a worst-case accuracy in the region of 10 mm at the equator. 180 degrees will be the same as -180 degrees (0x80000000) and 360 degrees will be the same as 0 degrees (0x00000000), without any checks or conversions when adding or subtracting.
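A minimal C sketch of this binary-angle idea, assuming a 32-bit value where the full range of uint32_t covers 360 degrees. The function names deg_to_bam32 and bam32_to_deg are made up here. Unsigned arithmetic wraps modulo 2^32 in C, which is what provides the wrap-around for free.

```c
#include <assert.h>
#include <stdint.h>

#define BAM_PER_DEGREE (4294967296.0 / 360.0)   /* 2^32 units per full turn */

/* Degrees -> 32-bit binary angle. Accepts any angle in roughly
   [-360, 360]; the conversion to uint32_t wraps modulo 2^32. */
static uint32_t deg_to_bam32(double degrees)
{
    double scaled = degrees * BAM_PER_DEGREE;
    int64_t units;
    assert(degrees >= -360.0 && degrees <= 360.0);
    units = (int64_t)(scaled + (scaled < 0 ? -0.5 : 0.5));
    return (uint32_t)units;
}

/* 32-bit binary angle -> degrees in [-180, 180), reading the value
   as a signed quantity without relying on signed conversion rules. */
static double bam32_to_deg(uint32_t bam)
{
    double units = (bam >= 0x80000000u) ? (double)bam - 4294967296.0
                                        : (double)bam;
    return units / BAM_PER_DEGREE;
}
```

Adding two such angles is a plain unsigned addition: deg_to_bam32(170.0) + deg_to_bam32(20.0) decodes to about -170 degrees with no range check.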
I am surprised that no one has posted the fact that long/lat is a terrible way to store data on a sphere (someone did mention that longitude requires less precision near the poles).
Basically you can store the data position as X and Y coordinates in meters. Imagine a cube around the earth that exactly fits (haha, OK, almost fits it). You only need to store the X and Y position, not all 3 coordinates, because the third coordinate can come from the radius of the earth, r = square root[x^2 + y^2 + z^2].
So convert your lat/long to x/y in meters. You will only need a total of 12756200 m per coordinate (that's the diameter of the earth). So your total value will only have to span 0 to 25,512,400 (someone else claimed 40,000,000 because they were using long/lat) to be accurate to +/- 0.5 m.
That will result in just 25 bits per position. If I were you I would just do accuracy to within 2 m and use 24 bits per position, as that is a tidy 3 bytes.
Also, if you are storing waypoint information on a path, you can store each waypoint as an offset from the last waypoint. For example, start with a 24-bit x/y coordinate, and then have a 16-bit 'update' that adjusts the position by adding/subtracting x/y meters. 16 bits would allow a waypoint update to be over 400 m away. So if you know the device is not meant for planes and updates every so often, this might be acceptable too.
You can pack both the latitude and longitude values in a single 32-bit integer with a resolution of at worst ~2.4 meters/pixel (at the equator) if you use a recursive tiling system. Using two bits per level, you can store 16 levels in 32 bits. You can get an idea of how that would work looking at this article about Virtual Earth's tiling system. This uses Mercator, so it would give you problems for the poles. You could instead use a different projection and still get very similar results.
This can also be used for a rough filter to find any points within a given parent tile since the first N bits will be the same (and so search becomes bit masking).
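A rough C sketch of the recursive-tiling key and the bit-mask filter follows. To stay self-contained it uses a plain linear lat/lon grid rather than Mercator (the "different projection" option), so it is not the Virtual Earth scheme itself; tile_key, interleave16 and in_parent_tile are hypothetical names. With 16 levels each axis gets 16 bits, so a cell is about 611 m wide at the equator (40,075 km / 65,536). As far as I can tell, the ~2.4 m figure corresponds to the per-pixel resolution inside a 256-pixel tile at level 16, which needs more bits than the tile index alone.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a grid position to a valid 16-bit cell index. */
static uint16_t clamp_cell(double f)
{
    if (f < 0.0) return 0;
    if (f >= 65535.0) return 65535;
    return (uint16_t)f;
}

/* Interleave x and y bits, top level first: each level contributes
   two bits, (y bit << 1) | x bit, as in a quadkey digit. */
static uint32_t interleave16(uint16_t x, uint16_t y)
{
    uint32_t key = 0;
    for (int level = 15; level >= 0; level--) {
        uint32_t xb = ((uint32_t)x >> level) & 1u;
        uint32_t yb = ((uint32_t)y >> level) & 1u;
        key = (key << 2) | (yb << 1) | xb;
    }
    return key;
}

/* 32-bit tile key for a position, on a linear (non-Mercator) grid. */
static uint32_t tile_key(double lat, double lon)
{
    uint16_t x = clamp_cell((lon + 180.0) / 360.0 * 65536.0);
    uint16_t y = clamp_cell((90.0 - lat) / 180.0 * 65536.0);
    return interleave16(x, y);
}

/* True if key lies inside the tile that parent_key identifies at the
   given level (1..16): only the top 2 * levels bits are compared. */
static int in_parent_tile(uint32_t key, uint32_t parent_key, unsigned levels)
{
    uint32_t mask;
    assert(levels >= 1 && levels <= 16);
    mask = 0xFFFFFFFFu << (32 - 2 * levels);
    return (key & mask) == (parent_key & mask);
}
```

Two nearby points such as (49.5, -122.5) and (49.6, -122.4) share their first few levels, so in_parent_tile matches them at level 4, while a point on another continent already differs at level 1.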