R-Tree implementation in Java
I have spent the last few days searching for a stable R-Tree implementation that supports an arbitrary number of dimensions (20 or so would be enough). I only found http://sourceforge.net/projects/jsi/, but it supports only 2 dimensions.
Another option would be a multidimensional implementation of an interval tree.
Maybe I'm completely wrong with the idea of using an R-Tree or interval tree for my problem, so I'll state the problem briefly so that you can send me your thoughts on it.
The problem I need to solve is a kind of nearest-neighbour search. I have a set of antennas and rooms, and for each antenna an interval of integers, e.g. antenna 1, min -92, max -85. In effect the data can be represented as room -> set of antennas -> interval for each antenna.
The idea was that each room spans a box in the R-Tree, with one dimension per antenna and the antenna's interval as the box's extent in that dimension.
If I get a query with N antennas and a value for each antenna, I could then represent that information as a query point in this space and retrieve the rooms "nearest" to the point.
I hope this gives you an idea of the problem and my approach.
Comments (5)
Be aware that R-Trees can degrade badly when you have discrete data. The first thing you really need to work out is an appropriate data representation; then test whether your queries work on a subset of the data.
R-Trees will only make your queries faster. If the queries don't work in the first place, an R-Tree will not help. You should test your approach without R-Trees first. Unless you hit a large amount of data (say, 100,000 objects), an in-memory linear scan can easily outperform an R-Tree, in particular when you need an adapter layer because the R-Tree is not well integrated with your code.
The obvious approach here is to just use bounding rectangles and scan over them linearly. If that works, you can then store the MBRs (minimum bounding rectangles) in an R-Tree to get some performance improvement. But if it doesn't work with a linear scan, it won't work with an R-Tree either (the index only makes a working query faster).
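As a rough illustration of the linear-scan baseline, here is a minimal sketch. It assumes each room is stored as a box with one [min, max] interval per antenna, and that "nearest" means the smallest Euclidean distance from the query point to the box (zero when the point lies inside). The names `RoomScan`, `Room`, `squaredDistance` and `nearest` are hypothetical, and the distance measure is an assumption; the question does not say which one is wanted.

```java
import java.util.List;

// Hypothetical names throughout; nothing here comes from the original question.
final class RoomScan {

    /** A room as an axis-aligned box: one [min, max] signal interval per antenna. */
    static final class Room {
        final String name;
        final int[] min;
        final int[] max;

        Room(String name, int[] min, int[] max) {
            this.name = name;
            this.min = min;
            this.max = max;
        }
    }

    /** Squared Euclidean distance from the query point to the box; 0 if the point is inside. */
    static long squaredDistance(Room room, int[] query) {
        long sum = 0;
        for (int d = 0; d < query.length; d++) {
            long gap = 0;
            if (query[d] < room.min[d]) {
                gap = room.min[d] - query[d];   // point lies below the interval
            } else if (query[d] > room.max[d]) {
                gap = query[d] - room.max[d];   // point lies above the interval
            }
            sum += gap * gap;
        }
        return sum;
    }

    /** Linear scan over all rooms; returns the closest one, or null for an empty list. */
    static Room nearest(List<Room> rooms, int[] query) {
        Room best = null;
        long bestDist = Long.MAX_VALUE;
        for (Room r : rooms) {
            long dist = squaredDistance(r, query);
            if (dist < bestDist) {
                bestDist = dist;
                best = r;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two made-up rooms over two antennas, purely for demonstration.
        List<Room> rooms = List.of(
            new Room("A", new int[]{-92, -70}, new int[]{-85, -60}),
            new Room("B", new int[]{-60, -95}, new int[]{-50, -88}));
        int[] query = {-88, -72};
        System.out.println(nearest(rooms, query).name);
    }
}
```

If this scan returns sensible rooms on a sample of real data, the same boxes could later be put into an index; if it does not, the representation or the distance measure is the thing to revisit.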
I'm not entirely clear on what your exact problem is, but an R-Tree or interval tree would not work well in 20 dimensions. That's not a huge number of dimensions, but it is large enough for the curse of dimensionality to begin showing up.
To see what I mean, consider just trying to look at all of the neighbors of a box, including the ones that touch it only at corners and edges. With 20 dimensions, you'll have 3^20 - 1 = 3,486,784,400 neighboring boxes. (You get that by realizing that along each axis a neighbor's offset can be -1, 0, or +1 units, but the all-zero offset (0, 0, ..., 0) is not a neighbor because it represents the original box.)
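The neighbor count can be checked with a few lines of Java. This is only a sketch of the arithmetic, assuming a regular grid of boxes; the class and method names are hypothetical.

```java
// Hypothetical helper: counts the boxes adjacent to one box in a regular d-dimensional grid.
final class NeighborCount {

    /** Each axis offset is -1, 0 or +1, giving 3^d combinations; minus 1 for the box itself. */
    static long neighbors(int dimensions) {
        long offsets = 1;
        for (int i = 0; i < dimensions; i++) {
            // multiplyExact throws ArithmeticException instead of silently overflowing
            offsets = Math.multiplyExact(offsets, 3L);
        }
        return offsets - 1;
    }

    public static void main(String[] args) {
        System.out.println(neighbors(2));   // 8 neighbors around a square in 2D
        System.out.println(neighbors(20));  // 3486784400
    }
}
```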
I'm sorry, but you either need to accept brute force searching, or else analyze your problem better and come up with a cleverer solution.
I have found this R*-Tree implementation in Java which seems to offer many features:
https://github.com/davidmoten/rtree
You might want to check it out!
Another good implementation in Java is ELKI: https://elki-project.github.io/.
You can use PostgreSQL’s Generalized Search Tree indexing facility.
GiST
Quick demo