Is spatial search possible in a P2P network?
I want to build a JavaScript/HTML5 geolocation-based social network, and I am wondering which architecture would be the best choice. Client-server is simple to develop, but the drawback is that system resource usage could be very high, especially because the application must handle movement (worst case: a user in a car must see the other users in cars around him).
Basically, in a client-server architecture, the server's tasks would be:
- collect and store the latitude and longitude of the users (there could be thousands of them)
- run a geo-distance search for that user (to get the list of users within a given radius around him)
- build and send the client an XML file with the positions of the users in the list
These 3 operations must be done periodically, every 3 to 5 seconds, because I want a "live" map that shows the users in the list moving around in their environment (city, town).
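To make the second server task concrete, here is a minimal sketch of a brute-force radius search. It assumes positions are kept in a plain in-memory dictionary and uses the haversine great-circle formula; the names `haversine_m`, `users_within`, and the sample users are purely illustrative and not part of any existing system.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def users_within(positions, user_id, radius_m):
    """Return the sorted ids of the other users within radius_m of user_id."""
    lat, lon = positions[user_id]
    return sorted(
        other
        for other, (olat, olon) in positions.items()
        if other != user_id and haversine_m(lat, lon, olat, olon) <= radius_m
    )


# Hypothetical sample data: two users in central Paris, one in Lyon.
positions = {
    "alice": (48.8566, 2.3522),
    "bob": (48.8570, 2.3530),    # under 100 m from alice
    "carol": (45.7640, 4.8357),  # hundreds of kilometers away
}
nearby = users_within(positions, "alice", 500)
```

This scans every user on every request, so the cost grows linearly with the number of users, which is exactly the load concern raised here.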
All 3 of these points could be optimized:
- the client sends his position only after moving 10 meters, to reduce the amount of data to process
- a "spherical rectangle" search in a MyISAM table with a spatial index (using MBRContains), to offload the MySQL database
- a common output file: the XML that is sent can be the same if 2 users are located within a radius of x meters (the 2 users are close to each other)
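The first two optimizations could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a flat-earth (equirectangular) approximation that is reasonable for short distances but breaks down near the poles and across the 180° meridian, and the helper names `moved_enough` and `bounding_box` are made up for this example.

```python
import math

METERS_PER_DEG_LAT = 111_320  # approximate length of one degree of latitude


def moved_enough(prev, curr, threshold_m=10.0):
    """True if curr is at least threshold_m away from prev (short-range approximation)."""
    (lat1, lon1), (lat2, lon2) = prev, curr
    mean_lat = math.radians((lat1 + lat2) / 2)
    dy = (lat2 - lat1) * METERS_PER_DEG_LAT
    dx = (lon2 - lon1) * METERS_PER_DEG_LAT * math.cos(mean_lat)
    return math.hypot(dx, dy) >= threshold_m


def bounding_box(lat, lon, radius_m):
    """Return (min_lat, min_lon, max_lat, max_lon) of a rectangle enclosing the circle."""
    dlat = radius_m / METERS_PER_DEG_LAT
    dlon = radius_m / (METERS_PER_DEG_LAT * math.cos(math.radians(lat)))
    return (lat - dlat, lon - dlon, lat + dlat, lon + dlon)


# A client would only report its position when moved_enough(...) is true.
should_send = moved_enough((48.0, 2.0), (48.0002, 2.0))   # about 22 m north
box = bounding_box(48.8566, 2.3522, 500)                  # 500 m search area
```

The corners returned by `bounding_box` are the kind of rectangle a spatial-index query such as the MBRContains one mentioned above would take; an exact distance check on the few rows that come back would then trim the corners of the rectangle.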
It is hard to estimate the load at this stage, but I think a client-server architecture is not appropriate for this type of application, and peer-to-peer could be a nice answer if 2 clients could communicate when they are near each other.
My question is:
Is there any method that lets a client blindly search for other clients located within a certain radius, without the help of a central server? (It is possible with UDP broadcast :-)
Edit: Correction. UDP broadcast allows a client to poll a machine wherever it is, in a certain range or at a certain IP address.
Thank you for your help,
Florent
Comments (2)
You will need central peers/servers, because you have to centralize some information to be able to provide your functionality.
I would go for the following:
- Assign square miles (or whatever size you want) to specific servers.
- Have devices send an 'I am here' message with their coordinates to a dispatcher, which forwards it to the correct square-mile server for handling.
- Have servers register a device when it enters a square mile they manage. This could be a central map, to make sure a device is registered to one and only one square.
- Forward the message to all other devices in the square.
- And/or include in the message which square it is intended for, and make sure each device checks this before displaying it to the user.
Tune the size of the square and the rate of 'I am here' messages. That's it.
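As a rough, single-process sketch of that scheme (the cell size, the `Dispatcher` class, and the `i_am_here` method are all hypothetical names chosen for illustration, and a real deployment would spread the squares over several servers):

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical square size in degrees (about 1.1 km of latitude)


def cell_of(lat, lon, cell_deg=CELL_DEG):
    """Map a coordinate to the integer (row, col) index of its grid square."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


class Dispatcher:
    """Keeps each device registered in exactly one square and fans out updates."""

    def __init__(self):
        self.device_cell = {}                 # device id -> current square
        self.cell_devices = defaultdict(set)  # square -> ids of devices inside it

    def i_am_here(self, device, lat, lon):
        """Record the device's square and return (square, devices to notify)."""
        new_cell = cell_of(lat, lon)
        old_cell = self.device_cell.get(device)
        if old_cell != new_cell:
            if old_cell is not None:
                self.cell_devices[old_cell].discard(device)  # leave the old square
            self.cell_devices[new_cell].add(device)
            self.device_cell[device] = new_cell
        recipients = sorted(self.cell_devices[new_cell] - {device})
        return new_cell, recipients


dispatcher = Dispatcher()
first = dispatcher.i_am_here("a", 48.8566, 2.3522)
second = dispatcher.i_am_here("b", 48.8570, 2.3530)   # same square as "a"
third = dispatcher.i_am_here("a", 48.8666, 2.3522)    # "a" moves one square north
```

One limitation of this sketch: two devices on opposite sides of a square's edge never see each other, so in practice the message would probably also need to go to the neighboring squares.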
The answer actually depends on many things, so I'll help out with a basic strategy. To sort things out, you'll need to understand how Kademlia works (Kademlia is a DHT P2P network that stores information).
In Kademlia, at first startup each node picks a random ID, a 160-bit number that represents a point in the space of all possible 160-bit IDs.
The ID of the information that needs to be stored is obtained with the SHA-1 function (it takes an arbitrary string and outputs a 160-bit number that is treated as the ID of that information).
Once you have the ID of the information, you publish it; the information is physically stored on a node whose ID is close to the information's ID.
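A small sketch of these ID mechanics, assuming closeness is measured by XOR distance as Kademlia does. The function names are illustrative, and picking the closest node from a local list stands in for what a real network does through iterative lookups:

```python
import hashlib
import secrets

ID_BITS = 160  # size of the Kademlia ID space


def key_id(key: str) -> int:
    """160-bit ID of a piece of information: its SHA-1 digest read as an integer."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def random_node_id() -> int:
    """Random 160-bit ID, as a node would pick at first startup."""
    return secrets.randbits(ID_BITS)


def closest_node(node_ids, target_id):
    """Node whose ID has the smallest XOR distance to target_id."""
    return min(node_ids, key=lambda node_id: node_id ^ target_id)


nodes = [random_node_id() for _ in range(8)]
info_id = key_id("position of user florent")  # hypothetical key
storing_node = closest_node(nodes, info_id)   # where the value would be stored
```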
The information is queried via its ID. Both information lookups and node lookups take O(log(N)) hops to obtain the required information. Kademlia uses the "XOR" metric (in your case it could be the ordinary Euclidean metric).
Each node maintains an array of buckets; each bucket contains the addresses of the nodes that are appropriate for that bucket. Appropriateness is a measure of how close the IDs are. Consider an example:
After applying the XOR metric to Nodes 1 and 2 (i.e., computing the number that represents the virtual distance between these nodes), we get:
After applying the XOR metric to Nodes 1 and 3, we get:
Node 1 is evidently closer to Node 3, since their IDs differ only in less significant bits than the IDs of Node 1 and Node 2 do. Therefore, from Node 1's point of view, its neighbor Node 3 goes into the 13th bucket (a higher index means closer IDs), and Node 2 goes into the 5th bucket, which contains the group of nodes whose IDs start to differ from the current node's ID around the 5th most significant bit.
Such a data structure lets each node know its surroundings at 160 different levels of distance.
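Here is a minimal sketch of how a bucket index could be derived from the XOR distance. It assumes the convention used in this answer, where the index is the number of leading bits two IDs share, so a higher index means closer IDs (many Kademlia descriptions number buckets the other way around). The 16-bit IDs are made up for readability; they are not the IDs from the original example.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia's virtual distance between two IDs."""
    return a ^ b


def bucket_index(own_id: int, other_id: int, id_bits: int = 160) -> int:
    """Number of leading bits the two IDs share (higher means closer)."""
    distance = xor_distance(own_id, other_id)
    if distance == 0:
        raise ValueError("a node does not keep itself in a bucket")
    return id_bits - distance.bit_length()


# Hypothetical 16-bit IDs, chosen so the buckets come out as 5 and 13.
node1 = 0b1011001110001111
node2 = 0b1011010000000000  # first differs from node1 at bit 5 (counting from the MSB, from 0)
node3 = 0b1011001110001011  # first differs from node1 at bit 13

bucket_for_node2 = bucket_index(node1, node2, id_bits=16)
bucket_for_node3 = bucket_index(node1, node3, id_bits=16)
```

With 160-bit IDs the same function yields indices from 0 to 159, which is where the 160 levels of distance come from.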
Back to your example: to allow efficient geospatial queries, you'll need to replace Kademlia's XOR metric with the ordinary Euclidean metric. In this case your IDs will be 2D or 3D vectors. Unfortunately, the Euclidean metric produces floating-point numbers, which are not directly suitable for this type of algorithm, so you will need to convert them to discrete binary numbers somehow, in a way similar to what the XOR function does. After that, finding a node's neighboring nodes is a trivial task.
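One possible way to do that conversion, offered only as a sketch and not necessarily what is meant above, is a Z-order (Morton) code: quantize latitude and longitude to integers and interleave their bits, which is the same idea geohash is built on. IDs that share a long prefix then fall in the same small cell. The function names and the 16 bits per axis are assumptions made for this example.

```python
def quantize(value: float, lo: float, hi: float, bits: int) -> int:
    """Map value in [lo, hi] to an integer in [0, 2**bits - 1]."""
    scaled = int((value - lo) / (hi - lo) * (1 << bits))
    return min(scaled, (1 << bits) - 1)  # clamp the upper edge of the range


def morton_id(lat: float, lon: float, bits: int = 16) -> int:
    """Interleave quantized longitude and latitude bits, most significant first."""
    y = quantize(lat, -90.0, 90.0, bits)
    x = quantize(lon, -180.0, 180.0, bits)
    code = 0
    for i in range(bits - 1, -1, -1):
        code = (code << 1) | ((x >> i) & 1)
        code = (code << 1) | ((y >> i) & 1)
    return code


def shared_prefix_bits(a: int, b: int, total_bits: int = 32) -> int:
    """Number of leading bits two IDs have in common."""
    return total_bits - (a ^ b).bit_length()


paris_a = morton_id(48.8566, 2.3522)
paris_b = morton_id(48.8570, 2.3530)
lyon = morton_id(45.7640, 4.8357)
paris_is_closer = shared_prefix_bits(paris_a, paris_b) > shared_prefix_bits(paris_a, lyon)
```

A known weakness of this encoding is that two points on either side of a cell boundary can be physically close yet share only a short prefix, so a neighbor search usually has to look at adjacent cells as well.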
Hope this helps. Oh, by the way, take a look at HyperDex, a new searchable distributed data store closely tied to the Euclidean metric; it might help...