How does a DHT work?

Posted 2024-12-08 21:53:31

I grabbed the basic idea about DHT from wiki:

Store Data:

In a DHT network, every node is responsible for a specific range of the key space. To store a file in the DHT, first hash the file's name to get the file's key; second, send a message put(key, file-content) to any node of the DHT. The message is relayed to the node responsible for key, and that node stores the pair (key, file-content).

Get Data:

When getting a file from the DHT, first hash the file's name to get the key; second, send a message get(key) to any node, and the message is relayed until...
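The put/get flow described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: all "nodes" live in one process, the key space is split into equal contiguous ranges, and the names `ToyDHT`, `key_for`, and `responsible_node` are made up for illustration. A real DHT relays the message hop by hop over the network instead of computing the owner directly.

```python
import hashlib
from bisect import bisect_right
from typing import Dict, List, Optional

KEY_BITS = 160  # SHA-1 digest size; chosen for this sketch only


def key_for(name: str) -> int:
    """Hash a file name into the key space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class ToyDHT:
    """In-process stand-in for a DHT: one dict per simulated node."""

    def __init__(self, node_count: int) -> None:
        step = 2 ** KEY_BITS // node_count
        # Node i is responsible for keys in [i * step, (i + 1) * step).
        self.upper_bounds: List[int] = [(i + 1) * step for i in range(node_count)]
        # The last node also takes any remainder of the key space.
        self.upper_bounds[-1] = 2 ** KEY_BITS
        self.stores: List[Dict[int, bytes]] = [dict() for _ in range(node_count)]

    def responsible_node(self, key: int) -> int:
        """Index of the node whose range contains the key."""
        return bisect_right(self.upper_bounds, key)

    def put(self, name: str, content: bytes) -> int:
        """Store content on the responsible node and return that node's index."""
        key = key_for(name)
        node = self.responsible_node(key)
        self.stores[node][key] = content
        return node

    def get(self, name: str) -> Optional[bytes]:
        """Fetch content from the responsible node, or None if absent."""
        key = key_for(name)
        return self.stores[self.responsible_node(key)].get(key)
```

For example, `ToyDHT(8).put("notes.txt", b"hello")` returns the index of the simulated node that ends up holding the pair, and a later `get("notes.txt")` on the same object consults only that node.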

Questions:

  1. To store a file, we can hash the file's name to get its key, but wiki says:

In the real world the key k could be a hash of a file's content rather
than a hash of a file's name to provide content-addressable storage,
so that renaming of the file does not prevent users from finding it.

Hash the file's content? How am I supposed to know the file's content? If I already know the file's content, then why would I search for it in the DHT?

  2. According to the wiki, every participating node sets aside some space to store files. Does that mean that if I participate in a DHT, I have to set aside, say, 10 GB of disk space to store the files whose keys fall into the key space I'm responsible for?

  3. If I do have to set aside disk space for those files, how should I store the (key, file-content) pairs on disk? I mean, should the files be arranged in a B-tree or something similar on my disk?

  4. When a query happens, how does my computer respond? I assume it first checks the queried key and, if the key is in my key space, finds the corresponding file on my disk. Right?


Comments (2)

别挽留 2024-12-15 21:53:31

A DHT is just an algorithm. At its base it provides distributed key-value PUT and GET operations, similar to a normal Map or associative array found in many programming languages.

Due to real-world limitations such as untrustworthy nodes, failure rates, and so on, actual DHT implementations don't provide an arbitrary-length PUT(<uint8[]>, <uint8[]>) operation.

Example:

The Kademlia implementation for BitTorrent, for example, provides the following interface:

  • PUT(uint8[20], uint16)
  • GET(uint8[20]) -> List<Pair<IP, uint16>> where the list only represents a randomly sampled subset of the actual data

As you can see, it is actually a specialized, asymmetric interface compared to more generic associative arrays.
The IP address is always derived from the PUT sender's source address, i.e. it cannot be set explicitly.
And GET returns a list instead of a single value, so it implements a MultiMap or Map<List>, if you want to see it like that.
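To make the asymmetry concrete, here is a rough Python sketch of what a single node's store could look like behind that interface. It is not the actual BitTorrent code: `PeerStore`, `max_results`, and the method names are hypothetical, and the `sender_ip` argument stands in for the source address that the transport layer would observe on the incoming packet.

```python
import random
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Peer = Tuple[str, int]  # (ip, port)


class PeerStore:
    """MultiMap from a 20-byte key to the set of peers that announced it."""

    def __init__(self, max_results: int = 8) -> None:
        self.max_results = max_results  # hypothetical cap on the reply size
        self._peers: Dict[bytes, Set[Peer]] = defaultdict(set)

    def put(self, sender_ip: str, key: bytes, port: int) -> None:
        """PUT(uint8[20], uint16): the IP is taken from the sender, not the payload."""
        if len(key) != 20:
            raise ValueError("key must be exactly 20 bytes")
        if not 0 <= port <= 0xFFFF:
            raise ValueError("port must fit in a uint16")
        self._peers[key].add((sender_ip, port))

    def get(self, key: bytes) -> List[Peer]:
        """GET(uint8[20]): return a random sample of the stored peers."""
        peers = sorted(self._peers.get(key, set()))
        return random.sample(peers, min(self.max_results, len(peers)))
```

Note that the caller never supplies a value of its own choosing beyond the port, and repeated calls to `get` for a popular key may return different subsets.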

In BitTorrent's case a hash is used as a content descriptor, and peers that have the content announce themselves on the DHT. If someone wants the file(s), they look up IP/port pairs on the DHT, contact the peers via a separate protocol, and then download the data.

But other uses for a DHT are also possible, e.g. they could store signed, structured data, tweet-like text snippets, or whatever. It always depends on your application's needs.
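This also hints at why a content hash is useful even though the downloader does not have the content yet: the hash is obtained out of band (for BitTorrent, from a .torrent file or magnet link) and is used both to look up peers and to check what they send. A simplified Python sketch follows. It hashes the raw bytes directly, which is an assumption made for brevity rather than how BitTorrent derives its hash, and the function names are made up.

```python
import hashlib


def content_key(data: bytes) -> bytes:
    """Key derived from the content itself (20 bytes with SHA-1)."""
    return hashlib.sha1(data).digest()


def name_key(name: str) -> bytes:
    """Key derived from the file name, for comparison."""
    return hashlib.sha1(name.encode("utf-8")).digest()


def verify(expected_key: bytes, data: bytes) -> bool:
    """Check that downloaded bytes match the key that was looked up."""
    return content_key(data) == expected_key


# The publisher computes the key once and shares it out of band.
published = content_key(b"example file body")

# Renaming the file changes the name-based key but not the content key.
renamed_differs = name_key("old.txt") != name_key("new.txt")

# A downloader who only knows `published` can still validate a peer's reply.
good_reply = verify(published, b"example file body")
bad_reply = verify(published, b"tampered file body")
```

With a name-based key, a rename would orphan the entry; with a content-based key, the same bytes always map to the same key, and a peer that sends something else is detected.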

It's just a basic building block.

蓝颜夕 2024-12-15 21:53:31

A DHT (Distributed Hash Table) is a hash table that is distributed. That being said, it is much more complex than that. The routing portion is the hardest part. There are two major variants of DHTs out there.

1.) Chord

Chord arranges node IDs on a circle (an identifier ring), so the size of the ID space bounds the number of nodes in the DHT. It suits private systems in which nodes close gracefully and pass their data to other nodes before they leave. To find a file or data, it hashes it and routes the lookup to the node whose ID most closely follows that hash on the ring (its successor); unlike Kademlia, Chord does not use XOR distance.

  • Typically uses TCP.

https://github.com/DrBrad/JChord

https://en.wikipedia.org/wiki/Chord_(peer-to-peer)

Commands:

  • JOIN - Join the network, using the ID to find a place
  • LEAVE - Tell neighbors you're leaving
  • PUT - Give file or data to the node closest to the ID of the data, and to its neighbor
  • FIND - Get the node closest to the file, which then returns the file or data
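As a rough illustration of how "closest" works on a Chord ring, the sketch below picks the successor of a key: the first node ID at or after the key, wrapping around to the smallest ID. It assumes a global sorted list of node IDs and a small 16-bit ring, neither of which a real deployment would have; real Chord nodes only know a few neighbors and use finger tables to route. The names `ring_id` and `successor` are made up for this example.

```python
import hashlib
from bisect import bisect_left
from typing import List

M = 16  # identifier bits; a tiny ring chosen for readability


def ring_id(value: str) -> int:
    """Map a string (node address or file name) onto the ring."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def successor(sorted_node_ids: List[int], key: int) -> int:
    """Return the first node ID >= key, wrapping around the ring."""
    index = bisect_left(sorted_node_ids, key)
    return sorted_node_ids[index % len(sorted_node_ids)]


nodes = sorted([10, 200, 4000])
owner_mid = successor(nodes, 11)     # next node clockwise from 11
owner_wrap = successor(nodes, 5000)  # past the last node, wraps to the first
```

When a node joins or leaves, only the keys between it and its predecessor change owner, which is why a graceful LEAVE can hand its data to a neighbor.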

2.) Kademlia

Kademlia works on an XOR-bucket-based system. Each node assigns itself an ID derived from its IP or something similar, using SHA1 or CRC32c. It then uses this ID while it is part of the network. To find a file or data, it looks up the file or data hash on other nodes, moving toward the node whose ID has the smallest XOR distance to that hash.

  • Typically uses UDP.

https://github.com/DrBrad/Kad4

https://github.com/DrBrad/Kad4/wiki

Commands:

  • PING - Check if a node is alive
  • FIND_NODE - Find the nodes closest, by XOR, to the given query ID
  • PUT - Put file or data on the 5 nodes whose IDs are closest, by XOR, to its hash
  • GET - Get file or data from the nodes whose IDs are closest, by XOR, to the hash of the file

The basic gist is that a file or piece of data should be stored on the node whose ID is closest to its hash.
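The XOR metric behind these commands is small enough to show directly. The following Python sketch assumes integer IDs and a known list of candidate nodes; `xor_distance`, `bucket_index`, and `closest_nodes` are illustrative names, and the default of 5 simply mirrors the PUT description above. A real node would discover candidates iteratively with FIND_NODE rather than sort a global list.

```python
from typing import List


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two IDs: their bitwise XOR, read as a number."""
    return a ^ b


def bucket_index(own_id: int, other_id: int) -> int:
    """Position of the highest differing bit; -1 if the IDs are equal."""
    return xor_distance(own_id, other_id).bit_length() - 1


def closest_nodes(node_ids: List[int], key: int, k: int = 5) -> List[int]:
    """The k node IDs with the smallest XOR distance to the key."""
    return sorted(node_ids, key=lambda node: xor_distance(node, key))[:k]


known = [0b0001, 0b0100, 0b0111, 0b1000]
targets = closest_nodes(known, 0b0101, k=2)
```

Here the key `0b0101` is at distance 1 from `0b0100` and distance 2 from `0b0111`, so those two nodes would be chosen to store or serve the value.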
