How do you represent closeness between siblings in a tree?

Posted on 2024-11-04 07:10:36

For example, one might want to say that "whale" is a "child" of animal, but that "whale" is more like "dolphin" than "dog". Here "whale", "dolphin", and "dog" are all children of animal, yet "whale" and "dolphin" clearly have a closer relationship.

I am NOT interested in simply defining more subclasses (for example "sea animals" and "land animals"). The example above is just for illustration; assume we can't "define" our way out of the problem.

Does one simply define a weighted, partly acyclic graph, with the knowledge that some subset of that graph is really a tree (not necessarily a spanning tree)?

EDIT: A number of people have asked for more clarification. I'll use the same example but go into more detail.

Say we have the following:

    Categories:     Animals, Place, Object.
    Subcategories:  [land animals, sea animals], [country, state], [heavy object, light object].
    Entries:        Whale, Dolphin, Dog, Cat, Hawaii, Japan, London, Stone, Rock, Leaf, Car.

I have an isLike(entry x) function that I can call on any of the entries. For example, say whale.isLike(dolphin) = 0.7 and whale.isLike(dog) = 0.2, and a table like the following stores all the values of isLike():

              whale  dolphin  dog  cat  hawaii  japan  london  stone
    whale     1      0.7      0.2  0.2  0.01    0.01   0.01    0.008
    dolphin   0.7    1        0.2  0.2  0.01    0.01   0.01    0.008
    dog       etc
    cat       etc
    hawaii    etc
    japan     etc
    london    etc
    stone     etc

What is the best way to represent this data?

I am most concerned with how to keep the hierarchical information (the tree) alongside the relationship information in isLike() (the weighted graph).

So I'm asking whether the standard thing to do is to use a directed graph (for the tree) plus a weighted undirected graph (for the relations). Is this standard, or is there a more standard way?

Comments (3)

层林尽染 2024-11-11 07:10:36

You probably want to use a weighted, undirected edge to represent closeness in the graph. It's not clear, though, what you are trying to accomplish here. Depending on the goal, you may want to keep the relationships separate from the classification hierarchy.
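One way to read "keep the relationships separate from the hierarchy" is sketched below. This is only an illustration, not an established standard: the `Taxonomy` class and its method names are made up for this example, and the weights are the isLike() values quoted in the question.

```python
class Taxonomy:
    """Hierarchy (child -> parent) plus a separate, symmetric similarity table."""

    def __init__(self):
        self.parent = {}      # directed tree edges: child -> parent
        self.similarity = {}  # undirected weighted edges: frozenset({a, b}) -> weight

    def add_child(self, child, parent):
        self.parent[child] = parent

    def set_like(self, a, b, weight):
        # A frozenset key is order-independent, which makes the edge undirected.
        self.similarity[frozenset((a, b))] = weight

    def is_like(self, a, b, default=0.0):
        if a == b:
            return 1.0
        return self.similarity.get(frozenset((a, b)), default)

    def siblings(self, node):
        p = self.parent.get(node)
        return [n for n, q in self.parent.items() if q == p and n != node]

    def closest_sibling(self, node):
        # Rank siblings (from the tree) by similarity (from the weighted graph).
        return max(self.siblings(node), key=lambda s: self.is_like(node, s))


t = Taxonomy()
for animal in ("whale", "dolphin", "dog"):
    t.add_child(animal, "animal")
t.set_like("whale", "dolphin", 0.7)
t.set_like("whale", "dog", 0.2)
t.set_like("dolphin", "dog", 0.2)

print(t.closest_sibling("whale"))  # dolphin
```

The tree answers "what is this a kind of?" and the similarity table answers "what is this most like?", so neither structure has to be bent to encode the other.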

难如初 2024-11-11 07:10:36

There are all kinds of ways to define distance between nodes in a tree. You can use parents, siblings, uncles, etc. To learn more, check out Red-Black Trees.

Your stipulation about not "defining" anything doesn't make sense. The only way that we can define distance is by adding some structural information to the tree so that we know how to arrange the nodes. That's what "sub-classes" do in a hierarchical relationship. The links are essentially just "edges", as any tree can be treated as a graph.

If your nodes are just labels, then they are nominal pieces of data. There's no way that you can calculate any ratios or intervals, so any distance metric would have to be equal to the number of links from the desired node.
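To make the link-counting idea concrete, here is a small sketch. The `parent` mapping and the function names are assumptions for illustration; it uses the flat "all children of animal" layout from the question.

```python
def path_to_root(parent, node):
    """Return [node, parent, grandparent, ...] up to the root."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(parent, a, b):
    """Number of links on the path between a and b via their lowest common ancestor."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(parent, a))}
    for j, n in enumerate(path_to_root(parent, b)):
        if n in steps_from_a:
            return steps_from_a[n] + j
    raise ValueError("nodes are not in the same tree")


# Hypothetical flat hierarchy: every entry hangs directly off "animal".
parent = {"whale": "animal", "dolphin": "animal", "dog": "animal"}

print(tree_distance(parent, "whale", "dolphin"))  # 2
print(tree_distance(parent, "whale", "dog"))      # 2
```

Both pairs come out at the same distance, which is the point: with labels alone, link counting cannot tell siblings apart.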

If the nodes in your tree correspond to data structures (for example, Animals), then we can assume that each of those structures has shared attributes (for example: eye color, weight, height, isFurry, etc.). These attributes may have domains and ranges on interval or ratio scales, in which case we can compute a meaningful distance.

To represent the distance between objects here, realize that what you are really doing is defining a coordinate space across a set of variables (x = eye color, y = weight, z = height, q = isFurry). Each individual node is then a vector in the coordinate space defined by the set of common attributes. Consequently, you can calculate a Euclidean distance, Mahalanobis distance, Manhattan distance, cosine similarity, or any other distance or similarity measure you want.
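A rough sketch of the vector view, using only the standard library. The attribute values are invented for illustration and assumed to be pre-scaled to [0, 1]; eye color is left out because it is nominal and would need an encoding first.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms


# Hypothetical attribute vectors: (weight, height, isFurry), each scaled to [0, 1].
animals = {
    "whale":   (1.0, 1.0, 0.0),
    "dolphin": (0.3, 0.4, 0.0),
    "dog":     (0.1, 0.2, 1.0),
}

for other in ("dolphin", "dog"):
    u, v = animals["whale"], animals[other]
    print(other, euclidean(u, v), manhattan(u, v), cosine_similarity(u, v))
```

With these made-up numbers, whale is nearer to dolphin than to dog under all three measures. In practice the result depends heavily on which attributes you pick and how you scale them.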

孤云独去闲 2024-11-11 07:10:36

I think that what you are trying to do is hierarchical clustering, and what you have is called a distance matrix.
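As a minimal sketch of that idea, the snippet below runs single-linkage agglomerative clustering in plain Python. Several things here are assumptions: isLike() scores are turned into distances with 1 - similarity, the dog/cat score of 0.6 is invented (the question's table elides that row), and `single_linkage` is a hand-rolled helper rather than a library function.

```python
def single_linkage(names, dist):
    """Agglomerative clustering; returns the merges in the order they happen."""
    clusters = [frozenset([n]) for n in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is the closest pair of members.
                d = min(dist[frozenset((a, b))]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


is_like = {
    ("whale", "dolphin"): 0.7,
    ("whale", "dog"): 0.2, ("whale", "cat"): 0.2,
    ("dolphin", "dog"): 0.2, ("dolphin", "cat"): 0.2,
    ("dog", "cat"): 0.6,  # assumed value, not given in the question
}
# Convert similarity to distance (one common choice, assumed here).
dist = {frozenset(pair): 1.0 - s for pair, s in is_like.items()}

merges = single_linkage(["whale", "dolphin", "dog", "cat"], dist)
for left, right, d in merges:
    print(left, "+", right, "at distance", round(d, 2))
```

The merge order is itself a tree (a dendrogram): whale and dolphin join first, then dog and cat, and the two groups join last. That recovers something like "sea animals" and "land animals" from the scores without defining them up front.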
