就计算复杂性而言,这有多难?
所以我有一个基本上是这样的问题:我有一堆字符串,我想构造一个 DAG,使得每个路径对应一个字符串,反之亦然。然而,我可以自由地任意排列我的字符串。字符的顺序并不重要。我生成的 DAG 具有与之相关的成本。基本上,DAG 中分支的成本与其子路径的长度成正比。
例如,假设我有字符串 BAAA、CAAA、DAAA,并且我构造了一个表示它们的 DAG,但没有对它们进行排列。我得到:(
) -> (B、C、D)→ A-> A-> A
,其中元组代表分支。
对于我的目的来说,更便宜的表示是:
() -> A-> A-> A-> (B,C,D)
问题是:给定n个字符串,对字符串进行排列,使得相应的DAG具有最便宜的成本,其中成本函数为:如果我们从源开始深度遍历图,从左到右的顺序,我们访问的节点总数,具有多重性。
所以第一个例子的成本是 12,因为我们必须在遍历时多次访问 A。第二个例子的成本是 6,因为我们在处理分支之前只访问 A 一次。
我感觉这个问题是NP Hard。这似乎是一个关于形式语言的问题,我对这些算法不够熟悉,无法弄清楚应该如何进行简化。我本身不需要完整的答案,但如果有人能指出一类看似相关的众所周知的问题,我将非常感激。
So I have a problem that is basically like this: I have a bunch of strings, and I want to construct a DAG such that every path corresponds to a string and vice versa. However, I have the freedom to permute my strings arbitrarily. The order of characters does not matter. The DAGs that I generate have a cost associated with them. Basically, the cost of a branch in the DAG is proportional to the length of its child paths.
For example, let's say I have the strings BAAA, CAAA, DAAA, and I construct a DAG representing them without permuting them. I get:
() -> (B, C, D) -> A -> A -> A
where the tuple represents branching.
A cheaper representation for my purposes would be:
() -> A -> A -> A -> (B, C, D)
The problem is: Given n strings, permute the strings such that the corresponding DAG has the cheapest cost, where the cost function is: If we traverse the graph from the source in depth first, left to right order, the total number of nodes we visit, with multiplicity.
So the cost of the first example is 12, because we must visit the A's multiple times on the traversal. The cost of the second example is 6, because we only visit the A's once before we deal with the branches.
I have a feeling this problem is NP Hard. It seems like a question about formal languages and I'm not familiar enough with those sorts of algorithms to figure out how I should go about the reduction. I don't need a complete answer per se, but if someone could point out a class of well known problems that seem related, I would much appreciate it.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
改写一下:
给定单词 w1, …, wn,计算 w1, … 的排列 x1 , xn of wn 来最小化存储 x1, …, xn 的 trie 的大小。
假设字母表的大小无限,这个问题通过减少顶点覆盖而成为 NP 困难问题。 (我相信它可能是字母表大小的固定参数易于处理的。)简化很简单:给定一个图,让每个顶点都是它自己的字母,并为每个边创建一个两个字母的单词。
深度 0 处只有一个节点,深度 2 处的节点数量与边数一样多。深度一处可能的节点集正是顶点覆盖的节点集。
To rephrase:
Given words w1, …, wn, compute permutations x1 of w1, …, xn of wn to minimize the size of the trie storing x1, …, xn.
Assuming an alphabet of unlimited size, this problem is NP-hard via a reduction from vertex cover. (I believe it might be fixed-parameter tractable in the size of the alphabet.) The reduction is easy: given a graph, let each vertex be its own letter and create a two-letter word for each edge.
There is exactly one node at depth zero, and as many nodes at depth two as there are edges. The possible sets of nodes at depth one are exactly the sets of nodes that are vertex covers.