柯尔莫哥洛夫复杂度近似算法

发布于 2024-09-15 15:06:27 字数 133 浏览 6 评论 0 原文

我正在寻找一种算法,可以计算给定输入字符串的柯尔莫哥洛夫复杂度的近似值。因此,如果 K 是字符串 S 的柯尔莫哥洛夫复杂度,并且 t 代表时间,那么该函数的行为将类似于以下所示: limit(t->inf)[K_approx(t,S)] = K。

I'm looking for a algorithm that can compute an approximation of the Kolmogorov complexity of given input string. So if K is the Kolmogorov complexity of a string S, and t represents time, then the function would behave something like this.. limit(t->inf)[K_approx(t,S)] = K.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

帝王念 2024-09-22 15:06:27

理论上,当运行时间接近无穷大时,程序可以收敛于其输入字符串的柯尔莫哥洛夫复杂度。它可以通过并行运行每个可能的程序(输入字符串的长度或更短)来工作。当找到给定长度的节目时,该长度被识别为目前已知的最小长度,被打印,并且不再尝试>=该长度的节目。该算法(很可能)将永远运行,打印越来越短的长度,在给定无限时间的情况下收敛于精确的柯尔莫哥洛夫复杂度。

当然,运行指数数量的程序是非常困难的。更有效的算法是在 StackOverflow 上发布代码高尔夫 。一些缺点:

  • 可能需要几天时间才能找到好的结果。
  • 它使用了大量我们最宝贵的计算资源,造成了数千美元的生产力损失。
  • 随着时间的推移,由于资源被转移到其他 计算
  • 该算法提前终止,这意味着它不会工作一般。

In theory, a program could converge on the Kolmogorov complexity of its input string as the running time approaches infinity. It could work by running every possible program in parallel that is the length of the input string or shorter. When a program of a given length is found, that length is identified as the minimum length known for now, is printed, and no more programs >= that length are tried. This algorithm will (most likely) run forever, printing shorter and shorter lengths, converging on the exact Kolmogorov complexity given infinite time.

Of course, running an exponential number of programs is highly intractible. A more efficient algorithm is to post a code golf on StackOverflow. A few drawbacks:

  • It can take a few days before good results are found.
  • It uses vast amounts of our most valuable computing resources, costing thousands of dollars in productivity loss.
  • Results are produced with less frequency over time as resources are diverted to other computations.
  • The algorithm terminates prematurely for many inputs, meaning it does not work in general.
撑一把青伞 2024-09-22 15:06:27

我想这可能有用吗?如果有人发现错误,请指出。

function KApprox(S:string,t:integer,TapeSizeMax:integer) : Turing Machine of size k
  begin

    // An abstract data type that represents a turing machine of size k
    var TM(k:integer) : Turing Machine of size k;
    var TMSmallest(k:integer) : Turing Machine of size k;  

    var j : integer;
    var i : integer;

    for (j = t to 0 step -1) // reduce the time counter by 1
      begin
       for (i = TMax to 1 step -1) // go to the next smaller size of TM
         begin
          foreach (TM(i)) // enumerate each TM of size i
             begin 
               if (TM(i).halt(TapeSizeMax) == true) and (TM(i).output() == S) then
                 begin
                   if (sizeof(TM(i)) < sizeof(TMSmallest(i))) then
                      TMSmallest(i): = TM(i);
                 end;
             end;
         end;
      end;      
    return TMSmallest;
 end;

I think this might work? If somebody sees an error, please point it out.

function KApprox(S:string,t:integer,TapeSizeMax:integer) : Turing Machine of size k
  begin

    // An abstract data type that represents a turing machine of size k
    var TM(k:integer) : Turing Machine of size k;
    var TMSmallest(k:integer) : Turing Machine of size k;  

    var j : integer;
    var i : integer;

    for (j = t to 0 step -1) // reduce the time counter by 1
      begin
       for (i = TMax to 1 step -1) // go to the next smaller size of TM
         begin
          foreach (TM(i)) // enumerate each TM of size i
             begin 
               if (TM(i).halt(TapeSizeMax) == true) and (TM(i).output() == S) then
                 begin
                   if (sizeof(TM(i)) < sizeof(TMSmallest(i))) then
                      TMSmallest(i): = TM(i);
                 end;
             end;
         end;
      end;      
    return TMSmallest;
 end;
镜花水月 2024-09-22 15:06:27

柯尔莫哥洛夫复杂度的维基百科页面有一个标题为“柯尔莫哥洛夫复杂度的不可计算性”的小节,位于“基本结果”部分。这并不是一个可以有效计算甚至近似的基本度量。

毫无疑问,有更好的方法可以实现您想要的目标。如果您想要衡量随机性,您可以尝试二元熵函数。标准算法之一的可压缩性也可能符合要求。

The wikipedia page for Kolmogorov complexity has a subsection entitled "Incomputability of Kolmogorov complexity", under the "Basic results" section. This is not intended to be a basic measure that you can compute, or even approximate productively.

There are better ways of achieving what you want, without a doubt. If a measure of randomness is what you want, you could try the binary entropy function. Compressibility by one of the standard algorithms might also fit the bill.

想你只要分分秒秒 2024-09-22 15:06:27

我注意到的第一个问题是“柯尔莫哥洛夫复杂性”没有明确定义。这在某种程度上取决于如何表示程序的选择。因此,您需要做的第一件事是修复程序的某些编码(例如,Joey Adams 的程序用 J 编写的规范)。

一旦你有了编码,你正在寻找的算法就非常简单了。请参阅乔伊的回答。

但情况比必须运行指数级数量的程序还要糟糕。这些程序中的每一个都可以像您想象的那样运行(从技术上讲:作为函数输入大小的运行时间可能比任何递归函数增长得更快)。更重要的是,有些最短的程序可能运行时间最长。因此,虽然并行方法会随着时间趋向无穷大而接近正确值,但其速度会慢得难以想象。

您可以提前停止程序,认为此时的近似值已经足够好。但是,您通常不知道该近似值有多好。事实上,有些定理表明你永远无法知道。

所以简短的答案是“简单,只需使用乔伊的算法”,但无论从实用性的角度来看,答案是“你没有机会”。正如 rwong 所建议的,您最好只使用重型压缩算法。

The first issue that I notice is that "the Kolmogorov Complexity" isn't well defined. It depends to some degree on the choice of how to represent programs. So, the first thing you would need to do is fix some encoding of programs (for example, Joey Adams' specification that programs be written in J).

Once you have the encoding, the algorithm you are looking for is quite simple. See Joey's answer for that.

But the situation is even worse than having to run exponentially many programs. Each of those programs could run as long as you could possibly imagine (technically: running time as a function input size could grow faster than any recursive function). What's more, it could be the case that some of the shortest programs are the ones that run the longest. So while the parallel approach will approach the correct value as time goes to infinity, it will do so unimaginably slowly.

You could stop the program prematurely, figuring that the approximation at that point is good enough. However, you have no idea in general how good that approximation is. In fact, there are theorems that show you can never know.

So the short answer is "easy, just use Joey's algorithm", but by any measure of practicality, the answer is, "you don't have a chance". As has been recommended by rwong, you are better off just using a heavy-duty compression algorithm.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文