Python big_o seems to return completely incorrect results - what am I doing wrong?

Posted 2025-02-04 19:16:57


I am comparing runtimes of different ways of flattening a list of lists using the big_o module, and for the following methods big_o does not return the expected results, namely:

  1. This one:

    from itertools import chain

    def itertools_chain_from_iterable(arr):
        return list(chain.from_iterable(arr))

returns "constant", which can't possibly be true.

  2. This one:

    def merge_extend(self,arr):
        output = []
        for l in arr:
            output.extend(l)
        return output

returns "cubic" (shouldn't it be quadratic at most?), while...

  3. ...this one:

    def merge_w_sum(self,arr): 
        return sum(arr,[])

returns "linear" (I'm quite sure it should be quadratic; see the proof here).

  4. Furthermore, the list comprehension one:

    def merge_bucket(self,bucket):
        return [number for n in bucket for number in n]

returns "polynomial", which seems terrifying (I would expect linear here as well).

Code used to calculate the complexities:

print('<function name>:', big_o.big_o(<function name>, 
       lambda n:[big_o.datagen.integers(9900,1,9999999) for n in range(50)],
       n_measures=20)[0])

Output:

complexity of itertools_chain_from_iterable: Constant: time = 0.0013 (sec)
complexity of merge_w_sum: Linear: time = 0.46 + 6.2E-07*n (sec)
complexity of merge_extend: Cubic: time = 0.048 + -2.3E-18*n^3 (sec)
complexity of merge_bucket: Polynomial: time = 0.2 * x^-0.019 (sec)

What is it that I'm doing (or understanding) wrong? Many thanks in advance for useful tips!


君勿笑 2025-02-11 19:16:57


Your lambda ignores its argument n and instead always produces the same constant-size input. Produce input of size n instead.

(A note: originally the question had 8 functions and 7 of them were judged "constant time". It was then edited to use a larger constant-size input and got different judgements. I guess your computer's speed is somewhat unstable, since constant-size input should still lead to constant runtimes and thus "constant time" judgements. In any case, the input-generating function should be fixed as described above.)

Given that the input is two-dimensional, you could for example produce n lists of size 1, one list of size n, or sqrt(n) lists of size sqrt(n). Presumably n lists of size 1 is what you want if your goal is to demonstrate that sum(arr, []) is bad. For example:

lambda n: [[i] for i in range(n)]
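For illustration, all three shapes could be generated like this (the helper names below are my own, not part of big_o):

```python
import math

# Three ways to build a list of lists with roughly n elements in total:
n_lists_of_size_1 = lambda n: [[i] for i in range(n)]   # n lists of size 1
one_list_of_size_n = lambda n: [list(range(n))]         # one list of size n
sqrt_by_sqrt = lambda n: [list(range(math.isqrt(n)))    # ~sqrt(n) lists
                          for _ in range(math.isqrt(n))]  # of size ~sqrt(n)
```

Which shape you pick matters, because each one stresses the flattening functions differently.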

A full program:

import big_o
from itertools import chain

def chained(arr):
    return list(chain.from_iterable(arr))

def merge_extend(arr):
    output = []
    for l in arr:
        output.extend(l)
    return output

def merge_w_sum(arr): 
    return sum(arr,[])

def merge_bucket(bucket):
    return [number for n in bucket for number in n]

funcs = [
    (chained, 10**5),
    (merge_extend, 10**5),
    (merge_w_sum, 10**4),
    (merge_bucket, 10**5),
]

for _ in range(3):
    for func, max_n in funcs:
        complexity = big_o.big_o(
            func,
            lambda n: [[0]] * n,
            max_n=max_n,
            n_timings=10,
        )[0]
        print(
            f'{func.__name__:13}',
            complexity
        )
    print()

Sample results:

chained       Linear: time = 8.2E-05 + 5.8E-08*n (sec)
merge_extend  Linear: time = -4.2E-06 + 8.4E-08*n (sec)
merge_w_sum   Quadratic: time = 0.00013 + 2.4E-09*n^2 (sec)
merge_bucket  Linear: time = 0.00046 + 8E-08*n (sec)

chained       Logarithmic: time = -0.0075 + 0.0014*log(n) (sec)
merge_extend  Linear: time = -2E-05 + 8.5E-08*n (sec)
merge_w_sum   Quadratic: time = 0.00011 + 2.4E-09*n^2 (sec)
merge_bucket  Linear: time = -4.2E-05 + 2.6E-07*n (sec)

chained       Linear: time = -1.8E-05 + 1.6E-07*n (sec)
merge_extend  Logarithmic: time = -0.01 + 0.0019*log(n) (sec)
merge_w_sum   Quadratic: time = 8.3E-05 + 2.4E-09*n^2 (sec)
merge_bucket  Linear: time = 7.1E-05 + 8.3E-08*n (sec)

You can see that it gets it right most of the time, but it still sometimes guesses logarithmic instead of linear. On a more stable system it might work better. Larger max_n values should help too, but when I tried that, big_o crashed with a known internal error.

Also note that I used different max_n values for the different functions. Using the same value for all of them is no good: if it's large, the quadratic-time solution takes unbearably long, and if it's small, the linear-time solutions finish so quickly that big_o has trouble distinguishing them from logarithmic or linearithmic. There doesn't seem to be a middle value that works for all of them. Ideally, big_o would adjust max_n appropriately by itself, but I don't think it supports that.
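One hypothetical workaround (pick_max_n below is my own sketch, not a big_o API) is to double n until a single call exceeds a small time budget, and use the result as that function's max_n:

```python
import timeit

def pick_max_n(func, datagen, budget=0.05, start=1000, cap=10**7):
    """Double n until one call to func(datagen(n)) exceeds `budget` seconds.

    Rough heuristic, not part of big_o: returns the first n at which the
    function is slow enough to measure reliably (or `cap`, whichever is less).
    """
    n = start
    while n < cap:
        if timeit.timeit(lambda: func(datagen(n)), number=1) > budget:
            break
        n *= 2
    return n
```

The result could then be passed along, e.g. `big_o.big_o(func, datagen, max_n=pick_max_n(func, datagen))`.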

野鹿林 2025-02-11 19:16:57


First, it's important to understand that Big O notation is about estimating the asymptotic growth of functions.
I.e. how they "behave" when input size grows to infinity.

E.g. if an algorithm is said to run in O(n^2), it means (roughly) that as the input size (n) grows to infinity, the execution time will be bounded by a constant multiplied by n^2 (for a more accurate definition see the link above).

Therefore the "real" big-O has nothing to do with concrete time measurements (in particular because we cannot measure infinite input) - at least in the theoretical sense.

Libraries like pypi/big-O try to estimate how the processing time grows as the input size grows.
But this is only an estimation, and it depends heavily on the properties of the timed function and on the size of the inputs you actually give it.

In your case, it seems you gave it inputs that were too small.
This resulted in very short execution times, which caused the estimation method to fail.
You could try to increase the input size significantly.
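As a rough sanity check outside of big_o (the sizes below are arbitrary), timing sum(arr, []) at a few growing sizes already hints at the quadratic behaviour, since doubling n roughly quadruples the runtime:

```python
import timeit

def merge_w_sum(arr):
    # sum with a list start value repeatedly copies the accumulated
    # list, which is what makes this approach quadratic
    return sum(arr, [])

for n in (2000, 4000, 8000):
    arr = [[i] for i in range(n)]
    t = timeit.timeit(lambda: merge_w_sum(arr), number=5)
    print(f'n={n:5d}: {t:.4f} s')
```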


Update: the last paragraph (regarding the input being too small) was a guess that turned out to be wrong - see @KellyBundy's answer above.
