Code Golf: shortest code to find a weighted median?

Posted on 2024-07-23 08:48:07

My try at code golfing.

The problem of finding the minimum of ∑ w[i]*|X - x[i]| reduces to finding the weighted median of a list x[i] with weights w[i] (see below for the definition). How would you do that with the shortest, simplest, and most beautiful program?
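The reduction can be seen from the slope of the objective. A brief sketch, assuming the x_i are sorted and distinct and every w_i is positive (the symbols f, W and P_k are introduced here for illustration only):

```latex
f(X) = \sum_i w_i \,\lvert X - x_i \rvert, \qquad
f'(X) = \sum_{x_i < X} w_i \;-\; \sum_{x_i > X} w_i = 2P_k - W
\quad \text{for } x_k < X < x_{k+1},
\qquad P_k = \sum_{i \le k} w_i, \quad W = \sum_i w_i .
```

The slope never decreases as X grows, so f is smallest where the slope changes sign, that is, at the first x_k with 2P_k ≥ W. If 2P_k = W exactly, the slope is zero on the whole interval between x_k and x_{k+1}, every point there is a minimum, and the midpoint of that interval is the weighted median as defined below.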

Here's how my code looked originally (the explanation is in the answer to the question, and a short version is posted as one of the answers below).

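    #include <math.h>
    #define zero(x) ( fabs(x) < 1e-10 )  /* because == doesn't work for floats */

    float sum = 0;
    int i;

    for (i = 0; i < n; i++)      /* total weight; leaves i == n */
         sum += w[i];
    while (sum > 0)              /* walk back until the suffix holds half the weight */
         sum -= 2*w[--i];

    right = x[i];            // the rightmost minimum point
    /* sum == 0: the weight splits evenly, so every point in [x[i-1], x[i]] is a minimum */
    left  = zero(sum) ? x[i-1] : right;
    answer = (left + right) / 2;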

(Actually, it has already been heavily optimized: note how the variables i and sum are reused.)

Rules

Floats and integers: different languages have different floating-point arithmetic standards, so I reformulate the problem so that x[i] and w[i] are integers, and you may return twice the value of the answer (which is always an integer) if you prefer. You can return the answer, print it, or assign it to a variable.

Definition of weighted median and clarifications:

  • The median of a sorted array x[i] of length n (0-based indices) is x[(n-1)/2] if n is odd and (x[n/2-1]+x[n/2])/2 if n is even.
  • The median of an unsorted array is the median of the array after sorting (true, but our array is sorted).
  • The weighted median of x[i] with positive integer weights w[i] is defined as the median of the larger array in which each x[i] is replaced by w[i] occurrences of x[i].
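The definition above can be turned into a direct reference implementation, which is useful for checking golfed answers. This is a minimal sketch in Python, not a golfed entry: the function name `weighted_median_twice` is made up for illustration, and it returns twice the median so that the result stays an integer, as the rules allow.

```python
def weighted_median_twice(x, w):
    """Return twice the weighted median of sorted x with positive integer weights w."""
    # Build the larger array from the definition: x[i] repeated w[i] times.
    expanded = [xi for xi, wi in zip(x, w) for _ in range(wi)]
    n = len(expanded)
    # For odd n both indices are the same middle element;
    # for even n they are the two middle elements.
    return expanded[(n - 1) // 2] + expanded[n // 2]


# The two data sets from the "Test data" section of the question.
x1, w1 = list(range(10)), [j + 1 for j in range(10)]
x2, w2 = [j + 1 for j in range(9)], [j + (1 if j < 6 else 0) for j in range(9)]
print(weighted_median_twice(x1, w1))  # 12
print(weighted_median_twice(x2, w2))  # 13
```

It needs memory proportional to the total weight, so it is only a yardstick for the index-based solutions.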

What I hope to see

One of the reasons for asking is that I assume the most suitable language will have trivial array summation and iteration with lambdas. I thought a functional language could be reasonable, but I'm not sure about that - so it's part of the question. My hope is to see something like

    // standard function   add  :=  (a,b) :-> a + b 
    myreduce := w.reduce  
        with:  add  
        until: (value) :-> 2*value >= (w.reduce with:add)
    answer = x [myreduce  from:Begin] + x [myreduce  from:End]

I don't know whether there is any language where this is possible and actually shorter.
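For comparison, here is roughly how the "reduce until" idea above might look in Python. This is only a sketch under assumptions: `first_half_index` and `weighted_median_twice_scan` are hypothetical names, and `itertools.accumulate` with an early-exit `next` stands in for the wished-for `reduce ... until:` construct.

```python
from itertools import accumulate


def first_half_index(weights):
    """Index of the first prefix whose doubled running sum reaches the total weight."""
    weights = list(weights)
    total = sum(weights)
    return next(i for i, s in enumerate(accumulate(weights)) if 2 * s >= total)


def weighted_median_twice_scan(x, w):
    """Twice the weighted median: one scan from the beginning, one from the end."""
    from_begin = first_half_index(w)
    # Scanning the reversed weights gives an index counted from the end.
    from_end = len(w) - 1 - first_half_index(reversed(w))
    return x[from_begin] + x[from_end]
```

When the total weight splits evenly, the two scans stop on neighbouring elements, so their sum is twice the midpoint; otherwise both stop on the same element.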

Test data

    static int n = 10;
    int w[n], x[n];
    for (int j = 0; j < n; j++) {
        w[j] = j + 1;
        x[j] = j;
    }

Answer: 6 (or 12 if you return twice the value).

    static int n = 9;
    int w[n], x[n];
    for (int j = 0; j < n; j++) {
        w[j] = j + ((j<6) ? 1 : 0);
        x[j] = j + 1;
    }

Answer: 6.5 (or 13 if you return twice the value).

Comments (7)

忘年祭陌 2024-07-30 08:48:07

J

Go ahead and type this directly into the interpreter. The prompt is three spaces, so the indented lines are user input.

   m=:-:@+/@(((2*+/\)I.+/)"1@(,:(\:i.@#))@[{"0 1(,:(\:i.@#))@])

The test data I used in my other answer:

   1 1 1 1 m 1 2 3 4
2.5
   1 1 2 1 m 1 2 3 4
3
   1 2 2 5 m 1 2 3 4
3.5
   1 2 2 6 m 1 2 3 4
4

The test data added to the question:

   (>:,:[)i.10
1 2 3 4 5 6 7 8 9 10
0 1 2 3 4 5 6 7 8  9
   (>:m[)i.10
6
   (([+<&6),:>:)i.9
1 2 3 4 5 6 6 7 8
1 2 3 4 5 6 7 8 9
   (([+<&6)m>:)i.9
6.5

   i =: (2 * +/\) I. +/

First index such that double the accumulated sum is greater than or equal to the total sum.

   j =: ,: (\: i.@#)

List and its reverse.

   k =: i"1 @ j @ [

The index defined above, computed for the left argument and for its reverse.

   l =: k {"(0 1) j @ ]

Those indices extracted from the right argument and its reverse, respectively.

   m =: -: @ +/ @ l

Half the sum of the resulting list.
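For readers who don't know J, the five definitions above can be mirrored step by step in Python. This is a hedged sketch, not a translation by the answer's author: it assumes that J's dyadic `I.` on an ascending list behaves like `bisect_left` (least index whose element is greater than or equal to the key), and the Python names `i`, `j` and `m` simply reuse the J names.

```python
from bisect import bisect_left
from itertools import accumulate


def i(w):
    # (2 * +/\) I. +/   first index whose doubled running sum reaches the total
    doubled = [2 * s for s in accumulate(w)]
    return bisect_left(doubled, sum(w))


def j(v):
    # ,: (\: i.@#)      the list stacked on top of its reverse
    v = list(v)
    return [v, v[::-1]]


def m(w, x):
    indices = [i(row) for row in j(w)]                      # k: one index per row of weights
    picked = [row[n] for n, row in zip(indices, j(x))]      # l: pick from x and from reversed x
    return sum(picked) / 2                                  # m: half the sum


print(m([1, 2, 2, 5], [1, 2, 3, 4]))  # 3.5
```

The doubled running sums are ascending because the weights are positive, which is what makes the binary search valid.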

肩上的翅膀 2024-07-30 08:48:07

So, here's how I could squeeze my own solution, still leaving some whitespace:

    int s = 0, i = 0;
    for (; i < n; s += w[i++]) ;
    while ( (s -= 2*w[--i] ) > 0) ;
    a  =  x[i]  +  x[ s ? i : i-1 ];   /* twice the median; s == 0 means an even split */
久光 2024-07-30 08:48:07

Haskell code, ungolfed: trying for a reasonable functional solution.

import Data.List (zip4)
import Data.Maybe (listToMaybe)

mid :: (Num a, Ord a) => [a] -> (Int, Bool)
mid w = (i, total == part && maybe False (l ==) r) where
    (i, l, r, part):_ = dropWhile less . zip4 [0..] w v $ map (2*) sums
    _:sums = scanl (+) 0 w; total = last sums; less (_,_,_,x) = x < total
    v = map Just w ++ repeat Nothing

wmedian :: (Num a, Ord a) => [a] -> [a] -> (a, Maybe a)
wmedian w x = (left, if rem then listToMaybe rest else Nothing) where
    (i, rem) = mid w; left:rest = drop i x
> wmedian [1,1,1,1] [1,2,3,4]
(2,Just 3)
> wmedian [1,1,2,1] [1,2,3,4]
(3,Nothing)
> wmedian [1,2,2,5] [1,2,3,4]
(3,Just 4)
> wmedian [1,2,2,6] [1,2,3,4]
(4,Nothing)
> wmedian [1..10] [0..9]
(6,Nothing)
> wmedian ([1..6]++[6..8]) [1..9]
(6,Just 7)

My original J solution was a straightforward translation of the above Haskell code.

Here's a Haskell translation of the current J code:

{-# LANGUAGE ParallelListComp #-}
import Data.List (find); import Data.Maybe (fromJust)
w&x=foldr((+).fst.fromJust.find((>=sum w).snd))0[f.g(+)0$map
    (2*)w|f<-[zip x.tail,reverse.zip x]|g<-[scanl,scanr]]/2

Yeah… please don't write code like this.

> [1,1,1,1]&[1,2,3,4]
2.5
> [1,1,2,1]&[1,2,3,4]
3
> [1,2,2,5]&[1,2,3,4]
3.5
> [1,2,2,6]&[1,2,3,4]
4
> [1..10]&[0..9]
6
> ([1..6]++[6..8])&[1..9]
6.5
小猫一只 2024-07-30 08:48:07

Short, and does what you'd expect. Not particularly space-efficient.

def f(l,i):
   # l: values, i: integer weights
   x,y=[],sum(i)
   for m,n in zip(l,i):
      x.extend([m]*n)
   return (x[y//2]+x[(y-1)//2])/2.

Here's the constant-space version using itertools. It still has to iterate sum(i)/2 times, so it won't beat the index-calculating algorithms.

from itertools import chain, islice
def f(l,i):
   y=sum(i)-1
   return sum(islice(
       chain(*([m]*n for m,n in zip(l,i))),
       y//2,
       (y+1)//2+1
   ))/(y%2+1.)
猫烠⑼条掵仅有一顆心 2024-07-30 08:48:07

Python:

a=sum([[X]*W for X,W in zip(x,w)],[]);l=len(a);a[l//2]+a[(l-1)//2]

扮仙女 2024-07-30 08:48:07

Something like this? O(n) running time.

sum = 0;
for(int i = 0; i < x.length; i++)
{
    sum += w[i];        // running total of the weights
    sums.push(sum);
}

median = sum/2.0;       // half of the total weight

for(int i = 0; i < x.length; i++)
{
    if(median < sums[i])
         return x[i];
    if(median == sums[i])
         return (x[i] + x[i+1])/2.0;
}

Not sure how you can get two answers for the median, do you mean if sum/2 is exactly equal to a boundary?

EDIT: After looking at your formatted code, my code does essentially the same thing, did you want a MORE efficient method?

EDIT2: The search part can be done using a modified binary search, that would make it slightly faster.

index = sums.length / 2;
finalIndex = binarySearch(index);

int binarySearch(int i)
{
    if(median > sums[i+1])
    {
        i += i/2;
        return binarySearch(i);
    }
    else if(median < sums[i])
    {
        i -= i/2;
        return binarySearch(i);
    }
    return i;
}

Will have to do some checking to make sure it doesn't go on infinitely on edge cases.

踏雪无痕 2024-07-30 08:48:07

Just a comment about your code: I really hope I will not have to maintain it, unless you also wrote all the unit tests that are required here :-)

It is not related to your question of course, but usually, the "shortest way to code" is also the "hardest way to maintain". For scientific applications, it is probably not a show stopper. But for IT applications, it is.

I think it has to be said. All the best.
