资源配置(最优策略)

发布于 2024-10-19 12:55:05 字数 879 浏览 3 评论 0原文

我知道这不是问这个问题的正确地方,但也许一个聪明的人会遇到并找到解决方案。

我正在尝试编写一个电脑游戏,我需要一个算法来解决这个问题:

游戏是在 2 个玩家之间进行的。每方各有 1000 美元。有三个“盒子”,每个玩家写下他要放入这些盒子中的钱数。然后比较这些金额。谁在盒子里放的钱多,就得 1 分(如果每人抽半分)。谁得分多,谁就赢得对手 1000 美元。示例游戏:

玩家 A:[500, 500, 0] 玩家 B:[333, 333, 334]

玩家 A 获胜,因为他赢得了 A 区和 B 区(但失去了 C 区)。

问题:放置资金的最佳策略是什么?

我还有更多问题要问(与算法相关,而不是与数学相关),但我需要首先知道这个问题的答案。

更新(1):经过更多研究,我了解到这些类型的问题/游戏被称为布洛托上校游戏。我尽了最大努力,发现了一些关于这个主题的(技术性很强的)文档。简而言之,我遇到的问题(如上所述)被称为简单的 Blotto 游戏(只有三个具有对称资源的战场)。困难的是那些拥有10多个资源不对称战场的战场。我读过的所有文档都说简单的 Blotto 游戏很容易解决。问题是,他们都没有真正说出“简单”的解决方案是什么。

更新 (2): 我编写了一个小的动作脚本文件来演示 Tom Sirgedas 提到的论文中的策略。您可以在 megaswf 进行测试。说明: 单击三角形内的一点。红色区域代表获胜案例。蓝色区域代表失败案例,细小的白色线条代表平局。

I know that this is not exactly the right place to ask this question, but maybe a wise guy comes across and has the solution.

I'm trying to write a computer game and I need an algorithm to solve this question:

The game is played between 2 players. Each side has 1.000 dollars. There are three "boxes" and each player writes down the amount of money he is going to place into those boxes. Then these amounts are compared. Whoever placed more money in a box scores 1 point (if draw half point each). Whoever scores more points wins his opponents 1.000 dollars. Example game:

Player A: [500, 500, 0]
Player B: [333, 333, 334]

Player A wins because he won Box A and Box B (but lost Box C).

Question: What is the optimal strategy to place the money?

I have more questions to ask (algorithm related, not math related) but I need to know the answer to this one first.

Update (1): After some more research I've learned that these type of problems/games are called Colonel Blotto Games. I did my best and found few (highly technical) documents on the subject. Cutting it short, the problem I have (as described above) is called simple Blotto Game (only three battlefields with symmetric resources). The difficult ones are the ones with, say, 10+ battle fields with non-symmetric resources. All the documents I've read say that the simple Blotto game is easy to solve. The thing is, none of them actually say what that "easy" solution is.

Update (2): I wrote a small actionscript file to demonstrate the strategy in the paper mentioned by Tom Sirgedas. You can test it at megaswf. Instructions: Click a point inside the triangle. Red region represent winning cases. Blue region represents losing cases, tiny whitish lines represents draw.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

高速公鹿 2024-10-26 12:55:05

我在本文中找到了最佳策略: http://www.rand.org /pubs/research_memoranda/2006/RM408.pdf

让我们称之为 Blotto 策略。

在此处输入图像描述

查看上图。你所做的任何移动都可以用三角形上的一个点来表示。论文中的策略是在六边形中随机选择一个点。以较高概率选择更接近六边形边缘的点(六边形中心为 0 概率,并线性放大到六边形轮廓处的最大概率。六边形轮廓上的每个点都有相等的概率。)

此解决方案适用于“连续”Blotto,但我假设您对离散情况感兴趣(将 N 支部队分为 3 组)。当 N 是 3 的倍数时,将 Blotto 策略应用于离散情况效果非常好。对于 N 的其他值,我能够对六边形边界进行小的调整,效果很好,但并不完美。

如果有一种策略可以打败这个策略,那么一定有一些静态的动作可以战胜 Blotto 的策略。没有,除了当N不是3的倍数时,那么似乎大三角形和六边形相交线上的移动(例如移动<0,.5,.5>)将战胜Blotto的策略多于失败。对于 N=100,差异似乎小于 1%,并且对于更大的 N 继续缩小。

实现 Blotto 策略的代码:

// generate a random number in the range [0,x) -- compensate for rand()%x being slightly un-uniform
int rnd( int x ) { int r; while ( 1 ) { r = rand(); if ( r < RAND_MAX/x*x ) return r % x; } }

// distance from center of triangle/hexagon to (x,y,z), multiplied by 3 (trilinear coordinates)
int hexagonalDist3( int x, int y, int z, int N ) { return max(max(abs(N-x*3),abs(N-y*3)),abs(N-z*3)); }

void generateRandomSimpleBlottoMove( int& x, int& y, int& z, int totalTroops )
{
   int N = totalTroops;
   while ( true )
   {
      x = rnd(N+1);
      y = rnd(N+1);
      z = N-x-y;

      // keep only moves in hexagon, with moves closer to the border having higher probability
      double relativeProbabilityOfKeepingThisMove = hexagonalDist3(x,y,z,N) > N ? 0 : hexagonalDist3(x,y,z,N);

      // minor adjustment for hexagon border when N is not a multiple of 3 -- not perfect, but "very close"
      if ( N % 3 != 0 && hexagonalDist3(x,y,z,N) == N ) 
         relativeProbabilityOfKeepingThisMove = N*(N%3)/3; 

      // possibly keep our move 
      if ( rnd(N) < relativeProbabilityOfKeepingThisMove )
         break;
   }
}

I found an optimal strategy in this paper: http://www.rand.org/pubs/research_memoranda/2006/RM408.pdf

Let's call this Blotto's strategy.

enter image description here

Look at the diagram above. Any move you make can be represented by a point on the triangle. The strategy in the paper says to pick a point at random in the hexagon. Choose points closer to the edge of the hexagon with higher probability (0 probability for the center of the hexagon, and linearly scale up to the maximum probability at the hexagon outline. Every point on the hexagon outline has equal probability.)

This solution is for "continuous" Blotto, but I assume you are interested in the discrete case (dividing N troops into 3 groups). Applying Blotto's strategy to the discrete case works perfectly well, when N is a multiple of 3. For other values of N, I was able to make a small adjustment on the hexagon border that works very well, but not perfectly.

If there is a strategy that can defeat this one, there must be some static move which wins against Blotto's strategy. There is none, except for when N is not a multiple of 3, then it seems that a move on the line where the big triangle and hexagon meet (e.g. the move <0,.5,.5>) will win against Blotto's strategy slightly more than lose. For N=100, the difference seems to be less than 1%, and continues to shrink for larger N.

Code to implement Blotto's strategy:

// generate a random number in the range [0,x) -- compensate for rand()%x being slightly un-uniform
int rnd( int x ) { int r; while ( 1 ) { r = rand(); if ( r < RAND_MAX/x*x ) return r % x; } }

// distance from center of triangle/hexagon to (x,y,z), multiplied by 3 (trilinear coordinates)
int hexagonalDist3( int x, int y, int z, int N ) { return max(max(abs(N-x*3),abs(N-y*3)),abs(N-z*3)); }

void generateRandomSimpleBlottoMove( int& x, int& y, int& z, int totalTroops )
{
   int N = totalTroops;
   while ( true )
   {
      x = rnd(N+1);
      y = rnd(N+1);
      z = N-x-y;

      // keep only moves in hexagon, with moves closer to the border having higher probability
      double relativeProbabilityOfKeepingThisMove = hexagonalDist3(x,y,z,N) > N ? 0 : hexagonalDist3(x,y,z,N);

      // minor adjustment for hexagon border when N is not a multiple of 3 -- not perfect, but "very close"
      if ( N % 3 != 0 && hexagonalDist3(x,y,z,N) == N ) 
         relativeProbabilityOfKeepingThisMove = N*(N%3)/3; 

      // possibly keep our move 
      if ( rnd(N) < relativeProbabilityOfKeepingThisMove )
         break;
   }
}
指尖凝香 2024-10-26 12:55:05

所以这实际上是一个博弈论问题(经济学)而不是一个离散数学问题,重新标记你的问题可能会吸引你想要的关注。在博弈论中,“解决方案”的概念通常是纳什均衡。对于零和游戏,事实证明您想要解决的算法是线性规划问题。有关如何设置的示例,请参阅此维基百科页面

So this actually is a game theory question (economics) not a discrete math problem, re-tagging your question might attract the kind of attention you want. In game theory, the idea of a "solution" is usually Nash equilibrium. for zero sum games, it turns out the algorithm you want to solve this is a linear programming problem. See this wikipedia page for an example of how to set it up.

榆西 2024-10-26 12:55:05

在我看来,证明这个博弈不存在纯纳什均衡是相当容易的。纯策略是固定的,例如[333,333,334]。我的证明草图如下:

对于玩家玩的任何纯策略
A、玩家B可以再找一个纯粹的
获得 B 2 分的策略。为了
例如,如果 A 玩 [500,500,0],则 B
播放 [501,0,499],或者如果 A 播放
[333,333,334] 然后 B 上场
[500,500,0],等等。有
总是有办法得到2分。的
当然,这意味着玩家A会
获得 1 分。

同样,对于任何策略
玩家B,玩家A可以找到另一个
纯粹的策略让他 2. 因此,
不存在纯粹的纳什。

另外,我认为可以证明

1/3 [500,500,0] 的策略(对于两者),
1/3 [500,0,500],
1/3 [0,500,500]

(以 1/3 概率玩 [500,500,0],以 1/3 概率玩 [500,0,500],以 1/3 概率玩 [0,500,500])是此游戏的混合纳什均衡。在此策略下,他们的预期收益 (#points) 为 3/2。证明对我来说似乎很困难。也许其他人会有一个简单的证明。

混合纳什是我们所能得到的最接近“最佳”的结果。这个游戏可能还有其他混合纳什均衡。

It seems to me that it would be fairly easy to prove that this game has no pure Nash equilibrium. A pure strategy is a fixed one, like [333,333,334]. My sketch of a proof is as follows:

For any pure strategy played by player
A, player B can find another pure
strategy that gets B 2 points. For
example, if A plays [500,500,0] then B
plays [501,0,499], or if A plays
[333,333,334] then B plays
[500,500,0], and so on. There is
always a way to get 2 points. Of
course, this means that player A would
get 1 point.

Similarly, for any strategy played by
player B, player A can find another
pure strategy that gets him 2. Thus,
no pure Nash exists.

Also, I think that it can be proven that the strategy (for both) of

1/3 [500,500,0],
1/3 [500,0,500],
1/3 [0,500,500]

(play [500,500,0] with 1/3 probability and [500,0,500] with 1/3 probability and [0,500,500] with 1/3 probability) is a mixed Nash equilibrium for this game. Under this strategy their expected gain (#points) is 3/2. The proof seems laborious to me. Maybe someone else will have a simple proof.

A mixed Nash is as closed to "optimal" as we can get. There are probably other mixed Nash equilibria for this game.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文